===========================================================================
PAM 2018 Review #19A
---------------------------------------------------------------------------
Paper #19: Fury Route: Leveraging CDNs to Remotely Measure Network Distance
---------------------------------------------------------------------------

Overall merit: 2. Should not appear (Weak paper, problems that cannot be addressed)
Reviewer expertise: 2. Some familiarity
Novelty: 2. Incremental improvement or reappraisal
Writing quality: 4. Well-written

===== Paper summary =====

This paper presents Fury Route, a method to externally estimate the relative distance between two nodes in the Internet by leveraging major CDNs' ECS support. The authors validate the method and the results obtained using PlanetLab, iPlane and RIPE Atlas data.

===== Reasons to accept =====

- The paper presents a quite interesting idea.
- It leverages existing technology to measure relative distance between nodes.

===== Reasons to reject =====

- The notion of "distance" is not clearly defined.
- The method relies on the efficiency of the replica selection provided by major CDNs (CDNs may not only take into account topological distance for that) and on their accuracy in mapping the user to the nearest replica.
- The evaluation does not provide sufficient arguments to demonstrate the method's validity.

===== Comments for authors =====

I like the idea of using existing technologies to infer the relative distance between nodes in the Internet. The problem this project is trying to solve has practical and operational uses. However, this reviewer believes that this paper needs more work. It needs to provide more convincing arguments to demonstrate its validity and accuracy in different scenarios and at scale.

Additional comments:

- What are the "units" of the relative distance that you can obtain with Fury Route between two nodes? Are they kms, hops?
- Is a path of CDN replicas representative of the actual network topology and distance?
Can you compare this path with the one obtained with traceroutes and other methods? How can you take into account the actual distance between the end user IP and the CDN? CDN coverage may also vary between regions in the world.
- It would have been interesting to discuss the impact of DNS load-balancing employed by some CDN providers: maybe that would degrade the quality of the responses (length-wise)? What would happen for CDN providers that also take into account congestion and other performance metrics in addition to the location of the user? You are assuming that CDNs are efficient and that you have accurate information about their actual location.
- Are the results provided by previous research efforts referenced by the authors (e.g., Gummadi) relevant as of today? Have you considered replicating the methods that they proposed and comparing their accuracy with Fury Route in the same settings and conditions?
- Some readers may wonder why Akamai, one of the largest CDN providers, is not used in this method. It would be nice to clearly explain the technical reasons.
- Section 2.1: It is not clear enough what you are querying for. There is a mention of "25 globally distributed addresses", but there's no mention of the actual subnet, which would clearly help to understand the results in the CDF. Further, which providers were collected from the Alexa scraper and how did you distinguish CDN providers from other domains and third-party services?
- Is Ref 13 incorrect? I would say the correct one is something like [1]. According to it, only Edgecast and CDN77 support edns-client-subnet.
- What is the overhead that this method adds to DNS resolvers and CDN providers? How many
- It is unclear what happens in the voting procedure if more than one provider gets the same number of votes: is the candidate randomly chosen? Would that not impact the overall quality of the constructed chain?
- Do the method's performance and accuracy vary for different types and usages of IP addresses (e.g., data center, residential, mobile)? It is unclear what type of IP addresses the authors used for the evaluation, but it seems like most of them will be in residential networks and academic/education networks (i.e., RIPE and PlanetLab nodes).
- The authors compare the performance of Fury Route with active latency measurements, finding out that the chains found by their tool perform better in half of the cases. Ping's RTT may not be the best metric to use in that case, since the path used by the ICMP packets may be asymmetric, especially in an interdomain context.
- Section 3. Chain length. Why is the chain length downgraded to /24 when its scope is larger than 24 (e.g. 32)? This choice is not well justified. If it is used to estimate network distances, isn't it useful information?
- Section 5.2. Comparison to iPlane. Is the completion rate 56% for iPlane? If this is true, why is its accuracy equal to that of the proposed solution, which has 80%? Although both match in terms of origins, there is a large difference in completion rate.
- I missed an honest discussion about Fury Route's limitations.

[1] https://www.cdnplanet.com/blog/which-cdns-support-edns-client-subnet/

===========================================================================
PAM 2018 Review #19B
---------------------------------------------------------------------------
Paper #19: Fury Route: Leveraging CDNs to Remotely Measure Network Distance
---------------------------------------------------------------------------

Overall merit: 3. Should appear (Good paper, minor issues that can be fixed)
Reviewer expertise: 4. Expert
Novelty: 2. Incremental improvement or reappraisal
Writing quality: 4. Well-written

===== Paper summary =====

The paper develops a method for estimating network proximity between arbitrary hosts based on CDN mapping.
The proposed technique leverages overlaps between sets of replicas returned by DNS for consecutive EDNS0 queries to construct a CDN chain between arbitrary hosts. The results show a correspondence between CDN chain length and quality, judged by EDNS0 subnet mask responses, with direct round trip time measurements.

===== Reasons to accept =====

+ a novel use of EDNS0 to build CDN chains
+ interesting use of EDNS0 responses to estimate CDN chain quality
+ a useful approach to CDN selection for different functions based on the granularity of replica deployment/mapping

===== Reasons to reject =====

- the paper did not include similar previous work on using CDN infrastructure for latency estimation
- not clear that the proposed solution works for hosts that are far away from each other

===== Comments for authors =====

To start with, I think that your use of EDNS0 to identify consecutive sets of DNS servers and form a chain, and to then judge the quality of the chain, is quite nice! Your validation of the technique is also solid, with the use of both Atlas and PlanetLab to obtain ground-truth measurements for your rankings.

However, the paper in its current form has two shortcomings. First, the idea of using CDNs as paths for network distance prediction has been explored before in http://ieeexplore.ieee.org/document/7288449/. Your technique of identifying these chains is different (and quite innovative!), but it's not clear how it compares with respect to performance, cost, and ease of deployment. Similarly to your work, that paper eliminates the need for measurement infrastructure for network distance estimation and shows that the path through CDN replicas correlates with latency between hosts.

Second, your mechanism is not able to form a chain between hosts that are far enough from each other that no CDN returns votes for next hop selection. This is similar to the limitation of CRP, when two hosts have no CDNs in common.
From your evaluations it seems that this does not happen too often in practice, but from an algorithmic perspective, your approach does not guarantee a solution. Is there some fail-safe mechanism that could be used?

I'm also curious about how long it takes to estimate distance between clients using your system. You graph the number of queries, but not the time. For gaming systems that need to find a small number of nearby clients from a large set of candidates, measurement time to different numbers of clients would be of interest.

A couple of spelling errors:
request.i -> request.
thsis -> this

===========================================================================
PAM 2018 Review #19C
---------------------------------------------------------------------------
Paper #19: Fury Route: Leveraging CDNs to Remotely Measure Network Distance
---------------------------------------------------------------------------

Overall merit: 3. Should appear (Good paper, minor issues that can be fixed)
Reviewer expertise: 2. Some familiarity
Novelty: 2. Incremental improvement or reappraisal
Writing quality: 3. Adequate

===== Paper summary =====

This paper presents Fury Route, a mechanism for inferring network distance between two hosts that exploits the underlying client mapping done by the CDNs using ECS. In particular, the mechanism constructs chains of responses and leverages these to estimate the distance between the hosts. The authors evaluate the mechanism and show that it matches the accuracy of other systems (e.g., iPlane).

===== Reasons to accept =====

-neat approach to repurpose the main function of CDNs

===== Reasons to reject =====

-Unclear how it would perform in regions with few CDNs
-Unclear what the incentive would be for CDNs to let their infrastructure be used for purposes other than their own.
-What if replicas are chosen for economic rather than performance reasons?

===== Comments for authors =====

I find the idea neat: repurposing a technology that accounts for distance to improve performance in order to reveal network distance. There are, however, a number of issues that are unclear to me.

-Is the ground truth used in the evaluation geographical? I would expect an evaluation that shows by how much Fury Route outperforms alternatives when trying to infer the known coordinates of a host.
-How well would Fury Route fare as replica availability decreases? Or when redirection is driven by cost rather than performance?

Overall, I quite like the idea, though I am not sure that sufficient work exploring the shortcomings has been presented here. An interesting approach would be to study the divergence between active measurements and Fury Route and try to infer the underlying reasons.

===========================================================================
PAM 2018 Review #19D
---------------------------------------------------------------------------
Paper #19: Fury Route: Leveraging CDNs to Remotely Measure Network Distance
---------------------------------------------------------------------------

Overall merit: 3. Should appear (Good paper, minor issues that can be fixed)
Reviewer expertise: 3. Knowledgeable
Novelty: 3. New contribution
Writing quality: 4. Well-written

===== Paper summary =====

This paper presents a system called Fury Route, an infrastructure-independent method for estimating network distance between two hosts. Fury Route attempts to construct a virtual path of CDN replicas between a source and destination by using EDNS client-subnet-prefix.

===== Reasons to accept =====

Novel use of the EDNS client-subnet-prefix mechanism

===== Reasons to reject =====

There isn't much motivation for the work except that King [20] did it. The paper doesn't spend any time talking about how or why Fury Route would be used in practice.

===== Comments for authors =====

The evaluation is much improved over previous versions of the paper.
I appreciate the time that must have gone into performing an evaluation with iPlane.

One thing missing from the text for me is a practical application of this that isn't addressed by some simpler mechanism. For example, estimating the latency between a source and destination by the distance between them according to geolocation. Does Fury Route perform better than this straw man?

CDNs do a good job of directing clients, real end-users, to nearby replicas, but Fury Route assumes that CDNs will do a good job of mapping server IP space on the Internet, which CDNs don't normally serve. For instance, how good a job do you think Akamai will do in directing a random Google replica IP to a nearby Akamai replica, given that it never fetched any content from Akamai and so Akamai has little to no data about it? Do you think this has a negative impact on your results?

I think it would be helpful if you explained some intuition as to why a chain of CDN replicas would correspond to network distance.
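
For concreteness, the geolocation straw man mentioned above could be as simple as the following sketch. It is illustrative only and not taken from the paper: the function names, the example coordinates, and the assumption that signals in fiber propagate at roughly 200 km per millisecond are all mine, and it presumes that a (latitude, longitude) pair is already available for each host from some geolocation source.

```python
import math

EARTH_RADIUS_KM = 6371.0
# Assumption: light in fiber covers about 200 km per millisecond (roughly 2/3 c).
FIBER_KM_PER_MS = 200.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def min_rtt_ms(lat1, lon1, lat2, lon2):
    """Lower bound on RTT in ms if the path followed the great circle in fiber."""
    return 2 * great_circle_km(lat1, lon1, lat2, lon2) / FIBER_KM_PER_MS


def rank_by_geo_distance(source, candidates):
    """Return candidate names ordered from nearest to farthest from source.

    source is a (lat, lon) tuple; candidates maps a name to a (lat, lon) tuple.
    """
    return sorted(
        candidates,
        key=lambda name: great_circle_km(*source, *candidates[name]),
    )


if __name__ == "__main__":
    # Hypothetical hosts, chosen only to exercise the ranking.
    hosts = {"a": (0.0, 10.0), "b": (0.0, 1.0), "c": (10.0, 10.0)}
    print(rank_by_geo_distance((0.0, 0.0), hosts))
```

Comparing the ranking produced this way against the Fury Route ranking, using the same ground-truth latencies, would show whether the CDN chains add information beyond what coarse geolocation already provides.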