Attribute |
Value |
Provide a short summary of the paper |
This is primarily a measurement paper
that assesses how good a job Akamai's redirection
solution does when evaluated in terms of picking
lower-latency network paths. In carrying out this investigation, the
paper provides an interesting dissection of the behavior and
characteristics of Akamai's redirection scheme and how it
varies across geographies and proximity of clients to Akamai
servers.
The paper's secondary focus (less than a third
of the pages), although it represents the initial motivation
for the work, is an investigation of whether and how it is
possible to use information derived from Akamai's redirection
choices to build a large-scale one-hop overlay network.
The motivation for this approach is that realizing
such a solution calls for either performing extensive
measurements to be able to continuously select the best
(one-hop overlay or direct) path among many possible choices,
or for relying on some other means for deciding ahead of time
what alternate paths might be good choices. The paper's
premise, which it supports to some extent through the
evaluation of Section 5, is that it is possible to leverage
Akamai's extensive measurement infrastructure, and its
results as available through its redirection recommendations,
to achieve this goal. |
What is the strength of the paper? (1-3 sentences) |
The paper's main strength is really in its extensive
investigation of the behavior and performance of Akamai's
measurement infrastructure and how it is used to make
redirection decisions.
This part alone is of interest
even if it might be a better fit for a conference such as IMC
than SIGCOMM. |
What is the weakness of the paper? (1-3 sentences) |
The paper's weakness is really in that it leaves quite a
few loose ends in its attempt at convincing us that using
Akamai's redirection information can provide a viable solution
to the choice of good alternate, one-hop overlay
paths.
This weakness arises not only because the benefits of
using Akamai's redirection results exhibit wide variability
(as per the statistics provided in the
paper), but also because the tie to the route selection
problem is not made very convincingly. In particular, as the
authors point out at the end, Akamai's information only gives
insight into one half of the path, and it is far from clear
whether that always (or even in a majority of cases) results
in a good overall choice (from source to destination and
back). I would really have liked to see more data on this in
order to be convinced that this was even a good idea.
Unfortunately, the routing evaluation section (Section 5) is
relatively succinct and not as thorough as the rest of the
paper.
Another aspect that the paper does not even
mention, but that is probably important to at least discuss,
is that of the impact that the widespread use of the proposed
technique would have on its performance. In other words, if
the approach is successful and widely followed, it is likely
to affect the performance of the network paths that Akamai
identifies as "good". This could defeat the original purpose
of performance improvement, and may even lead Akamai to change
its approach, or possibly try to somewhat hide its results.
There should at a minimum be an acknowledgment that this could
be an issue. |
Your qualifications to review this paper |
I know a lot about this area |
Novelty of paper |
This is a new contribution to an established area |
Overall paper merit |
Score 4: Top 10-20%. Soft Accept. I'm inclined to
accept it - I would like to see it in the program, but I am
not arguing strongly in favor of the paper. Note that most
papers in the program will probably have an average score of
about 4. |
Provide detailed comments to the author |
Let me state up front that I like the initial idea of
leveraging someone else's work/infrastructure in order to
solve a challenging problem.
Using large-scale
overlays to allow one-hop alternate routes to bypass
performance problems on the direct path is clearly an
interesting proposition, especially in light of the increasing
availability of P2P technology that can relatively easily make
it feasible. In such a context, the problem is more one of finding
good alternate paths (out of possibly many) than of
allowing/supporting the use of an alternate path (e.g., see the
forthcoming INFOCOM 2006 paper entitled "How to Select a Good
Alternate Path in Large Peer-to-Peer Networks" for a similarly
motivated paper). In that respect, leveraging the
fact that Akamai is continuously monitoring a large number of
paths is clearly a good idea.
The one caveat with this
approach, one that is acknowledged in Section 5.4, is that
Akamai will only give you half of the answer. It is also
not clear how to get the "right" answer from Akamai for
different destinations, i.e., does querying Akamai for Yahoo, as
you do in Section 5.1, always provide the best (or at least a good) answer
across destinations all over the Internet? You provide some
evidence that this may be OK, but the data is far too limited
to enable a solid and convincing conclusion.
Let me
next make a few more pointed comments directed to specific
places in the paper.
In the intro, you mention that for
overlay routing to be able to use the Akamai info, it is also
necessary that the network be able to map some of its nodes to
Akamai edge servers, but you say nothing of how this can be
done until later in the paper. I would suggest addressing this
up front.
I'll come back to that later, but having the
Akamai redirected paths outperform the direct paths 25% of the
time is not a great statistic. In the context of a large
overlay network, you should compare this to what randomly
picking an overlay node would yield.
You only define
that performance really means latency in the last paragraph
before Section 1.2, while you have repeatedly mentioned
performance before. In addition, you wait until page 12 to
argue why latency is the most important metric. I'm not sure I
agree with that position, but irrespective of that you need to
have that discussion earlier on.
Figure 8 had me
puzzled as to why the ordering of nodes was so consistent
(continuously increasing average rank across all Akamai
customers) until I spotted footnote 8. You may want to move
that explanation into the text or the
caption.
Speaking of figures, I found Figure 10
confusing as you never directly state that nodes on the left
and right of the range 20-30 correspond to different relative
values giving the same absolute value. It's sort of there in
the text, but easy to miss.
As you point out, the setup
of Fig. 11 forces a symmetric path. In addition to the problem
of performance information for the segment between the overlay
node and the destination not always being available, it is
also not clear that this always represents the best choice,
e.g., the direct return path might be much better (no
congestion) than the forward direct path.
The
discussion of what Fig. 13 reports is very confusing. You say
it reports differences in latency between the one-hop overlay
and the direct path measured over short time-scales, but then
you have an experiment running over 3 days. So is Fig. 13
reporting the average of these small time-scale differences,
and if so, what is the duration of each measurement interval?
You need to better explain the data for this
figure.
Still in relation to Fig. 13, you say that for
50% of the scenarios the best measured one-hop Akamai path
outperforms the direct path, but this is a bit of a stretch in
that for close to 30% of them the difference is pretty much
nil.
In the path pruning scenario, I am assuming that
the Akamai DNS is queried for a given customer that is
independent of the actual destination for the direct path. As
discussed earlier, it is key to properly assess the
sensitivity of the scheme to this choice across a broad range
of destinations, especially since you now have no visibility
into the segment from the redirect server to the
destination.
It is very hard to distinguish the
different line styles in Fig. 14.
You point to a
"sharp" decline at 2hrs in the performance of BTAS, but fail
to mention that there is a steady and non-negligible decline
up to that point, i.e., the difference in slopes before and
after is not that substantial. In addition, can you explain
why performance seems to be improving as the update interval
increases beyond 500 minutes or so? This seems
counter-intuitive. | |