Reviews for Denial-of-Service Resilience in Peer-to-Peer File Sharing



Review #1
Systems
 Reviewer type:        PC member
 Reviewer:     0-1

 Originality: [3]
 Technical merit: [3]
 Readability: [3]
 Relevance: [3]
 Overall rating: [3]
 Recommended Action:   Weak Accept
---Comments for the Author---

Overall the paper is well written, with clear explanations of the motivation and the problem to be addressed, though some parts are repetitive. The problem under consideration is the resiliency of P2P networks against attacks. It is interesting to analyze the design trade-off between performance and resiliency in P2P architectures.

The authors used simple analysis together with simulations to evaluate various P2P architectures. I find the work substantial enough for publication in Sigmetrics, although I have the reservation that the analyses are too simplified.



Review #2
Systems
 Reviewer type:        PC member
 Reviewer:     1-1

 Originality: [3]
 Technical merit: [4]
 Readability: [4]
 Relevance: [3]
 Overall rating: [3]
 Recommended Action:   Weak Accept
 
---Comments for the Author---

1) What are the main strengths of the paper that would warrant acceptance (2-4 sentences)? (For example, the paper addresses or resolves an important open issue, provides major new insights, develops a useful methodology or conceptual framework, or opens up a new area of research.)

The paper does a careful and thorough analysis of the resilience of file sharing systems like Gnutella to denial-of-service attacks. It uses both analytic models as well as simulation. It studies a number of different attack scenarios, two overlay configurations (traditional Gnutella superpeer and DHT-based Structella), and various defenses against attacks. The paper is also very well written.

2) What are the main weaknesses of the paper that might prevent acceptance (2-4 sentences)? (For example, the paper presents no novel or substantial results, the paper makes very unreasonable assumptions, fails to offer any validation by experiments/proofs/simulations, or contains major technical flaws.)

How compelling is the problem?

3) Detailed comments (primarily for the authors). Please be as specific as possible. For example, provide a specific reference when a result is not new, or explain why you believe a particular assumption may be unreasonable.

The paper does a careful and thorough analysis of the resilience of file sharing systems like Gnutella to denial-of-service attacks. It uses both analytic models as well as simulation. It studies a number of different attack scenarios, two overlay configurations (traditional Gnutella superpeer and DHT-based Structella), and various defenses against attacks. The paper is also very well written.

I enjoyed the paper and found the results interesting and, to a certain extent, surprising.

I recommend that the paper be considered for acceptance. The degree to which the paper is compelling depends on one's take on the importance of the problem. As advertised, I am less concerned about the resilience of peer-to-peer file sharing systems like Gnutella to attack than about many other systems we use. So my high-level question is, to what extent are the results applicable to other services and applications built on top of peer-to-peer systems?

Other comments:

Sec 1: The statement at the end of para 1 strikes me as superfluous. So what? I assume you are worried about traffic being increased (if it decreases, then I don't see what harm is done). Any large-scale network event can dramatically change traffic mixes. For example, worm traffic is negligible normally, but a new outbreak can dominate the entire Internet.

Sec 3.2: I like that you created a model to explain the behavior seen on Gnutella with respect to the prevalence of good and bad copies of files. In the end, though, does this mean that the attack is successful or not? Given that this happens on Gnutella, yet Gnutella users still actively use the system and eventually download good copies, is the implication that users have coped with the attack and that it ultimately fails?
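
To make my question concrete, here is a toy simulation I sketched while reading (entirely my own construction, not the authors' model; the initial copy counts, the re-share probability, and the verify-and-discard rule are all hypothetical):

    import random

    # Toy pollution model (my own, not the authors'): in each round, every
    # still-unsatisfied user downloads a copy chosen in proportion to current
    # prevalence; polluted copies are detected after download and discarded,
    # good copies are kept and re-shared.
    def toy_pollution_run(n_users=1000, init_good=10, init_bad=90,
                          p_reshare_bad=0.1, max_rounds=200, seed=0):
        rng = random.Random(seed)
        good, bad = init_good, init_bad      # copies currently shared in the network
        satisfied = 0                        # users holding a verified good copy
        for _ in range(max_rounds):
            for _ in range(n_users - satisfied):
                if rng.random() < good / (good + bad):
                    good += 1                # good copy obtained, kept, and shared
                    satisfied += 1
                elif rng.random() < p_reshare_bad:
                    bad += 1                 # some users unknowingly re-share junk
            if satisfied == n_users:
                break
        return satisfied, good, bad

    print(toy_pollution_run())

Even when polluted copies dominate initially, every user in this toy setting eventually obtains a good copy; the attack only delays them. That is the sense in which I wonder whether the attack should be counted as successful.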

Sec 6.4.2: Similarly, when I look at the results in Figure 13, they strike me as rather decent -- goodput in the system remains relatively high. But it also leads me to wonder about goodput as a measure of attack success. Low goodput implies the system incurs high overhead to overcome the attack, but the system is still functional: I can still download all the music that the RIAA does not want me to download; it just might take me 2-3 times as long. Does that mean that the attack was successful? Not clear to me.



Review #3
Systems
 Reviewer type:        PC member
 Reviewer:     2-1

 Originality: [4]
 Technical merit: [3]
 Readability: [3]
 Relevance: [3]
 Overall rating: [3]
 Recommended Action:   Weak Accept
 
---Comments for the Author---

Strengths:
The topic addressed in this paper is original and has, to my knowledge, not received a lot of attention. The set of features considered, i.e. attacks (file-targeted, network-targeted: baseline or slow-node), counter-measures (randomization and reputation), and P2P system structures (hierarchical a la Gnutella or non-hierarchical, with consideration of path lengths for queries), is quite rich. Thus a number of interesting claims arise (inherent weakness of reputation systems, relative weakness of super-node based systems, strength of the baseline attack compared to the slow-node attack), and overall the paper is both interesting and thought-provoking.

Weaknesses:
The analytical modeling is very simple, which is good for building some intuition, but may leave readers not totally convinced about the validity or scope of applicability of the conclusions. A number of points are raised but not fully addressed, such as the trade-off between performance without an attack and under attack.

Detailed comments:
-Some work in theoretical CS (in particular "Censorship Resistant Peer-to-Peer Content Addressable Networks" by A. Fiat and J. Saia, ACM-SIAM Symposium on Discrete Algorithms, 2002) deals with the issue of building P2P systems that resist some fraction of "spammers"; basically these would be doing the same as your baseline attack. This work should be mentioned.

-Typo on p.2: "received. a probability that decreases exponentially fast Furthemore..." does not make sense.

-p.4, "Finally, it could be shown that equations similar to Equations (2) and (3) govern the spreading of polluted copies." --> If I am correct, these would incorporate some (1-p_s) factors; it may be better to give the modified equations explicitly.

-p.5, Sec. 4.3, description of "Best" strategy: can you detail further how the lowest estimated delay is evaluated? The parenthetical explanation of how delay is estimated is unclear.
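
For instance, is it something like an exponentially weighted moving average over previously observed per-peer response times? A sketch of what I have in mind (the function name and the smoothing constant are hypothetical, purely to make my question concrete):

    # Hypothetical per-peer delay estimator (my guess at what "estimated delay"
    # could mean): an exponentially weighted moving average of observed delays.
    def update_delay_estimate(current_estimate: float, observed_delay: float,
                              alpha: float = 0.125) -> float:
        """Return the updated EWMA delay estimate after one new observation."""
        # e.g. update_delay_estimate(120.0, 90.0) == 116.25
        return (1 - alpha) * current_estimate + alpha * observed_delay

If the estimator is based on something else entirely (hop counts, advertised capacities, ...), please say so explicitly.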

-p.6, Sec. 5.1. The model here does not sound natural. Why would users want to have an exhaustive list of the other users holding the item? Also, the rationale for Eq. 5 is unclear: the path length would not be given in practice; super-nodes would rather disseminate the query until either a malicious super-node or a super-node able to answer is reached.
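
To make the alternative I have in mind concrete (a back-of-the-envelope formalization of my own, with hypothetical parameters: m = fraction of malicious super-nodes, q = probability that a given honest super-node can answer the query): if the query is forwarded hop by hop until it reaches either a malicious super-node (where it dies) or an honest super-node that can answer it, then

    P(\text{success}) \;=\; \sum_{k \ge 0} \bigl[(1-m)(1-q)\bigr]^{k}\,(1-m)\,q
    \;=\; \frac{(1-m)\,q}{\,m + q - m\,q\,},

with the sum truncated at k = h-1 if the query carries a TTL of h hops. A model of this flavor would not require the path length to be given in advance.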

-p.7, after Eq. (7): it should be stated explicitly what graph model this is, and that the computation of D(f) is not exact, but an approximation.

-p.8, Line 4 and further below: is N the system size? It could be a free parameter. Specify which it is. In particular, if it is a configurable parameter, setting it to 1500 in figure 6 is probably a bad choice; a smaller value might be better.

-p.9, "for example, with 10% malicious supernodes" should be 20% in view of figure 7.