Review for Internet Cache Pollution Attacks and Countermeasures

First reviewer's review:

>>> Summary of the paper <<<

The paper describes two Internet cache pollution attacks on the client side. In locality-disruption attacks, the attackers try to ruin the file locality of the cache by accessing many different unpopular files. In false-locality attacks, the attackers repeatedly request the same set of files, which places these files in the cache while evicting other files from it. The paper also gives some simple solutions to the attacks.

>>> Comments <<<

The paper describes two interesting attacks that cause cache pollution and network degradation. Through extensive simulations, the paper shows that the attacks can effectively drive the hit ratio down by occupying the cache space with files the users are not interested in. The solutions are straightforward and effective, which is backed up by simulations. However, the paper also has a number of problems.

(1) The identified attacks are not as serious as classical DoS attacks, which target servers. A classical DoS attack blocks out a server at a remote location, so that it is not accessible by clients from the whole Internet. The cache pollution attacks in this paper degrade network performance only for hosts in the same local network. They neither deny the hosts access to the network, nor have any impact on the rest of the Internet.

(2) In order for the attacks to be successful, the attackers have to access files at a rate much higher than all other local hosts combined. This rids the attacks of their stealth, and makes countermeasures not a challenging task. A simple solution is to identify the hosts that make the most file accesses (in terms of both number and volume) and then give only a small portion of the cache space to them.
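The simple countermeasure suggested above could be sketched roughly as follows. This is a hypothetical illustration by the reviewer, not the paper's algorithm; the class name, quota value, and eviction policy are all assumptions:

```python
from collections import OrderedDict, defaultdict

# Hypothetical sketch (not from the paper): an LRU cache that caps the number
# of slots any single requesting host may occupy, so a heavy-hitter host
# cannot flood the cache with unpopular files.
class QuotaCache:
    def __init__(self, capacity: int, per_host_quota: int):
        self.capacity = capacity
        self.quota = per_host_quota
        self.entries = OrderedDict()   # file key -> requesting host, in LRU order
        self.usage = defaultdict(int)  # host -> number of slots it occupies

    def request(self, host: str, key: str) -> bool:
        """Return True on a cache hit; on a miss, admit the file under quota."""
        if key in self.entries:
            self.entries.move_to_end(key)  # refresh LRU position
            return True
        if self.usage[host] >= self.quota:
            # A host past its quota may only evict its own entries.
            victim = next(k for k, h in self.entries.items() if h == host)
            del self.entries[victim]
            self.usage[host] -= 1
        elif len(self.entries) >= self.capacity:
            _, h = self.entries.popitem(last=False)  # global LRU eviction
            self.usage[h] -= 1
        self.entries[key] = host
        self.usage[host] += 1
        return False

cache = QuotaCache(capacity=100, per_host_quota=5)
for i in range(50):                      # an aggressive host floods the cache...
    cache.request("attacker", f"junk-{i}")
print(cache.usage["attacker"])           # ...but never holds more than 5 slots
```

Even this naive per-host quota bounds the damage a single polluting host can do, though it obviously does not handle attackers spread over many source addresses.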

(3) The purpose of using a Bloom filter should be better explained. The memory overhead of 1 MB per host seems too high.
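To make this concern concrete: using the standard Bloom filter sizing formula m = -n ln(p) / (ln 2)^2, a per-host filter can be far smaller than 1 MB. The item count and false-positive rate below are assumptions chosen by the reviewer for illustration, not figures from the paper:

```python
import math

# Hypothetical sizing exercise (not the paper's configuration): bits needed
# for a Bloom filter holding n_items at a target false-positive rate.
def bloom_bits(n_items: int, fp_rate: float) -> int:
    return math.ceil(-n_items * math.log(fp_rate) / (math.log(2) ** 2))

# Tracking, say, 100,000 distinct file requests per host at a 1% false-positive
# rate needs well under 1 MB:
bits = bloom_bits(100_000, 0.01)
print(bits // 8 // 1024)  # ~117 KB
```

If the authors' 1 MB figure is deliberate (e.g., to track far more items or achieve a much lower false-positive rate), the paper should say so explicitly.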

Some minor comments:

(4) On Page 4, "A small percent of malicious requests (e.g., below 4%) is capable of ..." 4% of what?

(5) In the second paragraph of Section IV, it is not true that most residential users obtain their IP addresses through the DHCP protocol. That may be true for dialup connections, but not for cable and DSL. The claim also does not hold for business users, because the cache server is normally behind the NAT device, on the same side as the hosts.

(6) On Page 7, the sentence "On the other hand, other clients could ..." does not fit in the context. Are you pointing out a case that your solution cannot handle?

>>> Points in favour or against <<<
Refer to item 6.

Second reviewer's review:

>>> Summary of the paper <<<

This paper proposes two different types of Internet cache pollution attacks, the locality-disruption attack and the false-locality attack. These attacks degrade the caching service by injecting unpopular files. To evaluate the effect of the pollution attacks, the authors carried out extensive simulations, which show that these attacks can severely affect the service. Additionally, the paper presents counter-pollution techniques, which the simulations show to be very effective. A prototype is also developed for experiments that further confirm the effectiveness of the countermeasures.

>>> Comments <<<

Since this kind of attack (destroying the cache locality) has not been observed in practice, the paper should show more evidence of why it is feasible. With millions of files cached, the attack would have to be very strong and may exhaust the attacker's resources. While DDoS-style attacks are possible, how do the attackers know they are targeting the same cache server? And how do they obtain disjoint sets of unpopular files? I think the attacks proposed by the authors are not easy to carry out. For each subnetwork, the attacker has to compromise some machine in it and has to guarantee that its requests outnumber those of legitimate users. If the subnetwork is small, it is not interesting for the attacker to attack it; if the subnetwork is large, say an ISP, the attacker needs large attack resources. In order to impact a large portion of the network/Internet, the attacker needs to control many machines. Additionally, the proposed detection needs to track every client, which is also not scalable.

I also doubt the claim that these attacks can severely degrade the service. Usually a decrease in the hit ratio of an Internet cache is not so important, since the resulting latency is acceptable. Moreover, the hit-rate metric used in the paper seems too coarse. Hit rate does not discriminate between popular files and unpopular ones; a cache server can still be considered to function well if the hit rate for popular files is good enough.
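The distinction argued above could be captured by a metric restricted to the popular files. A minimal sketch, where the trace format and the notion of a "popular set" are the reviewer's assumptions rather than anything defined in the paper:

```python
# Hypothetical metric sketch (not from the paper): hit rate computed only over
# requests for popular files, so pollution of the unpopular tail is not
# mistaken for a service failure.
def popular_hit_rate(trace, popular):
    """trace: list of (url, hit) pairs; popular: set of popular URLs."""
    relevant = [hit for url, hit in trace if url in popular]
    return sum(relevant) / len(relevant) if relevant else 0.0

trace = [("a", True), ("a", True), ("junk1", False), ("junk2", False), ("b", True)]
print(popular_hit_rate(trace, {"a", "b"}))  # 1.0, although the overall hit rate is 0.6
```

Under such a metric, an attack that only displaces unpopular files would register much less damage, which is exactly the point the paper's evaluation should address.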

Simulations are carried out extensively in this paper. However, the paper should answer the question of whether these simulations are a faithful abstraction of reality. For example, what is the typical position of these cache servers, and why can the effect of the underlying topology be ignored? For a P2P caching server, why does the fetch-at-most-once assumption hold, especially when NAT is considered? One missing part is the effect of the number of regular clients; the number of requests cannot completely reflect this, especially in the countermeasure evaluation.

The organization of this paper can be improved. The prototype implementation does not seem to give any new information. The "a)" on page 7 is weird.

However, this paper presents an interesting view on Internet cache attacks and could potentially be a good one if these improvements are made. It may attract more security research focus on Internet cache services.