Public review by Jitendra Padhye:
This paper poses an intriguing question: is the exponential backoff in TCP really necessary? The general perception is that the backoff is necessary to maintain the stability of the Internet under extremely heavy congestion. The authors, however, show that exponential backoff is not necessary for maintaining stability, as long as the packet conservation principle is followed.
The paper is well-written, and the evaluation is generally adequate. However, it is not clear to me that we should rush to remove exponential backoff from TCP implementations just yet. The reasons are as follows. First, as the authors themselves point out, their analysis breaks down when the RTT suddenly changes (perhaps due to a routing change) such that the new RTT is greater than the previously calculated RTO. It would be interesting to know how often this occurs.
Second, one of the downsides of removing exponential backoff is that it takes the network longer to recover from heavy congestion. The impact of this extra "delay" should be evaluated carefully.
Third, it is not clear that the benefits of removing exponential backoff are significant. In some of the scenarios discussed in the paper, only a small fraction of the flows see any improvement in performance.
Fourth, the proposed plan for incremental deployment is not satisfactory. Since flows that do not use exponential backoff (TCP*(∞)) hurt the performance of flows that use traditional TCP, the authors propose to proceed in two small steps, wherein a less aggressive version (TCP*(3)) is deployed first. However, even this less aggressive version seems to reduce the throughput of traditional flows by 20% or more in some cases. Then, when TCP*(∞) is deployed, it seems to affect the performance of TCP*(3) substantially as well! And there may still be some traditional TCP flows left in the network!
Finally, since this change will be perceived as "drastic," many more (and larger-scale) studies will be required to convince audiences such as the IETF to standardize TCP*(∞).
I like this paper, but the authors have not convinced me that exponential backoff needs to be removed.
The core argument made by the authors is that as long as the packet conservation principle is followed, exponential backoff is not necessary for ensuring stability. In other words, if the RTO is really a good upper bound on RTT even in periods of heavy congestion, then there is no need to increase it further, even if packets are dropped.
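To make the contrast concrete, here is a minimal sketch of when a sender would retransmit after repeated timeouts, with and without doubling the timer. It is an illustration only, not the authors' implementation: the function name `retransmit_times`, its parameters, and the 60-second cap are assumptions chosen for the example.

```python
def retransmit_times(rto, timeouts, backoff=True, max_rto=60.0):
    """Return the times (seconds after the first send) at which a sender
    retransmits, given `timeouts` consecutive timeout events.

    Hypothetical helper for illustration; `max_rto` is an assumed cap.
    """
    times = []
    now = 0.0
    current = rto
    for _ in range(timeouts):
        now += current          # the timer expires after `current` seconds
        times.append(now)
        if backoff:
            # Exponential backoff: double the timer after every timeout.
            current = min(current * 2, max_rto)
        # Without backoff the timer stays at the original RTO.
    return times


print(retransmit_times(1.0, 5, backoff=True))   # [1.0, 3.0, 7.0, 15.0, 31.0]
print(retransmit_times(1.0, 5, backoff=False))  # [1.0, 2.0, 3.0, 4.0, 5.0]
```

In both cases at most one retransmission is sent per timer interval; the difference is only how quickly the intervals grow.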
The authors deserve high marks for boldness: they question one of the basic mechanisms that (supposedly) ensures the stability of the Internet.
However, showing that exponential backoff is unnecessary is not sufficient to justify its removal.
There are three problems.
First, as the authors themselves show, flows that don't use exponential backoff can impact the performance of "traditional" TCP flows (yes, even the TCP*(3) version can increase the response time of vanilla TCP by 20-40%).
Second, it is not clear that removing exponential backoff brings significant benefit. Even in Figure 3, where the authors claim that the performance improvement is "dramatic", only about 1% of the flows see any improvement in response time (later, the authors themselves point this out in Figure 4).
Third, given that we currently believe that this mechanism is essential for the stability of the Internet, more evaluation is needed. Specifically, direct evaluation in the wide-area Internet would have been very useful (e.g., using PlanetLab).
In short, this is an interesting, thought-provoking and well-written paper, but I am not yet ready to dump the good old exponential backoff!
Here's my take on "Removing Exponential Backoff from TCP" by Mondal and Kuzmanovic.
They make an interesting argument that exponential backoff was adopted in haste to solve an immediate problem and thus never underwent a rigorous review. They further argue that upon review, it doesn't appear to be appropriate. The paper raises some intriguing questions; I wish I could have devoted more time to this review.
My inclination would be to accept.
Reasons to accept:
- Paper is mostly well-written, with only a few notable lapses in places.
- The authors are willing to revisit and re-examine one of the long-held tenets of TCP.
- They provide a reasonable background explanation of the current status quo.
- They make reasonable, if limited, test studies of their proposed alternatives.
- They report on actual experiments in a test network instead of relying solely on simulation results.
Reasons to reject:
- While they put a lot of effort into making their claim that TCP's exponential backoff behavior is not needed (at least with regard to preventing congestive collapse), they put very little effort into explaining why it is harmful. Mostly they argue that it is not provably appropriate.
- The third of their three "rationales for revision" is very muddled. They make a statement about current Internet characteristics, then say that their objection to TCP's exponential backoff algorithm is independent of current Internet characteristics, and this is somehow a rationale for revision? Instead, their objections to TCP's exponential backoff (as exhibited in rationales one and two) are the actual rationales.
- What they call the "implicit packet conservation principle" is fairly well-known, although it's not entirely clear whether or not they are claiming to be the first to have made this observation.
- When making a strong claim against a long-held piece of conventional wisdom, it behooves one to have done extensive studies that support the argument. While they did do some test studies of limited topologies in their lab and some larger ones in simulation, more studies are needed covering a wider set of situations. Some of their experiments, such as the measurement of the degree to which exponential backoff happens in today's Internet, were done on questionably small scales.
The paper poses an intriguing question: is the exponential backoff in TCP really necessary? It correctly identifies that what is minimally needed for stability is not exponential backoff but packet conservation. It then shows using simulation and emulation that hell does not break loose and performance improves under certain scenarios.
The exploration, however, is by no means a slam dunk. There are several issues that are left unexplored. This is perhaps unavoidable for a CCR submission, especially because the question is a fundamental one. But I do hope that you'll explore this question in more detail. For instance, one downside of what you propose is the time it takes for the network to recover after heavy congestion. While both exponential backoff and packet conservation produce stability, the former presumably leads to faster recovery. Another issue, which I think you do allude to in the paper, is what happens when RTT changes all of a sudden and RTO < RTT. This can happen, for instance, when routing changes over to a congested path.
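The RTO < RTT concern can be illustrated with a small sketch that counts how many copies of a packet are retransmitted before the ACK for the original transmission can arrive. This is a deliberately simplified model, not the paper's analysis: it assumes the timer is never re-estimated during the episode, and the function name `spurious_retransmissions` and its parameters are hypothetical.

```python
def spurious_retransmissions(rto, rtt, backoff):
    """Count retransmissions sent before the ACK for the original
    transmission arrives at time `rtt`, with the timer starting at `rto`.

    Simplifying assumption: no RTT re-estimation happens in the meantime.
    """
    count = 0
    interval = rto
    next_timeout = rto
    while next_timeout < rtt:
        count += 1                  # timer fired before the ACK could arrive
        if backoff:
            interval *= 2           # exponential backoff doubles the timer
        next_timeout += interval
    return count


# RTT jumps to 4.5 s while the sender still uses an RTO of 1 s.
print(spurious_retransmissions(1.0, 4.5, backoff=False))  # 4
print(spurious_retransmissions(1.0, 4.5, backoff=True))   # 2
```

Under these assumptions, every counted retransmission is unnecessary, and the fixed timer produces more of them than the doubling timer for as long as the stale RTO persists.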
While the presentation quality of the paper is good, some of the claims should be toned down or backed up properly. For instance, I don't understand how your work "opens the doors for evaluating other well-accepted pieces of TCP congestion control." Also, your point about incremental deployability, which you bring up prominently in several places, is a bit ludicrous. You essentially say that because one big step is too big, let's have two small steps. What is the transition plan? It seems that you'll have TCP and TCP* operating simultaneously. (And I would have liked to see that the two small steps are small enough.)
Finally, how does your work relate to various high-speed TCP strains and TCP over wireless? Like you, both try to reduce the impact of individual losses on the sending rate of TCP.