Review for Measuring Service in Multi-Class Networks

Review 1

competence: 0.6 (somewhat familiar with this area of research)
contribution: 3 (3: Fair contribution)
presentation: 1 (1: Requires major work)
summary: 2 (2: Weak Reject)

The paper addresses an important topic, viz., user verification of network QoS through non-invasive means. The study focuses on router-level QoS mechanisms and provides techniques for users to distinguish between different scheduling mechanisms (Priority, WFQ, EDF) and to estimate the parameters of the service.

In my opinion, there seems to be a mismatch between the needs of QoS verification and the methodology adopted in the paper. My primary concern is that the presentation is poor and difficult to follow.

A. The authors fail to distinguish between QoS mechanisms (policing, buffer management, scheduling) at an individual network element, and the network characteristics of the service delivered by the provider to the customer. For instance, a provider may offer a "Gold service" with delay and loss assurances, and may use multiple mechanisms (EDF and priority queueing) within the network to deliver this service. In such a network, packets of a single class may not traverse the same sequence of network nodes, and could become re-ordered. Does it even make sense to model, interpret or validate this NETWORK service using a ROUTER model? Stated another way, is there any hope of extending the authors' methodology to a multiple-router scenario?

B. The sampling or probing methodology is unclear. The authors mention the use of streams of packets with sequence numbers and timestamps. However, the experimental setup does not make clear what raw data are sampled and how the estimators are constructed. In particular, the statistical estimators used depend on multiple classes being jointly backlogged at the time of sampling. How is this achieved? If there is unknown cross-traffic, how can we be sure about the backlogged status of the router?

C. The sensitivity of the estimation process to time scales is not adequately explored, especially as "all time scales are not guaranteed to infer the same scheduler" and hence "the final decision is made by using the majority rule over all time scales."

I would ask that the paper be rejected in its current form, with the recommendation that the authors work more on the justification for and presentation of the work.

Review 2

competence: 0.6 (somewhat familiar with this area of research)
contribution: 2 (2: Marginal contribution)
presentation: 5 (5: Very good)
summary: 2 (2: Weak Reject)

This paper describes how to infer the service discipline used in a router, and after that infer service parameters for different classes, using a passive monitoring approach, and hypothesis testing techniques.

I have several concerns about the paper:

1. I can't think of a practical use for the solution, especially given its relative complexity. For example, wouldn't a service provider document and tell its clients the kind of service that its routers provide, and what performance properties users can expect?

2. The monitoring approach is only shown to work if the performance monitored is due to processing by a single router. In practice, performance can only be observed for an end-to-end path, with > 1 hops, and possibly heterogeneous service disciplines used in different hops. Will these techniques still work? If not, then the paper is only addressing a toy problem.

3. The experimental results address simplified cases, e.g. only EDF vs SP vs WFQ for possible disciplines, and two WFQ rates for parameter discovery. They do not convince me that the approach will generalize to cases in which the candidate disciplines are more diverse (including other algorithms or even combinations of algorithms), or in which a WFQ scheduler is shared by flows with many different rates, some of which are close to each other.

In summary, while I find such use of the estimation/hypothesis testing theory interesting, I'm skeptical about the need and practicality of the proposed solutions.

Review 3

competence: 0.6 (somewhat familiar with this area of research)
contribution: 3 (3: Fair contribution)
presentation: 5 (5: Very good)
summary: 4 (4: Weak Accept)

The paper provides a statistical technique for inferring the type of scheduling used by a switch, and certain parameters of the inferred scheduler. The data for the inference is the empirical service envelopes of traffic through the router. A maximum likelihood estimator for a Gaussian parameterization of the envelopes is developed. The authors carry out some experimental evaluations.

This seems a reasonably interesting paper, applying rigorous statistical techniques to a networking problem.

Some comments:

1) Can the authors provide further argument on how the results of their inference would be used in practice? What actions would end-points or applications take on the basis of the results?

2) What are the limitations of the Gaussian approach? Does this limit the applicability to inference on highly aggregated flows? It seems the approach should generalize without this assumption, although at the cost of some additional complexity.

3) What is the behavior of the inference if the actual discipline does not belong to one of the classes considered in the model? Can one use the maximized likelihood in a statistical test of the hypothesis that the actual discipline is in the model class?