--======== Review Reports ========--

The review report from reviewer #1:

*1: Is the paper relevant to WI?
[_] No
[X] Yes

*2: How innovative is the paper?
[X] 5 (Very innovative)
[_] 4 (Innovative)
[_] 3 (Marginally)
[_] 2 (Not very much)
[_] 1 (Not)
[_] 0 (Not at all)

*3: How would you rate the technical quality of the paper?
[X] 5 (Very high)
[_] 4 (High)
[_] 3 (Good)
[_] 2 (Needs improvement)
[_] 1 (Low)
[_] 0 (Very low)

*4: How is the presentation?
[X] 5 (Excellent)
[_] 4 (Good)
[_] 3 (Above average)
[_] 2 (Below average)
[_] 1 (Fair)
[_] 0 (Poor)

*5: Is the paper of interest to WI users and practitioners?
[X] 3 (Yes)
[_] 2 (May be)
[_] 1 (No)
[_] 0 (Not applicable)

*6: What is your confidence in your review of this paper?
[X] 2 (High)
[_] 1 (Medium)
[_] 0 (Low)

*7: Overall recommendation
[_] 5 (Strong Accept: top quality)
[X] 4 (Accept: a regular paper)
[_] 3 (Weak Accept: could be a poster or a short paper)
[_] 2 (Weak Reject: don't like it, but won't argue to reject it)
[_] 1 (Reject: will argue to reject it)
[_] 0 (Strong Reject: hopeless)

*8: Detailed comments for the authors
The paper addresses the problem of understanding the features that Google actually uses to rank documents. The approach performs a kind of reverse engineering: based on observed results and a pool of candidate features, it applies learning methods to disclose the relevance of each feature.
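
To make this reverse-engineering idea concrete for other readers, here is a minimal sketch of the general technique (my own illustration, not the authors' actual pipeline; the feature names, the toy data, and the use of scikit-learn's LinearSVC are assumptions): observed rankings are turned into pairwise constraints, and a linear model fitted to those constraints yields one weight per feature.

# Minimal sketch of the pairwise "reverse engineering" idea: given an
# observed ordering of documents and a pool of page features, learn a
# linear weight per feature that reproduces that ordering.
# NOTE: feature names, data, and the use of scikit-learn's LinearSVC are
# illustrative assumptions, not the authors' actual implementation.
import numpy as np
from sklearn.svm import LinearSVC

FEATURES = ["keyword_in_title", "keyword_in_anchor", "pagerank", "is_blog"]

# Feature vectors of documents, listed in the order the search engine
# ranked them (rank 1 first) for a single query.
ranked_docs = np.array([
    [1.0, 3.0, 0.8, 0.0],   # rank 1
    [1.0, 1.0, 0.6, 0.0],   # rank 2
    [0.0, 2.0, 0.5, 1.0],   # rank 3
    [0.0, 0.0, 0.2, 1.0],   # rank 4
])

# Turn the ranking into pairwise constraints: for every pair (i, j) with
# i ranked above j, the difference vector x_i - x_j gets label +1, and
# x_j - x_i gets label -1 to keep the two classes balanced.
X_pairs, y_pairs = [], []
for i in range(len(ranked_docs)):
    for j in range(i + 1, len(ranked_docs)):
        diff = ranked_docs[i] - ranked_docs[j]
        X_pairs.extend([diff, -diff])
        y_pairs.extend([1, -1])

# A linear classifier on the difference vectors yields one weight per
# feature; the sign and magnitude indicate how strongly each feature
# pushes a page up or down.
model = LinearSVC(C=1.0, fit_intercept=False, max_iter=10000)
model.fit(np.array(X_pairs), np.array(y_pairs))

for name, w in zip(FEATURES, model.coef_[0]):
    print(f"{name:20s} {w:+.3f}")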

The paper is very well written, clear, and convincing. The approach itself is sound, and I agree with the idea of using the partitioning algorithm to refine the evaluation of the features' weights in a step-wise manner.

The evaluation is well conducted, and the outcomes are clearly discussed.
No less important, reading the paper makes it possible to figure out how to repeat the experiments and, if needed, how to build an equivalent system.

In my opinion, this is a good paper, which I would recommend for strong accept.

Just a minor typo: as far as I know (though I am not a native speaker), it is usually preferable to use "cannot" instead of "can not".

========================================================
The review report from reviewer #2:

*1: Is the paper relevant to WI?
[_] No
[X] Yes

*2: How innovative is the paper?
[_] 5 (Very innovative)
[_] 4 (Innovative)
[X] 3 (Marginally)
[_] 2 (Not very much)
[_] 1 (Not)
[_] 0 (Not at all)

*3: How would you rate the technical quality of the paper?
[_] 5 (Very high)
[X] 4 (High)
[_] 3 (Good)
[_] 2 (Needs improvement)
[_] 1 (Low)
[_] 0 (Very low)

*4: How is the presentation?
[_] 5 (Excellent)
[X] 4 (Good)
[_] 3 (Above average)
[_] 2 (Below average)
[_] 1 (Fair)
[_] 0 (Poor)

*5: Is the paper of interest to WI users and practitioners?
[_] 3 (Yes)
[X] 2 (May be)
[_] 1 (No)
[_] 0 (Not applicable)

*6: What is your confidence in your review of this paper?
[_] 2 (High)
[X] 1 (Medium)
[_] 0 (Low)

*7: Overall recommendation
[_] 5 (Strong Accept: top quality)
[_] 4 (Accept: a regular paper)
[X] 3 (Weak Accept: could be a poster or a short paper)
[_] 2 (Weak Reject: don't like it, but won't argue to reject it)
[_] 1 (Reject: will argue to reject it)
[_] 0 (Strong Reject: hopeless)

*8: Detailed comments for the authors
This paper provides interesting insights into Google’s ranking algorithm. The finding that blogs are ranked worse than normal websites is especially interesting, as are the weights of the different ranking factors. However, there are several issues.

• While reading the paper, I got the impression that the authors are not very familiar with related work in their field:
- The authors claim that they are the first to provide real scientific evidence about how Google’s ranking algorithm works, and that all other work in this field is rather “folklore”. This is not correct. For instance, Bifet et al. (2005), “An Analysis of Factors Used in Search Engine Ranking”, published a paper that is quite similar to that of the authors.
- Google itself has published a comprehensive guide for webmasters on how to design their websites to obtain a good ranking (http://www.google.com/webmasters/docs/search-engine-optimization-starter-guide.pdf). This guide does not describe all 200 factors Google is using, but it provides specific instructions for authors on what to do and what not to do. I think the authors should explain in their paper why they consider this official guide insufficient.
- The authors write: “ANCH counts the number of occurrences of the keyword in the anchor text of an outgoing link”. Some research suggests that Google only counts the first anchor text on a page. It is not clear whether this has been considered in the paper. See, e.g., http://www.seomoz.org/blog/results-of-google-experimentation-only-the-first-anchor-text-counts
• I cannot see much value in Table II. The argument “Due to the lack of systematical measuring and evaluating guidelines, it is not surprising to see a huge difference of ranking between the three lists” is not valid. One study is from 2007, the second has been ongoing since 2007, and the last is from 2009. To me, it seems likely that Google’s ranking algorithm changed between 2007 and 2009, and that this is why the studies show different results. Two more issues:
- Why cite the SEOmoz’07 study when there is a more recent study, from 2009, by the same institution? http://www.seomoz.org/article/search-ranking-factors
- Regarding the third column labeled “Survey”: this “survey” has been ongoing since 2007 and has had only 44 participants in all that time. In my opinion, such a “study” is meaningless and not worth mentioning in a scientific article at all, especially considering that the first participants in 2007 voted on a different Google algorithm than those in 2009 (assuming that Google changes its algorithm from time to time).
• I am not sure whether the set of keywords used for training and evaluation is really appropriate (see Table III). It consists solely of single-word keywords from four different categories. It is not surprising that an algorithm trained with the term “supernova” performs well with the term “galaxy” (in this case the pages of Wikipedia and NASA are ranked high). Why was no wider range of keywords selected for the experiment, including multi-word queries? I know the authors describe how they additionally obtained keywords with Google Trends. However, I would feel more comfortable about the experiment if a random set of keywords (including multi-word queries) had been chosen instead of a biased set (biased in terms of the four categories or popularity).
Some minor issues:
• The authors state “We […] provide guidelines for SEOs and webmasters to optimize their web pages”. Although the paper shows how Google’s ranking algorithm works, and most readers would be able to work out how to design their web pages to “fit” Google’s ranking algorithm, I cannot see real guidelines for webmasters in the paper.
• “In the following sections, we describe the two ranking models we experimented in this paper – Linear programming and SVM”. I know you explain what SVM stands for on the next page, but in my opinion you should not use the abbreviation SVM here, as it has not yet been introduced; write “Support Vector Machines” instead.
• In the paper, the terms pagerank, page rank, and Page Rank seem to be used synonymously. You should either stick to one term or define them if they have different meanings (page rank as in the rank of a page vs. Google’s PageRank).
• Regarding “Google claims to use more than 200 parameters in its ranking system”: I feel there should be a reference for this claim.
• As a side note: I wondered who the target audience of this paper is. I think that Google’s SEO starter guide is all that webmasters need to obtain a “fair” ranking for their websites. More detailed knowledge is, in my opinion, rather of interest to spammers.

If the authors had performed the same study with a different set of keywords (or plausibly justified the selection of their keywords) and had demonstrated a more thorough knowledge of their field, I would have given this paper a (strong?) accept. As it stands, I give a weak accept, but I would not argue against rejection if a more competent reviewer than I am recommends it. If this paper is eventually accepted, I would recommend acceptance as a full paper rather than as a poster.

========================================================
The review report from reviewer #3:

*1: Is the paper relevant to WI?
[_] No
[X] Yes

*2: How innovative is the paper?
[_] 5 (Very innovative)
[_] 4 (Innovative)
[X] 3 (Marginally)
[_] 2 (Not very much)
[_] 1 (Not)
[_] 0 (Not at all)

*3: How would you rate the technical quality of the paper?
[_] 5 (Very high)
[_] 4 (High)
[X] 3 (Good)
[_] 2 (Needs improvement)
[_] 1 (Low)
[_] 0 (Very low)

*4: How is the presentation?
[_] 5 (Excellent)
[_] 4 (Good)
[X] 3 (Above average)
[_] 2 (Below average)
[_] 1 (Fair)
[_] 0 (Poor)

*5: Is the paper of interest to WI users and practitioners?
[X] 3 (Yes)
[_] 2 (May be)
[_] 1 (No)
[_] 0 (Not applicable)

*6: What is your confidence in your review of this paper?
[X] 2 (High)
[_] 1 (Medium)
[_] 0 (Low)

*7: Overall recommendation
[_] 5 (Strong Accept: top quality)
[_] 4 (Accept: a regular paper)
[X] 3 (Weak Accept: could be a poster or a short paper)
[_] 2 (Weak Reject: don't like it, but won't argue to reject it)
[_] 1 (Reject: will argue to reject it)
[_] 0 (Strong Reject: hopeless)

*8: Detailed comments for the authors
The paper uses pair-wise ranking to learn an LP-based and a polynomial SVM-rank ranking function that fit Google's ranking results. A recursive partitioning scheme is proposed to create piece-wise linear ranking functions that approximate a non-linear ranking function.
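
For readers unfamiliar with the piece-wise linear idea, a rough sketch follows (the partitioning criterion, splitting by rank position, and the least-squares fit are my own illustrative assumptions, not necessarily the paper's actual scheme): each partition of the ranked list gets its own linear weight vector, and the pieces together approximate a non-linear ranking function.

# Rough sketch of a recursive-partitioning approach: fit a separate linear
# ranking function on each partition of the result list so that the pieces
# together approximate a non-linear ranking function.
# ASSUMPTION: partitioning by rank position and using ordinary least squares
# are illustrative choices, not necessarily the paper's actual scheme.
import numpy as np


def fit_piecewise(features, ranks, min_size=4, depth=0, max_depth=2):
    """Recursively split the ranked list and fit a linear model per piece.

    features: (n_docs, n_features) array, ordered by observed rank.
    ranks:    (n_docs,) target scores (e.g., negative rank position).
    Returns a list of (index_range, weight_vector) pieces.
    """
    n = len(ranks)
    # Fit a linear least-squares model on this partition.
    w, *_ = np.linalg.lstsq(features, ranks, rcond=None)
    residual = np.linalg.norm(features @ w - ranks)

    # Stop if the partition is small, deep enough, or already well fit.
    if n < 2 * min_size or depth >= max_depth or residual < 1e-3:
        return [((0, n), w)]

    # Otherwise split the ranked list in the middle and recurse; the child
    # pieces refine the weights separately for the top and bottom halves.
    mid = n // 2
    top = fit_piecewise(features[:mid], ranks[:mid], min_size, depth + 1, max_depth)
    bottom = fit_piecewise(features[mid:], ranks[mid:], min_size, depth + 1, max_depth)
    # Re-index the bottom pieces relative to this partition.
    bottom = [((lo + mid, hi + mid), w_) for (lo, hi), w_ in bottom]
    return top + bottom


# Toy usage: 16 documents, 3 features, scored by (negative) rank position.
rng = np.random.default_rng(0)
X = rng.random((16, 3))
y = -np.arange(16, dtype=float)
for (lo, hi), w in fit_piecewise(X, y):
    print(f"ranks {lo:2d}-{hi - 1:2d}: weights {np.round(w, 2)}")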

The idea is reasonable and the results will be interesting to the audience of WI, but the novelty of the paper is weak.

========================================================