'Know-more' - Inference detection in the age of Google

Team: Sagnik Dhar, Sumati Priya

Online search, especially since the advent of Google, has grown extremely powerful. As smarter algorithms are developed every day to make search more efficient, these same algorithms also make ‘inference’ that much easier. In this project, we analyze inference detection on the web and how inference is becoming a growing threat to privacy. We do this by modeling a real-life scenario in which various classes of searchers look for individuals on the web. Given a set of informative keywords about a person, our algorithm predicts how many searches a member of a given searcher class would perform and which keywords that searcher would prefer. We carry out the Google searches and compute values for the results that we believe predict their relative usefulness. Using a similar setup, we design two more experiments that try to ‘infer’ potentially useful data from a set of informative keywords. Through this, we try to bring out the various connotations of ‘inference’ in the context of the web and to predict which attributes of a person reveal the most about them.
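To make the idea of a searcher class preferring certain keywords concrete, here is a minimal Python sketch. It is not the algorithm from the report. The function name `rank_queries`, the per-class weights, the additive scoring rule, and the fixed search budget are all assumptions made purely for illustration.

```python
from itertools import combinations


def rank_queries(keywords, class_weights, max_terms=2, budget=3):
    """Return the `budget` highest-scoring keyword combinations.

    Hypothetical model: `class_weights` maps each keyword to how strongly
    one searcher class prefers it, and a query's score is the sum of the
    weights of its keywords. Both choices are illustrative assumptions.
    """
    scored = []
    for size in range(1, max_terms + 1):
        for combo in combinations(keywords, size):
            score = sum(class_weights.get(k, 0.0) for k in combo)
            scored.append((score, combo))
    # Highest score first; ties go to shorter queries, then alphabetical order.
    scored.sort(key=lambda item: (-item[0], len(item[1]), item[1]))
    return [" ".join(combo) for _, combo in scored[:budget]]


# Example: a searcher class that values the name most and the city least.
queries = rank_queries(
    ["name", "employer", "city"],
    {"name": 0.9, "employer": 0.6, "city": 0.2},
)
```

Here `budget` stands in for the predicted number of searches and the ordering of `queries` for the keyword preference. A real model would have to estimate both from data rather than take them as inputs.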

(The complete report)

Mail me if you want to take a look at the code or the dataset we used.
Sagnik Dhar,
Jan 8, 2010, 10:01 PM