Homework one (HW1)
See the course schedule for the due date.
This homework is worth 20 points.
In this assignment you will experience the joys and pains of relevance assessing. This is not an intellectually deep task, but it is central to almost all of the field of information retrieval, so there is value in understanding how it works, how difficult it is, and so on.
The class on "evaluation" indicated that IR evaluation requires a collection of documents, a set of queries, and judgments about which documents are relevant for each query. For this homework you will be participating in something called the Million Query Track of TREC (the Text Retrieval Conference). In that track, a number of search engines ran their systems on about 10,000 queries, returning the top 1,000 documents per query that the system "thought" were likely to be relevant. Individuals (such as yourself) can log into a judging system to provide some judgments for some of those queries -- providing information about which systems were doing better than others.
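To make concrete how relevance judgments let us compare systems, here is a minimal sketch in Python. It uses precision at k, a simple textbook measure chosen only for illustration; it is not necessarily the measure the track itself uses. The document identifiers, the two rankings, and the function name `precision_at_k` are all hypothetical.

```python
def precision_at_k(ranked_docs, relevant_docs, k):
    """Return the fraction of the top-k ranked documents judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_docs[:k]
    hits = sum(1 for doc in top_k if doc in relevant_docs)
    return hits / k


# Hypothetical judgments for a single query: the documents an assessor
# marked as relevant.
judged_relevant = {"d2", "d5", "d7"}

# Hypothetical ranked lists returned by two systems for that same query.
system_a = ["d2", "d9", "d5", "d1", "d7"]
system_b = ["d9", "d1", "d2", "d3", "d4"]

p_a = precision_at_k(system_a, judged_relevant, 5)
p_b = precision_at_k(system_b, judged_relevant, 5)

print(f"System A precision@5: {p_a:.2f}")
print(f"System B precision@5: {p_b:.2f}")
```

In this made-up example, system A places three judged-relevant documents in its top five and system B places only one, so the judgments favor system A for this query. Averaging such scores over many judged queries is what makes each additional judged query useful.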
As of this moment, a little over 1,600 of the 10,000 queries have had some documents judged for relevance. Your homework assignment is to increase that number by four. On average, it has taken less than 15 minutes per query, so (on average) this homework is requesting an hour of your time.
To do this, go to http://burnie.cs.umass.edu/million/eval and follow the instructions. Here is what will happen:
Hand in something that includes your 1MQ userid, the four queries that you judged, and any impressions you had of each query and/or of the entire evaluation process. This is likely to be about a page total.
This assignment will be graded primarily on whether or not you complete it. You will lose points for obvious attempts to treat the task lightly.