README
06-Nov-2012
Contact: Henry Feild (hfeild@cs.umass.edu)
CONTENTS
========
1. Overview
2. Files
3. References
1. OVERVIEW
===========
This collection has two parts: 46,561,553 metadata records crawled from the
Open Library on November 30, 2011, and click distributions over records for
22,622 queries, recorded over the year from October 2010 through September
2011. These data are further described in [1].
2. FILES
========
The data are kept in two files:
open-library-metadata.tsv.bz2 (4.6 GB compressed, 34 GB uncompressed)
This is the metadata. It is tab-delimited; as the example record below
shows, each record holds five fields: type, key, revision, last-modified
timestamp, and a JSON object of metadata fields. Different types have
different metadata fields. We do not offer a comprehensive schema of the
metadata fields. Here is an example record:
/type/author /authors/OL1000057A 2 2008-08-20T17:57:09.66187 {"name": "Kha\u0304lid Muh\u0323ammad \u02bbAli\u0304 al-H\u0323a\u0304jj", "personal_name": "Kha\u0304lid Muh\u0323ammad \u02bbAli\u0304 al-H\u0323a\u0304jj", "last_modified": {"type": "/type/datetime", "value": "2008-08-20T17:57:09.66187"}, "key": "/authors/OL1000057A", "type": {"key": "/type/author"}, "revision": 2}
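A minimal Python sketch for reading the metadata file without decompressing it
to disk first. It assumes one record per line and the five tab-separated
fields seen in the example record above. The names Record, parse_record, and
iter_records are illustrative and not part of this collection.

```python
import bz2
import json
from typing import Iterator, NamedTuple


class Record(NamedTuple):
    # Field names are inferred from the example record (an assumption).
    record_type: str
    key: str
    revision: int
    last_modified: str
    data: dict


def parse_record(line: str) -> Record:
    # Split on the first four tabs only, so the JSON payload stays in one piece.
    record_type, key, revision, last_modified, payload = (
        line.rstrip("\n").split("\t", 4)
    )
    return Record(record_type, key, int(revision), last_modified,
                  json.loads(payload))


def iter_records(path: str) -> Iterator[Record]:
    # Text-mode bz2.open decompresses as it reads, so the whole file is
    # never held in memory at once.
    with bz2.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if line.strip():  # skip blank lines, if any
                yield parse_record(line)
```

For example, iterating over iter_records("open-library-metadata.tsv.bz2") and
keeping only records whose record_type is "/type/author" would select author
records such as the one shown above.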
open-library-eval-set.tar.bz2 (1.3 MB compressed, 7.2 MB uncompressed)
This unpacks to a directory consisting of:
open-library-eval-set/
README
test/
train/
See the open-library-eval-set/README for details.
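The archive can be unpacked with any bzip2-aware tar tool. As a hedged sketch,
the Python function below does the same with the standard tarfile module and
returns the member names it found. The name unpack_eval_set and the
destination directory are illustrative assumptions.

```python
import os
import tarfile
from typing import List


def unpack_eval_set(archive_path: str, dest_dir: str) -> List[str]:
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:bz2") as tar:
        names = tar.getnames()
        # Use the safer "data" extraction filter where this Python
        # version supports it; fall back to the default otherwise.
        if hasattr(tarfile, "data_filter"):
            tar.extractall(dest_dir, filter="data")
        else:
            tar.extractall(dest_dir)
    return sorted(names)
```

Calling unpack_eval_set("open-library-eval-set.tar.bz2", ".") should leave the
open-library-eval-set/ directory described above in the current directory.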
3. REFERENCES
=============
[1] J.Y. Kim, H. Feild, and M. Cartright. "Understanding Book Search on the
Web," CIKM 2012.