Handwritten Document Retrieval - Data Sets
Image/Feature Data Sets
The archives are gzip-compressed tar (tape archive) files. Each archive contains a file README.txt with usage instructions. To unpack:
- UNIX: gunzip -c archive.tgz | tar xvf -
- Windows: use WinZip or WinRAR
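For those who prefer a single method on both UNIX and Windows, the unpacking step can also be sketched with Python's standard tarfile module. This is only an illustration, not part of the data sets: the function name unpack_archive and the destination directory are hypothetical, and the sketch assumes only what is stated above, namely that the archive is a gzip-compressed tar file containing a README.txt.

```python
import tarfile
from pathlib import Path


def unpack_archive(archive_path, dest_dir):
    """Extract a gzip-compressed tar archive and return any README.txt files found.

    Hypothetical helper for illustration; the names are not defined by the data sets.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Use the safer "data" extraction filter when this Python version provides it.
    extract_kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}

    # "r:gz" reads a gzip-compressed tar file, like `gunzip -c ... | tar xvf -`.
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest, **extract_kwargs)

    # Locate the README.txt wherever the archive placed it.
    return sorted(dest.rglob("README.txt"))
```

Calling unpack_archive("archive.tgz", "data") would extract everything into a data directory and return the paths of the README.txt files, which should be read before using the data.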
Please note that by downloading any of the data on this web page, you agree to our copyright notice and to read the instructions in the README.txt file contained in each archive.
Do not share the datasets or the URL of this page! Please point interested people to the download page of the Center for Intelligent Information Retrieval (click on the button next to Word Image Data Sets).
- 20 pages of George Washington's manuscripts, with segmentation information and ground truth, i.e. annotations (58.5MB)
- Data set of good quality (low degradation) (15.8MB)
- Profile features ("time series") extracted from the preceding data set. (4.3MB)