Main.UIMA r1.2 - 19 Oct 2005 - 15:11 - MarkSmucker

Getting started with UIMA

You'll need to download it:

http://www.alphaworks.ibm.com/tech/uima/download

IBM is a pain: they make you register before you can download, but a shared login and password from http://www.bugmenot.com/ often works. Alternatively, I've already downloaded the available files, and you can find them at:

/usr/dan/users10/smucker/dea/uima/uima-downloads

The UIMA documentation is included both in the SDK zip and in the installed SDK. Start with UIMA_SDK_Users_Guide_Reference.pdf.

IBM really pushes the use of the Eclipse IDE:

http://www.eclipse.org/

I did my coding from within this tool, but others avoid it completely and use traditional editors. IBM's selling point is that chapter 3 of the SDK guide explains how to set up Eclipse with tools that make it easier to edit the configuration files UIMA uses.

Chapter 4 gets you going on building your first annotator. The end goal of the UIMA side of this exercise will be the creation of a CPE (Collection Processing Engine). That's covered in chapter 5: think readers and consumers.
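
To make the reader / annotator / consumer idea concrete, here is a minimal sketch in plain Java. It deliberately does not use the UIMA API: the class PipelineSketch, the Span type, and the methods annotateSentences and consume are all hypothetical names invented for illustration, and the sentence splitting is intentionally naive. The real interfaces are the ones described in chapters 4 and 5 of the SDK guide.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from UIMA.
class PipelineSketch {
    // A span of character offsets into the document text (end is exclusive).
    static final class Span {
        final int begin;
        final int end;
        Span(int begin, int end) { this.begin = begin; this.end = end; }
    }

    // "Annotator": marks a sentence wherever '.', '!' or '?' is followed by
    // whitespace or the end of the text. Trailing text with no terminator is ignored.
    static List<Span> annotateSentences(String text) {
        List<Span> spans = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean terminator = c == '.' || c == '!' || c == '?';
            boolean atBoundary = i + 1 == text.length()
                    || Character.isWhitespace(text.charAt(i + 1));
            if (terminator && atBoundary) {
                // Skip whitespace between the previous sentence and this one.
                while (start < i && Character.isWhitespace(text.charAt(start))) {
                    start++;
                }
                spans.add(new Span(start, i + 1));
                start = i + 1;
            }
        }
        return spans;
    }

    // "Consumer": turns the annotations back into the text they cover.
    static List<String> consume(String text, List<Span> spans) {
        List<String> out = new ArrayList<>();
        for (Span s : spans) {
            out.add(text.substring(s.begin, s.end));
        }
        return out;
    }

    public static void main(String[] args) {
        // "Reader": here just an in-memory collection of documents.
        String[] collection = { "UIMA is a framework. It runs annotators." };
        for (String doc : collection) {
            for (String sentence : consume(doc, annotateSentences(doc))) {
                System.out.println(sentence);
            }
        }
    }
}
```

The point of the split is that the annotator only records offsets and never modifies the text, so a consumer can decide later how to write the annotations out.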

To make your life easier, I suggest modifying IBM's examples, which they use in the guide and for which they supply code.

Gotchas:

As usual with Java stuff, your compiled classes need to be on the classpath. The packaged UIMA scripts, like runCPE.bat, do not add the CLASSPATH environment variable to the classpath they pass to Java, so you will need to modify the script if you want your classes to be picked up.
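
When it is unclear whether a script actually put your classes on the classpath, a small check like the following can help. This is a sketch using only the standard Java library; the class name ClasspathCheck is made up for this page, and you would have to run it with the same java invocation the script uses for the result to mean anything.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper; not part of UIMA.
class ClasspathCheck {
    // Split a classpath string into its non-empty entries.
    static List<String> entries(String classpath, String separator) {
        List<String> result = new ArrayList<>();
        if (classpath == null) {
            return result;
        }
        for (String entry : classpath.split(Pattern.quote(separator))) {
            if (!entry.isEmpty()) {
                result.add(entry);
            }
        }
        return result;
    }

    // True if the named class can be found by this class's class loader.
    // The class is located but not initialized.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Print the classpath the JVM was actually started with.
        String classpath = System.getProperty("java.class.path");
        for (String entry : entries(classpath, File.pathSeparator)) {
            System.out.println(entry);
        }
        // Report on each class name given on the command line.
        for (String name : args) {
            System.out.println(name + " loadable: " + isLoadable(name));
        }
    }
}
```

Pass the fully qualified name of one of your own annotator classes as an argument; if it reports false, the script is not seeing your classes.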

Indri Notes


Download latest version: http://www.lemurproject.org/

Trevor's page is very useful: http://ciir.cs.umass.edu/~strohman/indri/

To understand extents and contexts, you'll need to read Don's pages:

http://ciir.cs.umass.edu/~metzler/presentations/uiuc-indri.pdf
http://ciir.cs.umass.edu/~metzler/indriquerylang.html
http://ciir.cs.umass.edu/~metzler/indriretmodel.html

For those interested in producing offset annotations, it appears that Indri 2.1 supports them: http://www.lemurproject.org/lemur/offsetannotations.html

My marked up documents can be found in: /usr/dan/users10/smucker/dea/uima/toy-collection

I processed the ft91.dat file from trec_vol_4.

Sentences are marked with the tag: ciir.uima.SentenceAnnotation
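
If you want to pull the sentences back out of the marked up files, something like the sketch below may be a starting point. It assumes the tag appears as a paired inline element, i.e. <ciir.uima.SentenceAnnotation> ... </ciir.uima.SentenceAnnotation>, and that sentence elements are not nested. Both are assumptions about the file layout, so check a document in the toy collection first. The class name SentenceExtractor is made up for this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; assumes paired, non-nested inline sentence tags.
class SentenceExtractor {
    // Matches an opening tag (with optional attributes), the shortest possible
    // body, and the closing tag. DOTALL lets a sentence span several lines.
    private static final Pattern SENTENCE = Pattern.compile(
            "<ciir\\.uima\\.SentenceAnnotation(?:\\s[^>]*)?>(.*?)</ciir\\.uima\\.SentenceAnnotation>",
            Pattern.DOTALL);

    // Return the trimmed text of every sentence element, in document order.
    static List<String> extract(String markedUp) {
        List<String> sentences = new ArrayList<>();
        Matcher m = SENTENCE.matcher(markedUp);
        while (m.find()) {
            sentences.add(m.group(1).trim());
        }
        return sentences;
    }

    public static void main(String[] args) {
        String sample = "<DOC><ciir.uima.SentenceAnnotation>First one."
                + "</ciir.uima.SentenceAnnotation> <ciir.uima.SentenceAnnotation>"
                + "Second one.</ciir.uima.SentenceAnnotation></DOC>";
        for (String s : extract(sample)) {
            System.out.println(s);
        }
    }
}
```

A regular expression is enough for a quick look at the data; for anything more serious, a real parser is the safer choice.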

-- MarkSmucker - 17 Oct 2005



Copyright © 1999-2008 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.