Research Assistant, Center for Intelligent Information Retrieval
University of Massachusetts Amherst
Phone: (413) 545-0728
Fax: (413) 545-1789
Email: balleste@cs.umass.edu
I am a PhD candidate in the Computer Science Department at the University of Massachusetts,
Amherst and am a member of the Center for Intelligent Information Retrieval. Bruce Croft
is my advisor. I am finishing my dissertation this summer and have accepted a faculty position in the
computer science department at Mount Holyoke College.
Dissertation Research
My dissertation research is in the area of cross-language information retrieval.
Cross-language retrieval systems allow a person to query in one language (e.g., English)
and retrieve relevant documents in other languages (e.g., Spanish).
In my thesis research I have developed an effective approach to cross-language
retrieval based on translation via a machine-readable dictionary, augmented with
statistical techniques for reducing the effects of translation ambiguity. The
statistical techniques are based on analysis of word co-occurrence in text.
Publications on my research in cross-language retrieval are listed under Publications below.
Research Interests
Much of my research experience has been in information retrieval, but I have a diverse
research background. I have also worked with Paul Cohen on applying AI techniques
to exploratory data analysis and causal induction. In addition, I spent a few
years doing basic research in Molecular and Cellular Biology.
My research interests span a broad spectrum of information access and knowledge
management issues. As the amount of on-line information continues to explode, there
is an increasing need for systems that facilitate the manipulation and analysis of
information. I am particularly interested in developing techniques for text analysis and
in exploring the ways in which these techniques can be used for other data types. Areas
of interest include the following:
data mining (automatically identifying significant relationships from large amounts of information)
data fusion (combining different types of information)
extraction (identifying pre-specified types of information)
summarization (abstraction of the most important parts of text)
Curriculum Vitae
Publications