Fernando Diaz

fdiaz [at] microsoft [dot] com

Microsoft Research NYC



My primary research interest is information retrieval, the formal study of searching large collections of data for small bits of information. The most familiar instance of information retrieval is web search, where users search a collection of webpages for one or a few relevant webpages. Information retrieval, however, goes beyond web search and includes topics such as cross-lingual retrieval, personalization, desktop search, and interactive retrieval. My research experience includes distributed information retrieval approaches to web search, interactive and faceted retrieval, mining of temporal patterns from news and query logs, cross-lingual information retrieval, graph-based retrieval methods, and exploiting information from multiple corpora. In my dissertation work, I studied the relationship between document clustering and document scoring for retrieval using methods from machine learning and statistics. As a result, I developed an algorithm for system self-assessment and self-tuning that significantly improves the performance of retrieval algorithms across a variety of corpora. At Microsoft, I study web search, specifically in the context of unexpected crisis events.



F. Diaz, "Autocorrelation and Regularization of Query-Based Retrieval Scores," 2008.


J. Arguello and F. Diaz, "Relevance Ranking of Vertical Search Engines," Vertical Selection and Aggregation. Elsevier, 2013.


H. Purohit, C. Castillo, F. Diaz, A. Sheth, P. Meier, "Emergency-Relief Coordination on Social Media: Automatically Matching Resource Requests and Offers", First Monday, January 2014.

E. Yom-Tov and F. Diaz, "The Effect of Social and Physical Detachment on Information Need," ACM Transactions on Information Systems, 31, 1, Article 4 (January 2013).

Y. Chang, A. Dong, P. Kolari, R. Zhang, Y. Inagaki, F. Diaz, H. Zha, and Y. Liu, "Improving Recency Ranking Using Twitter Data," ACM Transactions on Intelligent Systems and Technology, 4(1):4:1--4:24, February 2013.

F. Diaz, "Regularizing Query-Based Retrieval Scores," Information Retrieval, December 2007.

R. Jones and F. Diaz, "Temporal profiles of queries," TOIS, July 2007.


M. Shokouhi, R. Jones, U. Ozertem, K. Raghunathan, F. Diaz, "Mobile Query Reformulations", SIGIR 2014.

A. Olteanu, C. Castillo, F. Diaz and S. Vieweg, "CrisisLex: A Lexicon for Collecting and Filtering Microblogged Communications in Crises", ICWSM 2014.

P. Metrikov, F. Diaz, S. Lahaie, and J. Rao, "Whole Page Optimization: How Page Elements Interact with the Position Auction", EC 2014.

P. Golbus, I. Zitouni, J. Kim, A. Hassan, F. Diaz, "Contextual and Dimensional Relevance Judgments for Reusable SERP-level Evaluation", WWW 2014.

F. Diaz, R. White, D. Liebling, G. Buscher, "Robust Models of Mouse Movement on Dynamic Web Search Results Pages", CIKM 2013.

M. Imran, S. Elbassuoni, C. Castillo, F. Diaz, P. Meier. "Extracting information nuggets from disaster-related messages in social media". In 10th International Conference on Information Systems for Crisis Response and Management, 2013. Best Paper

Q. Guo, F. Diaz, E. Yom-Tov. "Updating users about time critical events". In Proceedings of the 35th European conference on Advances in Information Retrieval (ECIR'13), 483--494.

J. Arguello, F. Diaz, J. Callan, "Learning to Aggregate Vertical Results into Web Search Results", CIKM 2011.

E. Yom-Tov and F. Diaz, "Out of Sight, Not Out of Mind: On the Effect of Social and Physical Detachment on Information Need", SIGIR 2011. Best Paper Honorable Mention

J. Seo, F. Diaz, E. Gabrilovich, V. Josifovski, B. Pang, "Generalized Link Suggestions via Web Site Clustering," WWW 2011.

J. Arguello, F. Diaz, J. Callan, B. Carterette, "A Methodology for Evaluating Aggregated Search Results," ECIR 2011. Best Student Paper Award

J. Bai, F. Diaz, Y. Chang, Z. Zheng, "Cross-Market Model Adaptation with Pairwise Preference Data for Web Search Ranking," COLING 2010.

J. Arguello, F. Diaz, J-F. Paiement, "Vertical Selection in the Presence of Unlabeled Verticals," SIGIR 2010.

F. Diaz, D. Metzler, S. Amer-Yahia, "Relevance and Ranking in Online Dating Systems," SIGIR 2010.

A. Dong, R. Zhang, P. Kolari, J. Bai, F. Diaz, Y. Chang, Z. Zheng, H. Zha, "Time is of the Essence: Improving Recency Ranking Using Twitter Data," WWW 2010.

A. Dong, Y. Chang, Z. Zheng, G. Mishne, J. Bai, R. Zhang, K. Buchner, C. Liao, and F. Diaz, "Towards recency ranking in web search," WSDM 2010.

J. Arguello, J. Callan, and F. Diaz, "Classification-based Resource Selection," CIKM 2009.

J. Arguello, F. Diaz, J. Callan, and J-F. Crespo, "Sources of Evidence for Vertical Selection," SIGIR 2009. Best Paper Award

F. Diaz and J. Arguello, "Adaptation of Offline Vertical Selection Predictions in the Presence of User Feedback," SIGIR 2009.

F. Diaz, "Integration of News Content Into Web Results," WSDM 2009. Best Paper Award.

F. Diaz, "A Method for Transferring Retrieval Scores Between Collections with Non-Overlapping Vocabularies," SIGIR 2008 poster.

F. Diaz, "Improving Relevance Feedback in Language Modeling Retrieval with Score Regularization," SIGIR 2008 poster.

F. Diaz, "Robustness of Score Regularization to Similarity Perturbation," SIGIR 2008 poster.

F. Diaz, "Performance prediction using spatial autocorrelation," SIGIR 2007.

F. Diaz and D. Metzler, "Pseudo-aligned multilingual corpora," IJCAI 2007.

F. Diaz and D. Metzler, "Improving the estimation of relevance models using large external corpora," SIGIR 2006.

F. Diaz, "Regularizing ad hoc retrieval scores," CIKM 2005.

D. Kelly, F. Diaz, N. J. Belkin, and J. Allan, "A user-centered approach to evaluating topic models," ECIR 2004.

F. Diaz and R. Jones, "Using temporal profiles of queries for precision prediction," SIGIR 2004.

F. Diaz, "Using wearable computers to construct semantic representations of physical spaces," ISWC 2002.


F. Diaz, "Experimentation Standards for Crisis Informatics," KDD Workshop on Data Science for Social Good 2014.

A. Hassan, R. Jones, and F. Diaz, "Geographic Features in Web Search Retrieval," GIR 2008.

D. Metzler, F. Diaz, T. Strohman, and W. B. Croft, "UMass at Robust 2005: Using mixtures of relevance models for query expansion," TREC 2005.

F. Diaz and J. Allan, "When less is more: Relevance feedback falls short and term expansion succeeds at HARD 2005," TREC 2005.

F. Diaz, M. D. Smucker, and J. Allan, "High precision retrieval via user interaction and metadata," University of Massachusetts Amherst, 2005.

F. Diaz and R. Jones, "Temporal profiles of queries," Yahoo! Research Labs, 2004.

N. Abdul-Jaleel, J. Allan, W. B. Croft, F. Diaz, L. Larkey, X. Li, M. D. Smucker, and C. Wade, "UMass at TREC 2004: Novelty and HARD," TREC 2004.

F. Diaz and J. Allan, "Browsing-based user language models for information retrieval," University of Massachusetts Amherst, 2003.


latex-merge: merges a LaTeX project into a single file (often requested by ACM).