Fernando Diaz

Senior Researcher
Microsoft Research

Adjunct Associate Professor
Department of Computer Science
New York University

Visiting Researcher
Center for Urban Science and Progress
New York University

diazf [at] acm [dot] org


My primary research interest is information retrieval, the formal study of searching large collections of data for small bits of information. The most familiar instance of information retrieval is web search, where users search a collection of webpages for one or a few relevant webpages. Information retrieval, however, goes beyond web search and includes topics such as cross-lingual retrieval, personalization, desktop search, and interactive retrieval. My research experience includes distributed information retrieval approaches to web search, interactive and faceted retrieval, mining of temporal patterns from news and query logs, cross-lingual information retrieval, graph-based retrieval methods, and exploiting information from multiple corpora. In my dissertation work, I studied the relationship between document clustering and document scoring for retrieval using methods from machine learning and statistics. As a result, I developed an algorithm for system self-assessment and self-tuning that significantly improves the performance of retrieval algorithms across a variety of corpora. At Microsoft, I study web search, specifically in the context of unexpected crisis events.

Detailed information can be found in my curriculum vitae.



F. Diaz, "Autocorrelation and Regularization of Query-Based Retrieval Scores," 2008.


J. Arguello and F. Diaz, "Relevance Ranking of Vertical Search Engines," Vertical Selection and Aggregation. Elsevier, 2013.


M. Imran, C. Castillo, F. Diaz, S. Vieweg, "Processing Social Media Messages in Mass Emergency", ACM Comput. Surv. 47, 4, Article 67 (June 2015), 38 pages.

D. G. Goldstein, S. Suri, R. P. McAfee, M. Ekstrand-Abueg, F. Diaz, "The Economic and Cognitive Costs of Annoying Display Advertisements", Journal of Marketing Research, December 2014, Vol. 51, No. 6, pp. 742-752. Finalist: Paul E. Green Award, Journal of Marketing Research

H. Purohit, C. Castillo, F. Diaz, A. Sheth, P. Meier, "Emergency-Relief Coordination on Social Media: Automatically Matching Resource Requests and Offers", First Monday, January 2014.

E. Yom-Tov and F. Diaz, "The Effect of Social and Physical Detachment on Information Need," ACM Transactions on Information Systems, January 2013.

Y. Chang, A. Dong, P. Kolari, R. Zhang, Y. Inagaki, F. Diaz, H. Zha, Y. Liu, "Improving Recency Ranking Using Twitter Data," ACM Transactions on Intelligent Systems and Technology, February 2013.

F. Diaz, "Regularizing Query-Based Retrieval Scores," Information Retrieval, December 2007.

R. Jones and F. Diaz, "Temporal profiles of queries," ACM Transactions on Information Systems, July 2007.


F. Diaz, "Pseudo-Query Reformulation", 2015.


F. Diaz, "Condensed List Relevance Models", ICTIR 2015.

C. Kedzie, K. McKeown, and F. Diaz, "Predicting Salient Updates for Disaster Summarization", ACL 2015.

M. Shokouhi, R. Jones, U. Ozertem, K. Raghunathan, F. Diaz, "Mobile Query Reformulations", SIGIR 2014.

A. Olteanu, C. Castillo, F. Diaz, S. Vieweg, "CrisisLex: A Lexicon for Collecting and Filtering Microblogged Communications in Crises", ICWSM 2014.

P. Metrikov, F. Diaz, S. Lahaie, J. Rao, "Whole Page Optimization: How Page Elements Interact with the Position Auction", EC 2014.

P. Golbus, I. Zitouni, J. Kim, A. Hassan, F. Diaz, "Contextual and Dimensional Relevance Judgments for Reusable SERP-level Evaluation", WWW 2014.

F. Diaz, R. White, D. Liebling, G. Buscher, "Robust Models of Mouse Movement on Dynamic Web Search Results Pages", CIKM 2013.

M. Imran, S. Elbassuoni, C. Castillo, F. Diaz, P. Meier. "Extracting information nuggets from disaster-related messages in social media". In 10th International Conference on Information Systems for Crisis Response and Management, 2013. Best Paper

Q. Guo, F. Diaz, E. Yom-Tov. "Updating users about time critical events". In Proceedings of the 35th European Conference on Advances in Information Retrieval (ECIR'13), pp. 483-494.

J. Arguello, F. Diaz, J. Callan, "Learning to Aggregate Vertical Results into Web Search Results", CIKM 2011.

E. Yom-Tov and F. Diaz, "Out of Sight, Not Out of Mind: On the Effect of Social and Physical Detachment on Information Need", SIGIR 2011. Best Paper Honorable Mention

J. Seo, F. Diaz, E. Gabrilovich, V. Josifovski, B. Pang, "Generalized Link Suggestions via Web Site Clustering," WWW 2011.

J. Arguello, F. Diaz, J. Callan, B. Carterette, "A Methodology for Evaluating Aggregated Search Results," ECIR 2011. Best Student Paper Award

J. Bai, F. Diaz, Y. Chang, Z. Zheng, "Cross-Market Model Adaptation with Pairwise Preference Data for Web Search Ranking," COLING 2010.

J. Arguello, F. Diaz, J-F. Paiement, "Vertical Selection in the Presence of Unlabeled Verticals," SIGIR 2010.

F. Diaz, D. Metzler, S. Amer-Yahia, "Relevance and Ranking in Online Dating Systems," SIGIR 2010. Selected for ICML 2011 Invited Cross-Conference Session

A. Dong, R. Zhang, P. Kolari, J. Bai, F. Diaz, Y. Chang, Z. Zheng, H. Zha, "Time is of the Essence: Improving Recency Ranking Using Twitter Data," WWW 2010.

A. Dong, Y. Chang, Z. Zheng, G. Mishne, J. Bai, R. Zhang, K. Buchner, C. Liao, F. Diaz, "Towards recency ranking in web search," WSDM 2010.

J. Arguello, J. Callan, F. Diaz, "Classification-based Resource Selection," CIKM 2009.

J. Arguello, F. Diaz, J. Callan, J-F. Crespo, "Sources of Evidence for Vertical Selection," SIGIR 2009. Best Paper Award

F. Diaz and J. Arguello, "Adaptation of Offline Vertical Selection Predictions in the Presence of User Feedback," SIGIR 2009.

F. Diaz, "Integration of News Content Into Web Results," WSDM 2009. Best Paper Award

F. Diaz, "A Method for Transferring Retrieval Scores Between Collections with Non-Overlapping Vocabularies," SIGIR 2008 poster.

F. Diaz, "Improving Relevance Feedback in Language Modeling Retrieval with Score Regularization," SIGIR 2008 poster.

F. Diaz, "Robustness of Score Regularization to Similarity Perturbation," SIGIR 2008 poster.

F. Diaz, "Performance prediction using spatial autocorrelation," SIGIR 2007.

F. Diaz and D. Metzler, "Pseudo-aligned multilingual corpora," IJCAI 2007.

F. Diaz and D. Metzler, "Improving the estimation of relevance models using large external corpora," SIGIR 2006.

F. Diaz, "Regularizing ad hoc retrieval scores," CIKM 2005.

D. Kelly, F. Diaz, N. J. Belkin, J. Allan, "A user-centered approach to evaluating topic models," ECIR 2004.

F. Diaz and R. Jones, "Using temporal profiles of queries for precision prediction," SIGIR 2004.

F. Diaz, "Using wearable computers to construct semantic representations of physical spaces," ISWC 2002.


F. Diaz, "Experimentation Standards for Crisis Informatics," KDD Workshop on Data Science for Social Good 2014.

A. Hassan, R. Jones, F. Diaz, "Geographic Features in Web Search Retrieval," GIR 2008.

D. Metzler, F. Diaz, T. Strohman, W. B. Croft, "UMass at Robust 2005: Using mixtures of relevance models for query expansion," TREC 2005.

F. Diaz and J. Allan, "When less is more: Relevance feedback falls short and term expansion succeeds at HARD 2005," TREC 2005.

F. Diaz, M. D. Smucker, J. Allan, "High precision retrieval via user interaction and metadata," University of Massachusetts Amherst, 2005.

F. Diaz and R. Jones, "Temporal profiles of queries," Yahoo! Research Labs, 2004.

N. Abdul-Jaleel, J. Allan, W. B. Croft, F. Diaz, L. Larkey, X. Li, M. D. Smucker, C. Wade, "UMass at TREC 2004: Novelty and HARD," TREC 2004.

F. Diaz and J. Allan, "Browsing-based user language models for information retrieval," University of Massachusetts Amherst, 2003.



ACM International Conference on Web Search and Data Mining (WSDM 2014)


WSDM Workshop on the Ethics of Online Experimentation

SIGIR Workshop on Reproducibility, Inexplicability, and Generalizability of Results (RIGOR)

Social Web for Disaster Management

SIGIR Workshop on Time-Aware Information Access
2012, 2013, 2014

ACM Workshop on Social Web Search and Mining: Analysis of User Generated Content Under Crisis


Real-Time Summarization

Temporal Summarization
2013, 2014, 2015

2013, 2014


Web Search Engines
Department of Computer Science
Courant Institute of Mathematical Sciences
Spring 2013, Fall 2014

Experimental Design for Information Systems
University of Trento
Summer 2012

Advanced Information Retrieval and Databases
Department of Computer Science
School of Engineering
Spring 2011


latex-merge: merge a LaTeX project into a single file (often requested by ACM).