Quoted from ACM / Infosys Press Release.
"NEW YORK and BANGALORE, INDIA, April 1, 2014 – ACM (the Association for Computing Machinery) and the Infosys Foundation announced today that David Blei is the recipient of the 2013 ACM-Infosys Foundation Award in the Computing Sciences. He initiated an approach to analyzing large collections of data using innovative statistical methods, known as “topic modeling,” that make it possible to organize and summarize digital archives at a scale that would be impossible by human annotation. His work is scalable to collections of billions of documents. It has inspired new research programs across multiple disciplines, with applications for email archives, natural language processing, information retrieval, computational biology, social networks, and robotics as well as computational social sciences and digital humanities.
The ACM-Infosys Foundation Award recognizes the finest recent innovations by young scientists and system developers in the computing field. An endowment from the Infosys Foundation provides financial support for the $175,000 annual award. ACM will present the ACM-Infosys Foundation Award at its annual awards banquet on June 21 in San Francisco.
ACM President Vint Cerf said that Blei’s contributions provided a basic framework for an entire generation of researchers to develop statistical modeling approaches. “His topic modeling algorithms go beyond the search and links approach to information retrieval. In an era of explosive data on the Internet, he saw the advantage of discovering the latent themes that underlie documents, and identifying how each document exhibits these themes. In fact, he changed the way machine learning researchers think about modeling text and other objects in the digital realm.”
S. D. Shibulal, CEO and Managing Director, Infosys, said, “The innovative topic modeling method that David Blei has used to analyze data goes to show the capability in executing what was an unthinkable and mammoth task till a few years back. With ever-growing data generation, there is a simultaneous need to archive and interpret this data. Blei’s groundbreaking method has not only made this a simple task but will also help increase productivity significantly.”
Blei led the research that resulted in the simplest topic model, known as LDA (Latent Dirichlet Allocation). This statistical model provides a powerful tool for discovering and exploiting the hidden “topics” or semantic themes in the data. It is scalable to collections of billions of documents with thousands of themes, and applies equally well to images and biological sequences. Blei’s approach is based on a Bayesian framework (a mathematical method based on probability) that exploits hidden variables to draw out the latent thematic structure in the data. LDA has been characterized as the single most important method for analyzing large collections of data. He continues to expand the scope of topic modeling with powerful methods for simultaneously analyzing documents and user behavior.
In a landmark 2003 paper detailing the development of LDA, Blei and his co-authors Michael Jordan and Andrew Ng laid out their method for discovering patterns of word use and connecting documents that exhibit similar patterns. The paper, Latent Dirichlet Allocation, has been frequently cited by the growing population of researchers in the topic modeling area."
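
As an aside to the release, the "hidden variables" idea can be illustrated with a small sketch of the generative story usually associated with LDA: each topic is a probability distribution over words, each document has its own mixture over topics, and each word is produced by first picking a topic and then picking a word from that topic. This is not code from the paper or from the release. The function names (`sample_dirichlet`, `generate_corpus`) and the hyperparameter values are hypothetical choices made for illustration, and the sketch covers only the generative side, not the inference that recovers topics from real documents.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(num_docs, doc_length, vocab, num_topics,
                    alpha=0.5, beta=0.5, seed=0):
    """Generate toy documents from an LDA-style generative process.

    alpha and beta are illustrative symmetric Dirichlet hyperparameters
    (assumed values, not taken from the source).
    """
    rng = random.Random(seed)

    # Each topic is a distribution over the whole vocabulary.
    topics = [sample_dirichlet([beta] * len(vocab), rng)
              for _ in range(num_topics)]

    mixtures, docs = [], []
    for _ in range(num_docs):
        # Hidden variable 1: this document's mixture over topics.
        theta = sample_dirichlet([alpha] * num_topics, rng)
        words = []
        for _ in range(doc_length):
            # Hidden variable 2: the topic assigned to this word position.
            z = rng.choices(range(num_topics), weights=theta)[0]
            # Observed variable: the word drawn from that topic.
            words.append(rng.choices(vocab, weights=topics[z])[0])
        mixtures.append(theta)
        docs.append(words)
    return topics, mixtures, docs


if __name__ == "__main__":
    vocab = ["gene", "cell", "court", "law", "game", "team"]
    topics, mixtures, docs = generate_corpus(3, 8, vocab, num_topics=2)
    for theta, words in zip(mixtures, docs):
        print([round(p, 2) for p in theta], " ".join(words))
```

Topic modeling in practice runs this story in reverse: given only the words, it estimates the topics and per-document mixtures that most plausibly produced them.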