Department of Computer Science University of Colorado Boulder

Colloquium - McCallum

ECCR 265

Joint Inference and Probabilistic Databases for Large-Scale Knowledge-Base Construction
Andrew McCallum, University of Massachusetts, Amherst

Wikipedia's impact has been revolutionary. The collaboratively edited encyclopedia has transformed the way many people learn, browse new interests, share knowledge and make decisions. Its information is mainly represented in natural language text. However, for many tasks more structured information is useful because it better supports pattern analysis and decision-making. In this talk I will describe multiple research components useful for building a large, structured knowledge base, including information extraction and entity resolution, joint inference with conditional random fields, probabilistic databases to manage uncertainty at scale, robust reasoning about human edits, tight integration of probabilistic inference and parallel/distributed processing, and probabilistic programming languages for easy specification of complex graphical models. I will also discuss applications of these methods to scientometrics and a new publishing model for science research.

Andrew McCallum is a Professor and Director of the Information Extraction and Synthesis Laboratory in the Computer Science Department at the University of Massachusetts Amherst. He has published over 200 papers in many areas of AI, including natural language processing, machine learning, data mining and reinforcement learning, and his work has received over 25,000 citations. He obtained his PhD from the University of Rochester in 1995 with Dana Ballard and held a postdoctoral fellowship at CMU with Tom Mitchell and Sebastian Thrun. In the early 2000s he was Vice President of Research and Development at WhizBang Labs, a 170-person start-up company that used machine learning for information extraction from the Web. He is a AAAI Fellow and the recipient of the UMass NSM Distinguished Research Award, the UMass Lilly Teaching Fellowship, and research awards from IBM, Microsoft and Google. He is the General Chair for the International Conference on Machine Learning (ICML) 2012, a member of the board of the International Machine Learning Society, and a member of the editorial board of the Journal of Machine Learning Research. For the past ten years, McCallum has been active in research on statistical machine learning applied to text, especially information extraction, co-reference, semi-supervised learning, topic models, and social network analysis. Work on search and bibliometric analysis of open-access research literature can be found at

Joint work with Michael Wick, Sameer Singh, Karl Schultz, Sebastian Riedel, Limin Yao, Brian Martin and Gerome Miklau.

The Department holds colloquia throughout the Fall and Spring semesters. These colloquia, open to the public, are typically held on Thursday afternoons, but sometimes occur at other times as well. If you would like to receive email notification of upcoming colloquia, subscribe to our Colloquia Mailing List. If you would like to schedule a colloquium, see Colloquium Scheduling.

Sign language interpreters are available upon request. Please contact Stephanie Morris at least five days prior to the colloquium.

May 5, 2012 (13:29)