4/26/2012 3:30pm-4:30pm ECCR 265
Joint Inference and Probabilistic Databases for Large-Scale Knowledge-Base Construction
Andrew McCallum, University of Massachusetts Amherst
Wikipedia's impact has been revolutionary. The collaboratively edited
encyclopedia has transformed the way many people learn, browse new interests,
share knowledge and make decisions. Its information is mainly represented in
natural language text. However, for many tasks more structured information is
useful because it better supports pattern analysis and decision-making. In this
talk I will describe multiple research components useful for building a large,
structured knowledge base, including information extraction and entity
resolution, joint inference with conditional random fields, probabilistic
databases to manage uncertainty at scale, robust reasoning about human edits,
tight integration of probabilistic inference and parallel/distributed
processing, and probabilistic programming languages for easy specification of
complex graphical models. I will also discuss applications of these methods to
scientometrics and a new publishing model for science research.
Andrew McCallum is a Professor and Director
of the Information Extraction and Synthesis Laboratory in the Computer Science
Department at the University of Massachusetts Amherst. He has published over 200
papers in many areas of AI, including natural language processing, machine
learning, data mining and reinforcement learning, and his work has received
over 25,000 citations. He obtained his PhD from the University of Rochester in 1995
with Dana Ballard and held a postdoctoral fellowship at CMU with
Tom Mitchell and Sebastian Thrun.
In the early 2000s he was Vice President of Research and Development at
WhizBang Labs, a 170-person start-up company that used machine learning for
information extraction from the Web. He is a AAAI Fellow, the recipient of the
UMass NSM Distinguished Research Award, the UMass Lilly Teaching Fellowship,
and research awards from IBM, Microsoft and Google. He is the General Chair for
the International Conference on Machine Learning (ICML) 2012, a member of the
board of the International Machine Learning Society and the editorial board of
the Journal of Machine Learning Research. For the past ten years, McCallum has
been active in research on statistical machine learning applied to text,
especially information extraction, co-reference, semi-supervised learning,
topic models, and social network analysis. Work on search and bibliometric
analysis of open-access research literature can be found at
rexa.info.
Joint work with Michael Wick, Sameer Singh, Karl Schultz, Sebastian Riedel, Limin Yao, Brian Martin and Gerome Miklau.