ITA Software's Technical Seminar Series

Talk Topic: Freebase - Google's Newly Acquired Semantic Database - Technical and Community Perspectives
Tom Morris, Freebase
September 1, 2010


Abstract

Freebase is a graph data store containing 330 million facts about 12.5 million concepts, including 1.6 million people, 3.4 million editions of 2.4 million books, and 106,000 movies. It participates in the Semantic Web/Linked Data universe by linking to other data stores like the BBC, the New York Times, and IMDb. We'll discuss both the technology and community aspects of the ecosystem, as well as the synergies they achieve. On the technology front, we'll talk about the graph store, query language, schema, and hosted server-side JavaScript app development environment with its associated IDE, as well as some of the data reconciliation and curation processes. On the community/social front, we'll discuss how the community collaborates to develop new schemas, create applications, and curate data in ways that assist the algorithms used for data reconciliation. From a strategic point of view, we'll speculate briefly on where Freebase could go in the future, based on where it is today and the strengths it has.


Bio

Tom Morris is the top external data contributor to Freebase and has contributed more than 1.4 million facts. He's been a member of the Freebase community for several years and, as an "expert," contributes to schema development, process refinement, community practices, and data curation/quality activities. When not hacking on Freebase, Tom is an independent software engineering and product management consultant specializing in the intersection of business and technology, especially new flavors of either and the intellectual property and strategic issues associated with both. He leads the open source ArgoEclipse UML plugin project and is a three-time Google Summer of Code mentor with extensive open source development experience.