Research:Named Entity Disambiguation
This page documents a planned research project.
Information may be incomplete and change before the project starts.
Key Personnel
- Erdal Kuzey
Project Summary
The project aims to build an application that extracts named entities from plain text and maps them to Wikipedia articles.
An example is as follows:
Input Text: Obama had an interesting discussion with Merkel during 37th G8 Summit.
Output Text: Obama had an interesting discussion with Merkel during 37th G8 Summit. (The recognized entity mentions are linked to their Wikipedia articles.)
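The input/output example above can be illustrated with a minimal sketch. It is not the project's prototype: the dictionary `TOY_DICTIONARY`, the function `annotate`, and the article titles in it are assumptions made for illustration, and the lookup is a plain longest-match dictionary search with no real disambiguation step. Matches are written out as wiki links of the form `[[Article title|mention]]`.

```python
import re

# Hypothetical toy dictionary: surface form -> assumed Wikipedia article title.
# A real system would build this from redirect pages, anchor texts, and
# disambiguation pages rather than hard-coding it.
TOY_DICTIONARY = {
    "Obama": "Barack Obama",
    "Merkel": "Angela Merkel",
    "37th G8 Summit": "37th G8 summit",
}


def annotate(text, dictionary):
    """Wrap every known mention in a wiki link.

    Returns a pair (annotated_text, mentions), where mentions is a list of
    (surface_form, article_title) tuples in order of appearance.
    """
    if not dictionary:
        return text, []

    # Try longer surface forms first so multi-word mentions are preferred.
    forms = sorted(dictionary, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(f) for f in forms) + r")\b")

    mentions = []

    def link(match):
        surface = match.group(1)
        title = dictionary[surface]
        mentions.append((surface, title))
        return f"[[{title}|{surface}]]"

    return pattern.sub(link, text), mentions


if __name__ == "__main__":
    sentence = "Obama had an interesting discussion with Merkel during 37th G8 Summit."
    annotated, found = annotate(sentence, TOY_DICTIONARY)
    print(annotated)
    print(found)
```

A dictionary lookup like this always picks the same article for a given surface form. Choosing between several candidate articles for an ambiguous mention is the actual disambiguation problem the project addresses, and it is deliberately left out of the sketch.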
Methods
A combination of machine learning techniques and engineering methods will be used to develop the entity disambiguation application. The current prototype relies heavily on redirect pages, anchor texts, and disambiguation pages. However, more data is needed to improve coverage and precision, and for this reason Wikipedia query logs are needed. A query log here means the search terms that users enter in the Wikipedia search box together with the results returned by the Wikipedia search engine. If one of the results is a good match, the user clicks it. Each entry in the query log is therefore expected to be a triple of the form (QUERY --> RESULT SET --> CLICKED RESULT).
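The query log triple described above could be represented as follows. This is a sketch under stated assumptions: the page does not specify a log format, so the class `QueryLogEntry`, its field names, and the function `click_priors` are hypothetical. Turning click counts into a per-query distribution over articles is one plausible way such logs might improve coverage and precision, not a method confirmed by the project.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class QueryLogEntry:
    """One (QUERY --> RESULT SET --> CLICKED RESULT) triple (assumed layout)."""
    query: str                       # search terms typed by the user
    result_set: Tuple[str, ...]      # article titles returned by the search
    clicked_result: Optional[str]    # title the user clicked, or None


def click_priors(entries: Iterable[QueryLogEntry]) -> Dict[str, Dict[str, float]]:
    """Estimate, for each query, the fraction of clicks each article received.

    Entries without a click, or whose click is not in the result set,
    are ignored. Queries are compared case-insensitively.
    """
    counts = defaultdict(Counter)
    for entry in entries:
        clicked = entry.clicked_result
        if clicked is not None and clicked in entry.result_set:
            counts[entry.query.strip().lower()][clicked] += 1

    priors = {}
    for query, counter in counts.items():
        total = sum(counter.values())
        priors[query] = {title: n / total for title, n in counter.items()}
    return priors


if __name__ == "__main__":
    # Made-up entries for illustration only; not real Wikipedia log data.
    results = ("Barack Obama", "Obama, Fukui")
    log = [
        QueryLogEntry("Obama", results, "Barack Obama"),
        QueryLogEntry("obama", results, "Barack Obama"),
        QueryLogEntry("Obama", results, "Obama, Fukui"),
        QueryLogEntry("Obama ", results, "Barack Obama"),
        QueryLogEntry("Obama", results, None),
    ]
    print(click_priors(log))
```

Only the aggregated counts are needed for this kind of estimate, so the sketch stores no user identifiers in an entry.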
Dissemination
All outcomes of this project will be published and made publicly available. The Wikipedia Research team and Wikipedia will be referenced and acknowledged in any publication of this work.
Wikimedia Policies, Ethics, and Human Subjects Protection
No private data will be published, so the project is considered ethically and scientifically sound.
Benefits for the Wikimedia community
In addition to scholarly publications of this work, a web service will be released to annotate plain text automatically, so that free text on Wikipedia can be annotated with Wikipedia interlinks.
Time Line
Funding
The project is supported by a PhD grant from the Max Planck Institute for Informatics.
References
Similar work by my colleagues: http://www.mpi-inf.mpg.de/yago-naga/aida/ A demo version is also available on that site.
External links
Contacts
Erdal Kuzey ekuzey (at) mpi-inf (dot) mpg (dot) de