Bengali Wikipedia 10th Anniversary Celebration Kolkata/Submissions/Bringing Entity-specific information from Wikipedia sisters under one umbrella

This is an accepted submission for the Bengali Wikipedia 10th Anniversary Conference. It has been accepted as a poster; the presentation time is Saturday, 10th January, 14:00 - 16:00 hours, in front of Dr. K.P. Basu Memorial Hall.

Submission no.

Title of the submission: Bringing Entity-specific information from Wikipedia sisters under one umbrella

Type of submission (discussion, hot seat, panel, presentation, tutorial, workshop)

Author of the submission: Siv Ghosh

E-mail address

Username

Country of origin: India

Affiliation, if any (organisation, company etc.)

Personal homepage or blog

Abstract (at least 300 words to describe your proposal)

Each Wikipedia article is centered on one entity, and information is retrieved at the level of the page/document. When searching the web for entity-specific information, people tend to refer to Wikipedia first. However, different types of information about a specific entity may be spread across different Wikipedia sister projects, and it is time-consuming for the user to open a new tab for each project to search for the same entity. We therefore propose to improve accessibility by gathering the information available on a specific entity from the different Wikipedia sister projects. When the user searches for an entity, the result is a ranked list of links to the pages of the Wikipedia sister projects that contain information about that entity. We have chosen the film domain as the application. The approach can be extended to other domains, and the search technique could be applied on the main page of Wikipedia.

The aim is to enable direct retrieval of the Wikipedia page/document about an entity. Our methodology begins with the identification of information sources, i.e. the Wikipedia page/document that matches the entity name mentioned in the user’s query. Each time the user makes a query, the namespace is appended to the URI of the resource, so the desired Wikipedia page/document is retrieved on the fly. Knowledge is extracted directly from the source and stored in database tables. The interface currently supports only the General Entity Retrieval Query (GERQ), i.e. it retrieves only those Wikipedia pages/documents that match the specified entity. Open-source software has been used throughout.

An entity can be regarded as a thing of any kind that can be classified into ‘classes’ and that has certain ‘attributes’. Our future work addresses two issues: each entity is linked with other entities via semantic relations, and each Wikipedia page/document is dynamic. To link semantically related entities, we will derive knowledge from the extracted entities, categories, attributes, and facts. A "fact" related to an entity is a tuple of the form < entity, attribute, value >, where the value itself corresponds to an entity. The dynamic nature of each Wikipedia page/document will be handled by the on-the-fly approach (as done by DBpedia Live): data is gathered from the source with the help of a wrapper and stored in a Semantic Web compliant form.
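
As an illustration only, the following minimal Python sketch shows one way the on-the-fly link construction and the < entity, attribute, value > fact described in the abstract might look. It is not the poster's implementation. The list of sister projects, the function name candidate_links, and the reading of "namespace" as a per-project base URL to which the entity name is joined are all assumptions made for this example. The order of the returned links is simply the order of the table, not the ranking the abstract refers to.

```python
from typing import List, NamedTuple, Tuple
from urllib.parse import quote

# Assumed set of sister projects and their page-URL prefixes (illustrative only).
SISTER_PROJECTS = {
    "Wikipedia": "https://en.wikipedia.org/wiki/",
    "Wikiquote": "https://en.wikiquote.org/wiki/",
    "Wikimedia Commons": "https://commons.wikimedia.org/wiki/",
}


def candidate_links(entity: str) -> List[Tuple[str, str]]:
    """Build one candidate page URL per sister project for an entity name.

    Hypothetical helper: it only constructs URLs and does not check
    whether the page actually exists on each project.
    """
    # MediaWiki page titles use underscores in place of spaces.
    title = quote(entity.strip().replace(" ", "_"), safe="()_,:")
    return [(project, base + title) for project, base in SISTER_PROJECTS.items()]


class Fact(NamedTuple):
    """A fact of the form < entity, attribute, value >."""
    entity: str
    attribute: str
    value: str  # the value is itself the name of an entity


# Example from the film domain.
links = candidate_links("Pather Panchali")
fact = Fact("Pather Panchali", "director", "Satyajit Ray")

for project, url in links:
    print(project, url)
```

Because the value of a fact is itself an entity, passing fact.value back into candidate_links would give the links for the related entity, which is one simple way to follow a semantic relation from one entity to another.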

Track: Technology, Interface & Infrastructure

Language of Track: English

Length of session (if other than 30 minutes, specify how long)

Will you attend the conference in Kolkata at your own cost if your submission is not accepted?

Slides or further information (optional)

Special requests

Interested attendees

If you are interested in attending this session, please sign with your username below. This will help reviewers to decide which sessions are of high interest. Sign with a hash and four tildes. (# ~~~~).

Add your username here.