WikiConference India 2011/Submissions/Creating Treebanks for Indian languages

Timestamp
15:25, 12 August 2011 (UTC)
Title of the submission

Developing Treebanks For Indian Languages

Type of submission (workshop, tutorial, or presentation)

Presentation

Author of the submission

Shahid Bhat

E-mail address or username (if username, please confirm email address in Special:Preferences)

shahid.bhat3@gmail.com

State of your origin (Country, if you are not based in India)

Jammu & Kashmir

Affiliation, if any (organization, company etc.)

Linguistic Data Consortium for Indian Languages (LDCIL)

Personal homepage or blog

Ldcil.org

Abstract (maximum 500 words)

A treebank is a machine-readable repository of the syntactic structures of a language. It serves predominantly as a bank of training data for machine learning tasks aimed at developing language technology tools such as syntactic parsers, POS taggers, and morphological analyzers. Such linguistic knowledge banks are not only crucial for developing NLP applications for resource-poor Indian languages but are also instrumental in traditional linguistic and interdisciplinary studies, such as sociolinguistic, psycholinguistic, and cognitive science research. Treebanks can be augmented with any sort of linguistic or non-linguistic information, so that they serve as a real hub of knowledge about the language, literature, culture, and society of this multicultural, multi-ethnic, and multilingual country. Given the capacity of treebanks to hold such a vast amount of knowledge, and their commercial utility as language data for language technology, it will be fruitful to develop such wider-scope treebanks for Indian languages. Since the mission of Wikipedia is the acquisition, storage, and dissemination of knowledge, I would like to call these wider-scope treebanks "Wikitreebanks", for example Marathi-Wikibank, Urdu-Wikibank, Kashmiri-Wikibank, and Hindi-Wikibank.

Track (Community/Knowledge/Outreach/Technology)

Knowledge/Technology

Will you attend Wikiconference if your submission is not accepted?

Yes

Slides or further information (optional)