Wikimedia Diversity Conference 2013/Documentation/T Vishnu Vardhan

Session: T Vishnu Vardhan // So many languages: Challenges and opportunities for Wikimedia movement in India

Abstract

Due to the large number of languages that exist in India, the country has more than 20 different Wikipedia language versions. However, most of them are significant neither in size nor in quality. The presentation looks at the challenges and opportunities that India's language diversity presents to the Indian Wikimedia movement. It also speculates about possible models by which each language could contribute to other Wikipedia language versions. The presentation will critically look at the following questions: Could language diversity diversify and develop the Indian language versions of Wikipedia? Is there really unity in diversity?

Starting point / Insights

Looking at the language diversity in India

  • Half of all modern languages may vanish by 2100, according to a 2010 UNESCO report
  • Large languages also die (Latin, ancient Greek, Sanskrit)
  • 179 languages and 544 dialects in the Indian Empire; according to the 1961 census reports, there are 1,652 mother tongues in India
    • 122 languages spoken by 10,000 or more people
    • The Constitution recognizes 22
    • 87 used in print media
    • 71 used in radio
    • 66 scripts (Konkani has 5!)
  • Many languages were never put into script
    • 220 languages died in the last 50 years
    • 150 could die soon

Wikipedia in India

  • 22 languages
  • More than 20 Indian language Wikipedias in incubation
  • Page views in Indian language Wikipedias are increasing, but there's a major lack of articles and editors for these languages

Challenges and Opportunities

  • Orality of languages (mostly spoken languages)
  • Knowledge production in Indian languages is mostly on the creative side; little critical writing is being produced in these languages. The bulk of knowledge production is state funded.
  • Aspiration for English education (youth are more interested in English than in Indian languages)
  • Lack of standards (research, etc.) compared to other Wikipedias
  • Translation challenges (inter-Indic translation hasn't happened)
  • Regional identity is associated with language (big cities like Delhi, Bombay, Hyderabad, or Bangalore have 300+ languages)

Technical challenges

  • Inadequate support for Indian scripts in the digital environment (not just Wikipedia but all technology devices; lobbying the government to establish requirements)
  • Typing is a major challenge (too many letters and not enough keys when mapping to a Latin keyboard); responses include video tutorials and Indic typing at school level to train students how to type
  • OCR: Indian language scripts are too complex; lobbying C-DAC (a government agency)
  • Dearth of quality content available in digital format (looking at digitization of encyclopedias and releasing them under CC licenses)
  • Different standards/formats/generations
  • Insularity of existing niche communities (18-25 year olds with a tech background; the community doesn't diversify); bring in thematic experts
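
The typing challenge noted above (more letters than keys on a Latin keyboard) is commonly worked around by mapping short key sequences to single letters. The following is a minimal sketch of that idea for Devanagari, assuming a greedy longest-match scheme. The mapping tables and the function names `longest_match` and `transliterate` are hypothetical illustrations, not an existing input-method standard or anything presented in the session.

```python
# Illustrative only: these tables are a made-up subset, not a real input-method standard.
CONSONANTS = {"k": "क", "kh": "ख", "g": "ग", "m": "म", "n": "न", "r": "र", "l": "ल"}
# Dependent vowel signs attached to a preceding consonant ("a" is inherent, so it adds nothing).
VOWEL_SIGNS = {"a": "", "aa": "ा", "i": "ि", "ii": "ी", "u": "ु"}
# Independent vowel letters, used when no consonant precedes the vowel.
INDEPENDENT_VOWELS = {"a": "अ", "aa": "आ", "i": "इ", "ii": "ई", "u": "उ"}
VIRAMA = "्"  # marks a consonant that has no following vowel


def longest_match(text, start, table):
    """Return the longest key of `table` (2 chars, then 1) found at `start`, or None."""
    for length in (2, 1):
        chunk = text[start:start + length]
        if len(chunk) == length and chunk in table:
            return chunk
    return None


def transliterate(text):
    """Convert Latin key sequences to Devanagari using the toy tables above."""
    out = []
    i = 0
    while i < len(text):
        consonant = longest_match(text, i, CONSONANTS)
        if consonant is not None:
            out.append(CONSONANTS[consonant])
            i += len(consonant)
            # A consonant is followed either by a vowel sign or by a virama.
            vowel = longest_match(text, i, VOWEL_SIGNS)
            if vowel is not None:
                out.append(VOWEL_SIGNS[vowel])
                i += len(vowel)
            else:
                out.append(VIRAMA)
            continue
        vowel = longest_match(text, i, INDEPENDENT_VOWELS)
        if vowel is not None:
            out.append(INDEPENDENT_VOWELS[vowel])
            i += len(vowel)
            continue
        # Anything not in the tables (spaces, digits, unmapped letters) passes through.
        out.append(text[i])
        i += 1
    return "".join(out)


print(transliterate("khiira"))  # two keys "kh" produce the single letter ख
print(transliterate("kamala"))
```

Even this toy version shows where the difficulty comes from: one letter may need two keystrokes, vowels take different forms depending on position, and a sequence such as "kh" is ambiguous between one letter and two. A usable input method would need far larger tables and rules for conjuncts.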

Opportunities

  • Leveraging the three-language policy of education
  • Boom in higher education: the number of Indian universities and colleges is growing dramatically. Trying to harness student time and to integrate Wikipedia into the pedagogy through the Christ University partnership (1,600 students, 5 Indic languages).
  • e.g., Lilavati's Daughters project on women scientists in India: multilingual articles on the same topics
    • creating project structures, pool of resources so anyone can use them
    • building a sense of healthy competition between communities
  • sharing the periodic metrics and analyses with the community

Some questions:

  • What happens to the diversity of the 700-odd Indian languages?
  • Do we need to put energy into bringing them on Wikimedia projects?
  • Language diversity preservation needn't be tied to encyclopedic knowledge production

WikiSpeech - Project ideas

  • What to collect, how to collect, potential of mobile, viz and metadata, mediawiki and wiki models
  • pad.ma

Questions / Next steps recommendations

Q: Should Wikimedia be in the business of language preservation? A: Yes.

Q: Should we do oral citations? A: Yes.

Q: How to balance all opportunities vs actually achieving something? A: By running pilots with a diverse group of languages and seeing which works best.