Grants talk:Project/Developing and Enhancing Gurmukhi-Shahmukhi Machine Transliteration Service for Wikipedias

Language variants

@Tej74i: Hello! It's exciting to know that you are working to bring the two Panjabi Wikipedias closer together. Are you aware of the LanguageConverter tool which is used by some Wikipedias? For example, it is used by the Kazakh Wikipedia so that both Latin and Cyrillic scripts can be used to read and write articles, and by the Chinese Wikipedia so that both traditional and simplified characters can be used.

From what you have written, it seems that the situation of Gurmukhi and Shahmukhi is more complex because of the missing short vowels, but you should consider whether your work could be made easier by using the LanguageConverter system. If this seems interesting to you, I can try to connect you with the Wikimedia Foundation staff members who know more about this topic.—Neil P. Quinn-WMF (talk) 21:32, 3 March 2017 (UTC)

@Neil P. Quinn-WMF: Hello! The LanguageConverter tool is a good option to work with, but we still have to work out how our system will integrate with it.

Eligibility confirmed, round 1 2017

This Project Grants proposal is under review!

We've confirmed your proposal is eligible for round 1 2017 review. Please feel free to ask questions and make changes to this proposal as discussions continue during the community comments period, through the end of 4 April 2017.

The committee's formal review for round 1 2017 begins on 5 April 2017, and grants will be announced 19 May. See the schedule for more details.

Questions? Contact us.

--Marti (WMF) (talk) 19:53, 27 March 2017 (UTC)

Dear tej74i,

Thank you for submitting this project. I have three pieces of feedback:

  1. Since the changes you are proposing would be significant for the communities affected, it is very important that you seek feedback from Punjabi-speaking communities here on your talk page. Is this what these communities want? If so, ask them to comment and describe the needs that your project meets for them. If they have concerns, ask them to post about those concerns. We will want to hear their opinions about this project and whether they see this as an effective strategy for addressing their concerns. If you like, you can reach out to I_JethroBT_(WMF), our community organizer, for ideas about how to solicit feedback.
  2. Also, please provide more details about what the “Wikimedia Service” will be, technically speaking. Is it a bot? An extension? We need you to be much more specific about what you intend to build and how you will build it, so we can evaluate the feasibility of the project.
  3. Please say more about your experience in MediaWiki development.

Kind regards, --Marti (WMF) (talk) 02:23, 3 April 2017 (UTC)

Dear Marti (WMF), Sorry for the delay!

  1. We are new to MediaWiki development. This idea came out of a meeting and discussion with the wiki team in August 2016 at Chandigarh.
  2. The “Wikimedia Service” will be a web service that handles transliteration requests from the wikis (Gurmukhi-Shahmukhi, as Unicode text). It will be hosted either by Wikipedia or by us.
  3. Kindly see the discussion where Punjabi people ask for conversion between the two scripts. This is now possible by integrating our system with Wikimedia.

best regards

Round 1 2017 decision


This project has not been selected for a Project Grant at this time.

We love that you took the chance to creatively improve the Wikimedia movement. The committee has reviewed this proposal and not recommended it for funding, but we hope you'll continue to engage in the program. Please drop by the IdeaLab to share and refine future ideas!

Next steps:

  1. Visit the IdeaLab to continue developing this idea and share any new ideas you may have.
  2. Applicants whose proposals are declined are welcome to consider resubmitting their application in a future round. We ask that you first email projectgrants wikimedia · org to indicate your interest in resubmission so staff can review any concerns with your proposal that contributed to a decline decision, and help you determine whether resubmission makes sense for your proposal.
  3. Check back at the schedule for information about the next open call to submit proposals.

Questions? Contact us.

Aggregated feedback from the committee for Developing and Enhancing Gurmukhi-Shahmukhi Machine Transliteration Service for Wikipedias

Scoring rubric Score
(A) Impact potential
  • Does it have the potential to increase gender diversity in Wikimedia projects, either in terms of content, contributors, or both?
  • Does it have the potential for online impact?
  • Can it be sustained, scaled, or adapted elsewhere after the grant ends?
(B) Community engagement
  • Does it have a specific target community and plan to engage it often?
  • Does it have community support?
(C) Ability to execute
  • Can the scope be accomplished in the proposed timeframe?
  • Is the budget realistic/efficient?
  • Do the participants have the necessary skills/experience?
(D) Measures of success
  • Are there both quantitative and qualitative measures of success?
  • Are they realistic?
  • Can they be measured?
Additional comments from the Committee:
  • It could be scaled to other languages but with a large previous study.
  • The impact is not clear, outcomes and impact are not evaluated.
  • Providing knowledge to everyone in a language they speak and read is our mission.
  • This is a perfect fit into the “infrastructure” strategic priority which will also help readers and editors. It should have a high impact on the Punjabi community.
  • At the moment it seems to solve a regional and limited problem. Given the high diversity of Indian languages, more analysis of how to share solutions would have been needed.
  • Lots of other languages face the issue of multiple scripts: it is also, for instance, the case of native languages in North Africa, where three scripts exist. These languages have been staying in the incubator for years and have no official Wikipedia yet, because the issue of "which script to use" splits the (already small) community. Lessons learned from automatic script conversion, in terms of community dynamics, would be very useful.
  • I think that risks are very high compared to the potential impact. 97% accuracy is not at all suitable for Wikipedia, anything different from 100% cannot be implemented in production (no one needs an encyclopaedia with 3% of mistakes).
  • As a software developer, I find the list of activities "etherous", because the grantee doesn't have sufficient expertise to write about MediaWiki development. I've worked previously on other transliteration projects and the process isn't too traumatic. So, I don't see sufficient expertise with MediaWiki core to tell us about the project.
  • Not clear.
  • Minimal MW experience.
  • Budget too high for the administration costs.
  • Participants seem to have no experience with MediaWiki development, which will be crucial for the success of the project. The budget is vague.
  • Lack of interest and support. This project is interesting and useful for the community, but without support, I see it as a one-shot project.
  • Not clear.
  • Not aware of similar pre-existing tools.
  • No clear community support.
  • Such a proposal needs support, or at least a discussion with the concerned community, while there was not even an attempt to do it.
  • The grantee was not promptly responsive to questions and I don't see sufficient expertise with MediaWiki to support funding the project. There is a lot of good will, but as a project there are some problems to solve: the budget isn't well documented, there is no community message (or at least, the grantee doesn't indicate one), and there is no support from a long-time Wikipedian from the projects.
  • Asking for more than $10,000 would probably require more effort in writing the proposal. The weak proposal and the short description, limited to a budget and a bio, show a very limited effort in my opinion.
  • Too much for the administration costs of that part of the world.
  • No clear community support.
  • This project is not worth funding in its current form for a number of reasons: in order to be successful, this project should have an expected accuracy of 100%, have a stronger team involving people with MediaWiki experience, and be discussed with the concerned Punjabi communities.

Transliteration Accuracy

Like all other Arabic-script-based text, Shahmukhi text is usually written without short vowels. There are character-level and word-level ambiguities that need to be addressed while transliterating the text, so achieving 100% transliteration accuracy is practically not possible. I would like to hear of any MT system that works at 100% accuracy. Even so, we have been able to achieve 97% word-level accuracy, which is very good, and we will work to increase it further in the new project, if granted. Even among the 3% of words with errors, the majority have no serious transliteration error beyond one or two characters in the word being wrongly transliterated. Now take a practical example of transliteration that we are all familiar with: Google Input Tools gives you many options for a single input. Why not just the top one? Because there is ambiguity. I hope my point is clear.
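
For readers who want to see what "word-level accuracy" and "one or two characters wrong" mean in practice, here is a minimal sketch of how such figures could be computed. It is only an illustration and not the evaluation code used for the system discussed here: the function names are made up for this example, the word lists are hypothetical romanized placeholders rather than real Gurmukhi or Shahmukhi data, and it assumes the reference and system output are already aligned word by word.

```python
def word_accuracy(reference, output):
    """Fraction of aligned word pairs where the output matches the reference exactly."""
    if len(reference) != len(output):
        raise ValueError("word lists must be aligned one-to-one")
    if not reference:
        return 0.0
    correct = sum(1 for r, o in zip(reference, output) if r == o)
    return correct / len(reference)


def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def error_report(reference, output):
    """List (reference, output, distance) for every word that was transliterated wrongly."""
    return [(r, o, edit_distance(r, o))
            for r, o in zip(reference, output) if r != o]


# Hypothetical placeholder data (romanized, NOT real evaluation data).
reference = ["kitab", "pani", "dil", "ghar"]
output = ["kitab", "pani", "dal", "ghar"]

print(word_accuracy(reference, output))
print(error_report(reference, output))
```

A word counts as an error here even if only one character differs, which is why a word-level figure looks stricter than a character-level one would on the same output. The per-word edit distance in the report shows how far off each wrong word actually is.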
