Grants talk:IEG/Semi-automatically generate Categories for Vietnamese Wikipedia

Alphama Converter 1.1.6

I use the tool above to help me find Vietnamese categories when I write an article based on an English one. Since German is my first foreign language, it would also be welcome if the tool could do the same for German. Anything that makes the tool more useful and saves editors time is a good idea. Thank you, Alphama, for your tool. DanGong (talk) 11:07, 27 September 2015 (UTC)

Eligibility confirmed, round 2 2015

This Individual Engagement Grant proposal is under review!

We've confirmed your proposal is eligible for round 2 2015 review. Please feel free to ask questions and make changes to this proposal as discussions continue during this community comments period.

The committee's formal review for round 2 2015 begins on 20 October 2015, and grants will be announced in December. See the schedule for more details.

Questions? Contact us.

Marti (WMF) (talk) 05:08, 4 October 2015 (UTC)

Links to community notifications

Hey there, Alphama. Thanks for submitting your proposal to generate categories in smaller Wikipedia projects. I'm also glad to see community support backing this proposal on the basis of helping with certain translation tasks and reducing the load of managing categories generally. I wanted to ask if you could provide any links under the Community notifications section showing where you notified communities on vi.wiki or elsewhere; this can be helpful not just for the grant committee, but also for future applicants, so they know which spaces might be good for notifying communities about their ideas. That said, if you notified individuals on their user talk pages, you can simply say that; you don't need to provide all of those individual links. Thanks, I JethroBT (WMF) (talk) 20:23, 7 October 2015 (UTC)

I notified some Vietnamese Wikipedians on their talk pages. I also updated the Community notifications section. Thank you. Alphama (talk) 02:52, 8 October 2015 (UTC)

IEG Interview scheduling

Hello, Alphama,

I've tried to contact you a couple of times over the last week via the email address you provided in your proposal, but I haven't yet been able to get a hold of you. I'd like to schedule an interview with you for next week to ask you some questions about your project. I need you to provide a working email address as soon as you possibly can to mjohnson wikimedia.org so we can set up a time that works for both of us.

Warm regards,

--Marti (WMF) (talk) 22:47, 12 November 2015 (UTC)

Aggregated feedback from the committee for Semi-automatically generate Categories for Vietnamese Wikipedia

Scoring criteria (see the rubric for background), scored from 1 (weak alignment) to 10 (strong alignment)
(A) Impact potential: 6.6
  • Does it fit with Wikimedia's strategic priorities?
  • Does it have potential for online impact?
  • Can it be sustained, scaled, or adapted elsewhere after the grant ends?
(B) Innovation and learning: 6.2
  • Does it take an innovative approach to solving a key problem?
  • Is the potential impact greater than the risks?
  • Can we measure success?
(C) Ability to execute: 6.4
  • Can the scope be accomplished in 6 months?
  • How realistic/efficient is the budget?
  • Do the participants have the necessary skills/experience?
(D) Community engagement: 6.8
  • Does it have a specific target community and plan to engage it often?
  • Does it have community support?
  • Does it support diversity?
Comments from the committee:
  • I see a huge impact in the local community, but I have mixed feelings about whether this tool could improve the articles or attract new users to Vietnamese Wikipedia.
  • I am definitely a heavy category user, so I can imagine there is a need for something like this on small projects, but I am not convinced of the need. Sometimes I wish category trees were more flat. Categories can be so precise that they defeat the purpose.
  • I like the target of helping small languages. The author has produced tools before.
  • Very innovative, but shouldn't we be using Wikidata for this?
  • The program has to be developed.
  • The person has shown success with wiki projects. While I'm a little concerned that the desired roles haven't been filled, I think this is still likely to succeed.
  • The project has huge support from the local Wikipedians, but I am unclear how this project may be applied to other Wikipedias.
  • People seem to support him and he has a history of working with the vi-wiki community on similar tools.
  • Budget seems a little high.
  • I am not convinced of the need for this project.
  • I am unsure about funding a project in Vietnamese Wikipedia in which there are more than 1 million articles, all bot-generated, and a small community. A better balance is to have a well-engaged community helped by bots. Vietnamese has a sufficient number of speakers to aim to achieve more.

Open datasets

This is a great proposal. I'll be adding my endorsement after I save this post. But in the meantime, I'd like to make some requests for open datasets!

It seems like there are some good opportunities to release open-licensed datasets as you (Alphama) do your work. E.g., I could imagine that your input categories from English Wikipedia and the resulting triples could be immensely valuable as a well-documented dataset. These could be used as input for applying the strategies to future wikis and as fodder for future work in the wiki research community.

I'd also like to see the English Wikipedia triples paired with the decisions made during review by Alphama and other Vietnamese Wikipedians. Future algorithms could be trained and tested against such wellsprings of high-quality human judgement.

Projects like this are complex and I suspect that Alphama is going to find some sneaky difficulties along the way. Releasing open datasets is a great way to hedge against the risk that his work is not immediately successful (within the bounds of the grant). If there are open datasets, I'll be able to market them and the difficult problems that Alphama discovers to the wiki research community for future work.

Would it be reasonable to add this to the project plan? I'd be happy to advise on how to get the datasets hosted/documented as well as to market the datasets to the wiki research community. --EpochFail (talk) 18:34, 1 December 2015 (UTC)

Hi, thanks for your comment. The datasets can be opened to the research community. However, I don't know much about licensing for these datasets; I hope someone can help me. Alphama (talk) 01:59, 4 December 2015 (UTC)

Round 2 2015 decision

Congratulations! Your proposal has been selected for an Individual Engagement Grant.

The committee has recommended this proposal and WMF has approved funding for the full amount of your request, $7,000.

Comments regarding this decision:
As reflected by the endorsements, this proposal demonstrates strong backing from the Vietnamese Wikipedian community. The committee supports the community’s interest in introducing a category classification system for existing articles and looks forward to seeing open-licensed datasets produced along the way.

Next steps:

  1. You will be contacted to sign a grant agreement and set up a monthly check-in schedule.
  2. Review the information for grantees.
  3. Use the new buttons on your original proposal to create your project pages.
  4. Start work on your project!
Questions? Contact us.
Thank you. Alphama (talk) 04:50, 5 December 2015 (UTC)
Congrats! Tuanminh01 (talk) 05:57, 9 December 2015 (UTC)