Grants talk:PEG/Interglider.ORG/Wiktionary Meets Matica Srpska

GAC

GAC members who support this request

  1. In general it's OK, also considering the split of the financial support. It is too complex to be understood easily and has too many deliverables in my opinion, but in general it's good. --Ilario (talk) 14:16, 19 January 2015 (UTC)Reply

GAC members who oppose this request

GAC members who abstain from voting/comment

GAC comments

Dry Martini's questions

  • I believe this idea has a lot of potential, mainly because of the contacts with venerable institutions, the mass of free content that could flow into various projects, and the potential long-term benefits in software development for MediaWiki. However, I'm afraid some of the benefits could be lost or misdirected:
    1. First of all, I'm not sure I understood what this Vojvodina languages dictionary is: is it a multilingual dictionary (like a French-English dictionary would be) or something else?
    2. I'd like to know more about this Serbian ornithological dictionary. Isn't it more of an encyclopedia? And if so, wouldn't it be best to spread the benefits of this project to Wikipedia and/or Wikispecies? Surely Wiktionary would benefit a lot from the increased traffic, but maybe two (or three) different projects would have a wider effect (a shock on a single project is not always advisable, I'll get to that later): what if you focused the Vojvodina dictionary on Wiktionary and you spread the benefits of the ornithological dictionary among Serbian Wikipedia and Wikispecies?
    3. Now about volumes: we are talking about a huge community impact, which the community itself might not be able (or willing) to withstand. What I don't see in the proposal is an analysis of the "aftermath" of your work: do you think Serbian Wiktionary can handle that mass of entries after they have been entered? Will there be enough patrollers, admins, users to properly maintain such a huge amount of info? Also consider that communities don't always welcome massive inputs: do you have community consensus for this kind of work? Back to my previous suggestion: spread the impact (and benefits) of your idea, involve other editors from other wikis; this way you'll ensure none of your efforts (and of WMF money: not nice to put it this way, but isn't it what we are here for, after all?) get lost.
      1. How confident are you about the responsiveness of the Serbian community? Do you have direct experiences that suggest they will provide enough feedback and cooperation? Keep in mind that overall participation in the Wikimedia movement is declining, so a success from 3 years ago might not be the right thing to count on.
    4. I didn't quite grasp how the programmer would fit in the picture: is the programmer supposed to enter the entries himself? He should develop some programs, if I'm not mistaken, which would help to further develop Wiktionary, right? Can you produce a more detailed explanation of what these programs would be and how they could help Wiktionary? Have you considered involving any MediaWiki developer?
    5. I know you've already been asked this, but... It's a LOT of entries, are you sure you can enter them all, all by yourselves? If you can (and I don't doubt it, just asking), I'd love to have you guys at it.wikipedia :P
    6. What do you mean by community manager?
    7. [added] Hosting and administration, budget line 7. What did you include in there?
You have done a good job with this proposal, really. Thank you for your answers :) --Dry Martini (talk) 23:59, 26 January 2015 (UTC)Reply
Response by Milos

Dry Martini, thanks for your questions and input :) Here are my responses:

  1. The Dictionary of Serbian Speeches of Vojvodina is a dictionary of the Serbian dialects of Vojvodina. Those dialects are very close to the standard, but they have their own specific words (~30,000). I suppose that a good parallel in the case of Italian would be a dictionary of the Italian speeches of Tuscany (which is, AFAIK, the dialect basis for the contemporary Italian standard). It's not like a Neapolitan-Italian dictionary, as Neapolitan is too distant from the Italian standard. --Millosh (talk) 15:56, 27 January 2015 (UTC)Reply
  2. Serbian Ornithological Dictionary is a dictionary :) It's not an encyclopedic dictionary, it's a dictionary :) It goes like: "<latin name for the species>, <primary serbian name for the species> <grammatical and lexicographical attributes for that word>, <serbian name 2 for the species> ..., <serbian name 3 for the species> ..." + illustration. Thus, it's not that useful for an encyclopedia, but it could be partially useful for both Wikipedia and Wikispecies. We said in the project proposal that one of our goals is to enrich Commons (with illustrations) and likely Wikispecies and Wikipedia. That's a secondary goal, but, again, we'll do that. Note also that our idea is to spread the benefits of this dictionary not just to Serbian Wikipedia, but to all Wikimedia languages, which primarily means Wiktionaries, but it could be used for Wikipedias as well. --Millosh (talk) 15:56, 27 January 2015 (UTC)Reply
  3. In relation to your third question, there are a couple of issues:
    1. Serbian Wiktionary, along with all the other projects (maybe with the exception of Wikinews at some point in time), is, in the sense of community, an offshoot of the Serbian Wikipedia community. That's the usual case for non-Wikipedia projects in most Wikimedia languages. Thus, it's maintained "from outside", "from Wikipedia". There are two ongoing projects (I think 1001 Arabic Word has been running for almost two years), but it can't be treated as an independent community, separate from the Serbian Wikipedia community. --Millosh (talk) 15:56, 27 January 2015 (UTC)Reply
    2. In relation to that, we actually need the Serbian Wiktionary and Wikipedia communities not to object to what we are doing. More than a month has passed since we informed them about the project, without any objection. --Millosh (talk) 15:56, 27 January 2015 (UTC)Reply
    3. Unlike structural work, maintenance is not that big an issue. A number of Serbian Wikipedia admins regularly check other projects, and the amount of their work on Wikipedia will always be much bigger than on any of the side projects. But there are, of course, other bits... --Millosh (talk) 15:56, 27 January 2015 (UTC)Reply
    4. As I already told Tony, the project itself is so complex that adding any more of our ideas would make the proposal completely unreadable. Yes, we have "aftermath" plans. This should be the beginning of cooperation with Matica srpska, but it should also trigger more organized work behind all Wiktionaries. If it goes as planned, the minimum would be that we would actually maintain Serbian Wiktionary (but not just Serbian Wiktionary) for years. If it goes a little bit better, we'd trigger the organization of a self-sustainable Serbian Wiktionary community. If it goes even better, we'd trigger the organization of more self-sustainable Wiktionary communities. --Millosh (talk) 15:56, 27 January 2015 (UTC)Reply
    5. In relation to the last point, we plan this project as something started with the content in Serbian, not as something which should end there. The idea is to do and trigger similar actions in other Wiktionaries (and, possibly, other Wikimedia projects). So, yes, we are thinking not just in the sense of our input to Serbian Wiktionary. --Millosh (talk) 15:56, 27 January 2015 (UTC)Reply
    6. In relation to your point 3.1, here is the background... Yes, I have been aware for around 8 years that participation in Wikimedia projects is declining. And I am aware that we need big projects to activate the community. And I want to do that through projects whose primary benefits are open enough to create the possibility of a much larger impact, as this project is. --Millosh (talk) 15:56, 27 January 2015 (UTC)Reply
  4. There are two parts of the work related to programming. One will be dominantly handled by the paid programmer, the other one will be dominantly handled by me:
    1. The main part of inserting a dictionary into Wiktionary (and into a more structured form of dictionary) is parsing the dictionary and preparing it for insertion into Wiktionary and a more structured database. I used to do that professionally, I know how intensive that work is, and I don't have the time (or willingness) to work on it. Hence the paid programmer. That part of the work will be done in Python (and inserted into Wiktionaries via Pywikipediabot). Some parts of that program could be reused in the future and some couldn't. The actual parsing won't be that reusable: every dictionary has its own formatting, and although the parsing could be generalized, I doubt how useful that would be for similar future projects. However, programming classes which deal with categorization and tagging, as well as with the general concepts of translating a dictionary into database-friendly data, will for sure be useful for people dealing with dictionaries in the future. --Millosh (talk) 15:56, 27 January 2015 (UTC)Reply
    2. My part of the work is related to the "online lexicographical tool", which would be, in fact, an online multilingual dictionary. The closest software to it is OmegaWiki (our implementation will actually be very similar to OmegaWiki). However, OmegaWiki is decade-old software. The other option was to go with Wikidata. However, Wikidata's internal dynamics are not synchronized with our need to finish this task in one year. On the other hand, Drupal core (that is, without any non-essential module) -- unlike MediaWiki -- has all the logic necessary to create a multilingual dictionary without a line of code. Thus, we'll go with Drupal to create the multilingual dictionary. In the future, it will be possible to create an interface from Drupal to MediaWiki, so we could use that dictionary from the MediaWiki shell. --Millosh (talk) 15:56, 27 January 2015 (UTC)Reply
  5. Yes, we'll do that by ourselves. I am familiar with mass content adding (I see I created that page in 2006 :) ). I know which problems could occur etc. Everything will be added according to the standards of particular communities. And, yes, we'd be glad to help Italian Wikipedia :) Just keep in mind: adding particular content is not a big issue. The issue is preparing that content for adding it into the projects. --Millosh (talk) 15:56, 27 January 2015 (UTC)Reply
  6. (I suppose that you've realized that the community manager and hosting and administration are parts of the project that would not be paid by WMF.) Community managers should deal with various communities: communicate with them and support the programmer and the rest of the team in relation to the needs of particular communities etc. As a former steward, I know that the most complex task in any multilingual project (like this one) is to organize work with various communities. One part of my idea for this project is to build human capacities for future multilingual projects. In other words, I want to train a number of people to become capable of dealing with and understanding multilingualism and multiculturalism by gathering experience in real projects. I think that the Wikimedia movement has a serious lack of that kind of profile. --Millosh (talk) 15:56, 27 January 2015 (UTC)Reply
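
To make the parsing step from point 4.1 more concrete, here is a minimal sketch in Python of turning one ornithological entry into database-friendly data. It assumes the comma-separated layout described in point 2 (Latin name first, then Serbian names, each optionally followed by short grammatical tags). The sample line, the tag set and all class and function names are hypothetical illustrations, not taken from the dictionary or from the project's actual software, and the step of inserting pages via Pywikipediabot is deliberately left out.

```python
from dataclasses import dataclass

# Assumed set of grammatical tags (masculine, feminine, neuter); the real
# dictionary may use different or additional abbreviations.
KNOWN_ATTRIBUTES = {"m", "f", "n"}


@dataclass
class Name:
    text: str              # the vernacular name itself
    attributes: list[str]  # grammatical tags found after the name


@dataclass
class Entry:
    latin: str             # Latin (taxonomic) name used as the headword
    names: list[Name]      # primary name first, then the variants


def parse_entry(line: str) -> Entry:
    """Split one comma-separated entry line into a structured Entry."""
    latin_part, _, rest = line.partition(",")
    names = []
    for chunk in rest.split(","):
        tokens = chunk.split()
        if not tokens:
            continue  # skip empty fields such as a trailing comma
        attributes = []
        # Trailing tokens that look like grammatical tags are attributes.
        while tokens and tokens[-1] in KNOWN_ATTRIBUTES:
            attributes.insert(0, tokens.pop())
        names.append(Name(" ".join(tokens), attributes))
    return Entry(latin_part.strip(), names)


# Hypothetical sample line, written only to illustrate the assumed layout.
entry = parse_entry("Parus major, velika senica f, senica f")
print(entry.latin, [name.text for name in entry.names])
```

Keeping the parser separate from the data classes reflects the remark above that the parsing itself is dictionary-specific, while the structured representation is the part most likely to be reusable.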

And thank you again for asking good questions :) --Millosh (talk) 15:56, 27 January 2015 (UTC)Reply

Thank you for your answers. I was going to add a point that Alex brought out, the part about participation and community involvement, but I guess I'll be just fine with your answer to Alex's question. --Dry Martini (talk) 20:39, 27 January 2015 (UTC)Reply
Hey, I answered Alex's questions some time ago. But in brief: it couldn't be said that a community around Serbian Wiktionary exists. There is an active and significant community around Serbian Wikipedia, but if they have some privileges on Serbian Wiktionary, it's mostly about basic maintenance. It is good that people working on Serbian Wiktionary don't object to the content addition, but although I could do it, it's not reasonable to artificially show that there is genuine interest among Serbian Wikipedians for Wiktionary if it doesn't exist. On the other hand, one of the side effects of this project would definitely be raising the visibility of Wiktionary inside Serbia (we'll have joint press releases with Matica srpska) and thus the number of Serbian Wiktionary editors, with the likelihood of creating a real community around it. --Millosh (talk) 23:58, 27 January 2015 (UTC)Reply

Community comments

Taxonomic entries for birds

English Wiktionary seeks to have taxonomic entries for all taxa that have attestable common names in any language (and for attestable higher taxa). Accordingly we would welcome access to a list of the 370 taxonomic names of birds. We probably already have some of them, but not all. I would be happy to make it a priority to make high-quality entries for all 370 and for all the English vernacular names for those taxa. DCDuring (talk) 16:43, 8 January 2015 (UTC)Reply

Thank you very much! Correct English names for the bird species are crucially important, as others would likely rely much more on English than on Latin names. --Millosh (talk) 11:06, 9 January 2015 (UTC)Reply

Complexity

This proposal page is the most complex I've ever seen. Many things are mentioned but not explained. Just a few examples:

Thanks Tony for the questions! The project itself is indeed complex and it wasn't feasible to explain everything on the project page. (As a person in charge for formalizing proposal itself, Milica stopped me adding new things before putting the proposal in draft form and opening it, both.) You've actually noticed very well that there are many things not explained, but if we tried to explain everything, the page itself would be much more complex. --Millosh (talk) 07:29, 12 January 2015 (UTC)Reply
The complexity of the project extends in at least the following directions:
  • Cooperation with Matica srpska itself opens to WMF and WMRS significant field. It is likely that a number of formal meetings would be required (Jan-Bart as chair, Lila as ED and Filip as president of WMRS), as well as discussions about further cooperation. That's not inside of the "shorter version" of the project, described on the project page.
  • Potential for cooperation with other similar institutions is even more complex. This project could easily trigger similar cooperation with numerous similar institutions from Slavic countries. Slovakian Matica is the first on the list, as the relations between Matica srpska and Matica slovenska are very close.
  • The number of entries is likely the most complex part to explain. If we go plainly with the number of entries from the dictionaries and only for Serbian Wiktionary, it's around ~35,000. However, both dictionaries have their own sides thanks to which we could make the number of entries much larger:
    • Dictionary of Serbian speeches of Vojvodina is the dictionary of non-standard varieties with 150,000 "secondary references", which are, in fact, standard words. The dictionary is of such form that we could extract all of them, thus making the number of words just for that dictionary 180,000 for Serbian Wiktionary (~30,000 is the number of entries inside of that dictionary). I think it's quite feasible to do that, but we didn't want to say that we'd make Serbian Wiktionary larger 10 times just thanks to that dictionary.
    • The contribution of Serbian ornithological dictionary heavily depends on Wiktionary communities (not just Serbian Wiktionary community). If we have 100 translations for the bird species, counting the various names for the birds in Serbian, we'd have 40,000 entries per Wiktionary (making it 4,000,000 in total). The similar number of entries will be added even before we start with adding that dictionary, as we need lexicographical terminology to be translated. If there are 100 lexicographical terms (like "noun", "n", "verb", "feminine", "f" etc.), the net contribution to 100 Wiktionaries would be 1,000,000.
  • Software itself is a complex issue. We didn't want to explain in detail what we would do with "the websites". It is quite possible that the websites themselves would be a major added value. They should contain the logic necessary to handle a multilingual dictionary, including easy export to Wiktionaries, as well as good communication with the parsing software. That's the part on which I will work most intensively. --Millosh (talk) 07:29, 12 January 2015 (UTC)Reply
  • If funded commercially -- i.e. if we had to buy everything -- this project could easily reach a $1M budget. Neither Matica nor Interglider set its prices in a commercial manner. Personally, I wouldn't do this job commercially below ~$100k. From one perspective, it's useless to give such comparisons entry by entry; from the other, you are right when you see that something is missing. The amount of work behind this project is much larger than can be seen from the budget itself. --Millosh (talk) 07:29, 12 January 2015 (UTC)Reply

Who funds and runs Matica srpska? The en.WP article is not clear on that.

Matica srpska is recognized as strategically important institution for Serbia. There is Matica Srpska Law (the page on Ministry for Culture, MS Word doc containing the law; both in Serbian). It is short and it says that Matica srpska is funded by Serbian government and by other means. --Millosh (talk) 07:29, 12 January 2015 (UTC)Reply
Besides Matica Srpska Law, there is Serbian Encyclopedia Law (PDF of the law, in Serbian), which defines that Matica srpska and Serbian Academy of Sciences and Arts are the holders of the creation of the encyclopedia. As Matica srpska is the primary encyclopedistics institution of Serbia, it actually organizes that work. --Millosh (talk) 07:29, 12 January 2015 (UTC)Reply

Translate from what language to what?

In the terms of Serbian ornithological dictionary, the dictionary itself is ordered by the Latin names of the species. Thus, it wouldn't be about translating from Serbian, but regular taxonomical work. DCDuring said that he is willing to contribute proper English names of the species, which means that the most other languages would translate the names from English. --Millosh (talk) 07:29, 12 January 2015 (UTC)Reply

It wasn't till I found the en.WP article on "Vojvodina" (not linked in the proposal) that I realised that "the speeches of Vojvodina" doesn't refer to public addresses by a politician. Later, "dialects" is used; is that the same as "speeches"? How many are there? How different are they? How many native speakers are there? They're not mentioned by name. I know this is a prototypical project, but bird names in these dialects seem rather abstruse. How many bird species are we talking about? Are they unique to this province?

Vojvodina is the northern region of Serbia, inhabited by approximately two million people. (Not looking into the data...) I think that between 70% and 80% are native Serbian speakers, while likely 95% are at least bilingual. Vojvodina has a long history of colonization and thus a variety of dialects are spoken there. The primary (and unique) dialects are Sumadija-Vojvodina and Smederevo-Vrsac. The Sumadija-Vojvodina dialect is one of the two dialect bases for the standard Serbian language (the other being the East Herzegovina dialect). --Millosh (talk) 07:29, 12 January 2015 (UTC)Reply
Keep in mind that we are talking about two dictionaries. One is Serbian ornithological dictionary, which includes proper Serbian names for ~370 species (we have approximation of 3,000-5,000 names in Serbian at the moment). Some of the species are endemic for Serbia, but the most of them belong to the wider European and Mediterranean pools. --Millosh (talk) 07:29, 12 January 2015 (UTC)Reply
The other dictionary is the Dictionary of Serbian speeches of Vojvodina. That dictionary is a proper dictionary of Vojvodina dialects/speeches and has ~30,000 entries. --Millosh (talk) 07:29, 12 January 2015 (UTC)Reply

Initial phase (pre-grant): "Establishing contact with Matica Srpska"—that hasn't been done already? There are five bullets, and it's unclear what has been done and what has not.

We are in the initial phase. It's mentioned that it's about pre-grant activities. The only thing which we didn't already do is the list of lexicographical terms to be translated into other languages (again, from English). --Millosh (talk) 07:29, 12 January 2015 (UTC)Reply

Matica Srpska: it just says "staff". We need some idea of how many staff and who (positions if we're being private about names, although specifics are usually an advantage when asking for funding).

That's a complex issue when you are talking about that kind of institution. For sure, we have one person to work with us (he is the author of the Ornithological dictionary, as well as the principal editor of the Vojvodina speeches dictionary). However, I haven't yet cleared with them whether we can mention him in the project. We'll do that during the next meeting, likely in February. On the other hand, we will have the wider support of Matica srpska, which includes all kinds of necessary technical support. In other words, we won't cooperate with just one person, but with as many as necessary. That's the tricky part of "staff". --Millosh (talk) 07:29, 12 January 2015 (UTC)Reply

Nothing said about your co-funding partner.

When you said "Nothing said about your co-funding partner." -- did you mean Milica? If so, I'll leave her to say a bit more about herself. (BTW, Google search on "Milica Gudovic" is giving the results about her.) --Millosh (talk) 07:29, 12 January 2015 (UTC)Reply

It's less than $2 an entry, by my quick calculation. Does that seem about right?

In relation to the price per entry: if we strictly compare ~35,000 entries with ~$40,000 budget, it is less than $2 per entry. However, the future benefits are even lowering the amount per entry. If we count 185,000 easily achievable entries, it's ~$0.25 per entry. If we count 5,000,000 entries, that would make the price less than $0.01 per entry. (BTW, this is also the part of the project complexity.) --Millosh (talk) 07:29, 12 January 2015 (UTC)Reply
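
As a rough check of the per-entry figures above, the arithmetic can be sketched in Python. The budget (~$40,000) and the three entry counts are the approximate numbers quoted in this thread; the scenario labels and variable names are only illustrative.

```python
# Approximate figures quoted in the discussion above.
BUDGET_USD = 40_000

# Entry-count scenarios; the labels are illustrative, not official terms.
SCENARIOS = {
    "dictionary entries only": 35_000,
    "with secondary references": 185_000,
    "with cross-Wiktionary translations": 5_000_000,
}


def cost_per_entry(budget: float, entries: int) -> float:
    """Return the budget divided evenly over the given number of entries."""
    return budget / entries


costs = {label: cost_per_entry(BUDGET_USD, n) for label, n in SCENARIOS.items()}

for label, cost in costs.items():
    print(f"{label}: ${cost:.3f} per entry")
```

With these inputs the first scenario comes out at a little over $1 per entry, the second at roughly $0.22, and the third at under one cent, which is in line with the rounded figures given in the reply.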

I'm concerned about how the Wiktionary community might be involved (or not involved); and I'm concerned about the "risk" that it may not happen. How many active editors are there? Do you have active, functional links with Wiktionary editors in other languages?

To make the core of the project working, we need actually very small level of participation of Serbian and Serbo-Croatian Wiktionary communities. Basically, we need from them to agree that they want those dictionaries and cooperate on low level with us (I am a kind of member of those communities as well and I am capable to create the framework for work on Serbian Wiktionary). That's for the ~35,000 to ~185,000 entries for Serbian and Serbo-Croatian Wiktionaries. If we want to reach the 5,000,000 target, we'd need participation of other Wiktionary communities. Thus, I don't think that there is any real issue in relation to achieving the basic goals. --Millosh (talk) 07:29, 12 January 2015 (UTC)Reply
We've discussed the issue of participation of Wiktionary communities a lot. In order to prevent a low level of participation we will 'poke' editors personally, asking them to participate, introducing them to our concept, potential benefits and so on, doing everything in our power to motivate as many communities as possible. This is an important activity within the project and we are not planning to leave it to chance, but to be very active in initiating and maintaining discussion with Wiktionary communities. --Godzzzilica (talk) 11:13, 12 January 2015 (UTC)Reply

Most of the extremely detailed and lengthy justifications in relation to the highly bureaucratic "global metrics" we're stuck with are not helping (Milos, this is not a criticism of you, but of the WMF's system). I find a lot of vagueness in the points that could not pin things down when it comes to reading a later report. Alex, do you really intend this amount of detail? I'd rather have just a few key ones, but with grit—that is, detail that can be tested against outcomes.

I am thankful to Milica for her work on project proposal. She applied all WMF guidelines and she will work on reporting. To be honest, the complexity of the project itself requires complex tools to analyze the products. If you have some particular question to Milica, I am sure she will respond to you in detail. --Millosh (talk) 07:29, 12 January 2015 (UTC)Reply
While preparing to write this proposal, I checked a lot of WMF documents. The "global metrics" were emphasized as a must, so we did our best to incorporate that evaluation methodology. We wanted to be sure that we are on the right track regarding evaluation and monitoring from the very beginning, especially having in mind the complexity of the project. We were also aware that it might burden the text of the proposal itself, but we do believe that, in the reports, this information will make sense and provide additional clarity in assessing the project's success. --Godzzzilica (talk) 11:13, 12 January 2015 (UTC)Reply

If you wish, please interleave your answers above for convenience, indented. Tony (talk) 08:32, 11 January 2015 (UTC)Reply

Comments by Wikimedia organizations

Wikimedia Serbia

Connections to Wikimedia Serbia

Seeing as this grant proposal is related to content in Serbian language and the activities are planned to be conducted in Serbia, it is natural to wonder what the relation to Wikimedia Serbia is. As I haven't found any mention of the chapter in this grant proposal (apart from the side mention that Miloš used to be the president of the organization), I thought it would be good to give a quick background from the side of WMRS. So, we were in contact with Matica Srpska during the summer of 2014 (there was one meeting with them in Novi Sad, where Miloš and WMRS employees were representing WMRS), and they submitted two project proposals for our Annual plan 2015. The WMRS board wasn't really satisfied with the proposals and we tried to organize a meeting with Matica Srpska representatives before the deadline of September 30 (for the APG grants), but that didn't turn out to be possible. Therefore, we had to exclude their projects from our Annual plan, thus temporarily halting the cooperation. Miloš has informed the WMRS board several weeks ago that he plans on submitting this proposal and there have been no responses in that regard, which I would take to mean that there is no objection to this on the part of Wikimedia Serbia. --FiliP ██ 17:00, 7 January 2015 (UTC)Reply

Thank you very much for the input regarding the WMRS position. We see this project as added value to WMRS activities and, as we mentioned in the letter you refer to, we hope that WMRS will take part in it in various ways - through direct involvement in project activities, but also in future promotion and development. We believe - the more activities on freeing the content, the merrier! --Godzzzilica (talk) 13:32, 8 January 2015 (UTC)Reply

Additional sources of revenue

I'm interested to see what the additional sources of revenue will be spent on. --FiliP ██ 17:00, 7 January 2015 (UTC)Reply

Inside of the budget, you could see that the additional sources of revenue will be spent on:
Ah, you're right, I missed that. Sorry and thanks :) --FiliP ██ 17:19, 8 January 2015 (UTC)Reply
You are welcome (: --Millosh (talk) 20:04, 8 January 2015 (UTC)Reply

WMF comments

Thank you for all the work you've put into this complex request and efforts at establishing a partnership with the Matica Srpska. It is an interesting project and while the topics are quite obscure, the potential for useful software and future partnerships are significant. The crux of the project seems to be having interest and commitment to engage from not only the Serbian Wiktionary community, but also the larger Wiktionary community. We appreciate the efforts you've made to notify the different language communities and mailing lists. From our research, you have received basic level of interest from the Esperanto and Spanish communities and endorsement from French and English. We are disappointed that there has been no engagement from the Serbian Wiktionary community or the rest of the 18 other communities you notified. Since so much of the impact of the project relies on engagement from these communities, please let us know how you plan to increase participation.

Please clarify whether the "expert consultancy" will be used to pay paid staff of Matica Srpska to review/organize the material. If yes, it would be helpful to understand why the staff need outside funding for this work.

Thanks, Alex Wang (WMF) (talk) 23:19, 25 January 2015 (UTC)Reply

Alex, thanks for your comments! Here are the answers related to the issues you've raised: --Millosh (talk) 17:58, 26 January 2015 (UTC)Reply
  • First, here is the hierarchy of the goals to be achieved:
    1. Liberating the content of two dictionaries. To do that, we need the Serbian and Serbo-Croatian Wiktionary communities not to object to our project, which should be counted as done, as nobody has objected during the past ~month. --Millosh (talk) 17:58, 26 January 2015 (UTC)Reply
    2. Creating the software for parsing dictionaries, which could be used as the basis for the next similar projects. That's why we have programmer. --Millosh (talk) 17:58, 26 January 2015 (UTC)Reply
    3. Creating online lexicographical tool as free software (initially just for Matica srpska and Interglider), which could be used for variety of purposes, including possibility to be integrally used for structured dictionary data as Wikimedia project. --Millosh (talk) 17:58, 26 January 2015 (UTC)Reply
    4. Creating the basis for the future cooperation with Matica srpska in relation to liberating their content. Successfully finished project (previous three points) would do that. --Millosh (talk) 17:58, 26 January 2015 (UTC)Reply
    5. Initialization of cross-Wiktionary cooperation. That's heavily dependent on the size of particular Wiktionary communities, as well as volunteer willingness to spend time on doing that. --Millosh (talk) 17:58, 26 January 2015 (UTC)Reply
    6. Widening cooperation with Matica srpska to relevant similar organizations all over the other Slavic countries. That depends on WMF and particular chapters. --Millosh (talk) 17:58, 26 January 2015 (UTC)Reply
In relation to the Wiktionary communities, it's very important to understand the size and the structure of them. There are just few Wiktionary communities of the size relevant enough to be capable to give answer to our idea and -- more or less -- those communities expressed their positive attitude toward our project. --Millosh (talk) 17:58, 26 January 2015 (UTC)Reply
On the other hand, you can see, for example, that the Serbian Wiktionary community practically consists of people working on the project 1001 Arabic Word and that a community in the sense of English Wiktionary or Serbian Wikipedia doesn't actually exist. Thus, it isn't reasonable to expect a higher level of participation from their side. At the same time, we don't actually depend on the willingness of the Serbian Wiktionary community to contribute, as most of us are native speakers of Serbian. --Millosh (talk) 17:58, 26 January 2015 (UTC)Reply
In relation to the other communities in general, our call for participation is very general and it doesn't require specific action, except option for general support. (Besides the fact that we haven't yet contacted some of the significant communities.) Thus, it's about different kind of response than we expect when particular call for action would be announced. In that sense, valid option is to share much more widely our call for participation. For example, when project starts, we could ask WMF to publish our call on Wikimedia blog; we could share our call for participation on social networks, we could make a press release and try to reach as many languages as possible, I could personally ask my friends all over Wikimedia movement to call their communities etc. --Millosh (talk) 17:58, 26 January 2015 (UTC)Reply
BTW, the Italian Wiktionary community expressed at least a basic level of interest, as well. --Millosh (talk) 17:58, 26 January 2015 (UTC)
Matica srpska is an institution funded by the Serbian government (and other sources of revenue). The Serbian government covers basic staff salaries and capital projects, such as The Dictionary of Standard Serbian Language and the Serbian Encyclopedia. However, at present they don't have enough resources to complete side projects such as these two dictionaries. Although their staff is doing a lot of background work on those dictionaries anyway, they need extra resources to pay external contributors (for example, the Ornithological dictionary needs expertise from a few ornithologists; gathering examples for the Dictionary of Serbian speeches of Vojvodina also requires external work). --Millosh (talk) 17:58, 26 January 2015 (UTC)
That said, it should be noted that each of those dictionaries would be much more expensive if we counted the resources necessary for its creation. It should be noted, as well, that their position toward our movement is very friendly and that any kind of commercial deal would be very different. I see this as a good opportunity for Wikimedia to start the cooperation, gradually liberate an enormous amount of content and, actually, get a nationally and regionally influential organization on the side of free content. --Millosh (talk) 17:58, 26 January 2015 (UTC)
I also want to add that I intentionally lowered their contribution for the first project in order to build their trust, as this kind of cooperation is unique and new. (One of the things I want to analyze at the end of the project, and even during the next couple of years, is the economic impact on Matica srpska of liberating the content. Based on the existing data, I think it won't be negative. That kind of analysis could serve as an example for other institutions willing to liberate their content.) The idea for the next projects is that, besides possible future projects (for example, a Herpetological Dictionary), we'd get the capital dictionaries (keep in mind that we will never have the capacity to liberate more than two or three dictionaries per year). --Millosh (talk) 17:58, 26 January 2015 (UTC)
Besides that, we'll need their expertise to do the job. Thus, their "technical support" would go in two directions: (1) external contributors and overtime of their staff, and (2) support to our team itself in carrying out the project. --Millosh (talk) 17:58, 26 January 2015 (UTC)
Hi Millosh. Thank you for the detailed responses. We have read all of your responses to us and the GAC regarding community engagement. However, we absolutely need more engagement from active Serbian Wiktionary users (or Wikipedia users who participate in Wiktionary) to move forward. We need to know that they welcome this material and will work with the programmer as needed. If you have positive engagement from the Serbian community, we are willing to experiment with this project. It is content that would have been inaccessible otherwise and the cost per entry is quite low. We would want commitment on the following:
  1. Any software developed will be released under a free license and hosted on a public repository -- either the foundation's own gerrit or github.
  2. After approval, we would require written commitment from Matica Srpska for their responsibilities to the project. We would like to avoid any problems due to management turnover, etc. We will also need their commitment in writing that they will release the material under the CC-BY license on Wiktionary. We understand they maintain the right to print and sell the material on their own for commercial purposes, but they should be clear that whatever material given to Wiktionary under the CC-BY license can be used/printed/sold commercially by anyone. We cannot prevent commercial printing by others once the material is freed.
We understand that only part of the software developed could be useful for other projects. We hope that your international call to action will engage other Wiktionary communities who will make the tool more useful and able to scale across language projects. Alex Wang (WMF) (talk) 18:07, 27 January 2015 (UTC)
In relation to the number of editors willing to participate on Serbian Wiktionary, please note that the relevant number could be one or two. In that sense, my personal engagement as a Wikimedian would be relevant (I am a former bureaucrat of Serbian (and Serbo-Croatian) Wiktionary; I gave the permissions up last summer, as I wasn't active and didn't see the sense in keeping permissions I wasn't using). Otherwise, if you insist, I could personally ask some active Wikipedians to participate in this project. However, the number of people interested in Serbian Wiktionary is low (without WMRS-led project(s), there would probably be none) and I don't think it's reasonable to "force" people to do something they don't want to do (either because they would feel "guilty" if this project failed, given that it's a significant national project, or because I asked them). On the other hand, this project will definitely raise the visibility of Serbian Wiktionary significantly inside Serbia, and I am sure that we'll get contributors in a more natural way during the project itself than by pursuing bureaucratic requirements. --Millosh (talk) 19:03, 27 January 2015 (UTC)
Yes, of course, the software will be licensed under GPLv3 or AGPL, depending on the type of software and external issues. --Millosh (talk) 19:03, 27 January 2015 (UTC)
Matica srpska has already clearly expressed, in a letter to Wikimedia Serbia, their intent to open the content under CC-BY-SA, with a clear understanding of what that license means. You could ask WMRS to send you a copy of that letter (in Serbian, but it could be translated). We have already agreed with Matica srpska to switch the license to CC-BY, as it's more appropriate for dictionary data, compatible with CC-BY-SA-licensed Wiktionary, and would leave room for a potential Wiktionary license switch. --Millosh (talk) 19:03, 27 January 2015 (UTC)
What I want to do and what I asked of them, although they were willing to go further, is that they give us the content of their dictionaries, not the dictionaries themselves. --Millosh (talk) 19:03, 27 January 2015 (UTC)
As one of the funders of the project, my company is in the process of signing a contract with them. We wrote the draft of the contract, which will be sent to them this week; their lawyers will analyze it and send it back to us for signing. Matica srpska itself requires a formal contract with WMF, as well (WMF and my company could also sign a contract, if WMF requires or would like that). Thus, not only is the management unlikely to change, but we'll have a formal contract with them, which would make the deal independent of any particular management. --Millosh (talk) 19:03, 27 January 2015 (UTC)
Thanks for the responses Millosh. We have a couple of remaining questions and responses to your comments above.
  1. Please confirm that the software developed will be hosted on a public repository -- either the foundation's own gerrit or github.
  2. We're happy to hear that Matica Srpska has committed to releasing the material under CC-BY license. It would be helpful to have a copy of the agreement signed between Interglider and Matica Srpska. However, we would not sign an agreement between WMF and Matica Srpska. As a general rule under our grantmaking programs, our grant agreements are with the grantee and we do not sign agreements with third parties. Please let us know if this will be an issue.
  3. It would be helpful to have more background information on the proposed team since they do not seem to have Wikimedia accounts. Please provide more details on their experience and relevant skills.
Thanks, Alex Wang (WMF) (talk) 19:54, 11 February 2015 (UTC)
Here are the answers: --Millosh (talk) 22:06, 11 February 2015 (UTC)
  1. Yes, the software will be hosted on a public repository. I think it's logical for it to be hosted on WMF's repository. --Millosh (talk) 22:06, 11 February 2015 (UTC)
  2. Hmm... There is a general option of doing everything via a newly formed non-profit organization. But, as we applied as a group, we assumed that WMF would give money directly to Matica srpska and to the programmer. In relation to Matica srpska, I think not just Matica srpska but WMF as well would need some kind of contract as a basis for the transfer of money, at least in the form of "we are giving money based on your contract with Interglider". It's important to make this part clear, as many formal issues depend on precise details. For example, we assumed that WMF and Matica srpska would make a fully fledged agreement, and thus we referred to WMF as a funder in our contract. --Millosh (talk) 22:06, 11 February 2015 (UTC)
  3. Basically, all of the team members are more or less already involved in free software, free content and nearby movements:
    • Milica Gudovic actually has some contributions going back to 2010. She was CEO of Women at work, an organization focused on feminism and technology, which was active from the mid 1990s to 2012. For the last ten years, she has been closely associated with WMRS. If I remember well, during the early days of WMRS (2006-7), the WMRS Board had at least one meeting in the office of her organization. She is a veteran activist, well versed in NGO bureaucracy, and the final forms of this project, as well as of the AfroCrowd project, are products of her work. --Millosh (talk) 22:06, 11 February 2015 (UTC)
    • Tuuli Pollanen also has some contributions going back to 2010 and 2011. She is from Helsinki, Finland, but spent the last seven years in Ljubljana, Slovenia, studying neuroscience. She was closely involved in Kiberpipa, the main (and only?) hackerspace of Ljubljana, and was the principal organizer behind making Kiberpipa's new space functional last year. Presently she is based in San Francisco, which means that you could meet her face to face. --Millosh (talk) 22:06, 11 February 2015 (UTC)
    • Milos Trifunovic is a master of philosophy who has served as librarian of the Humanitarian Law Center in recent years. He is also a proficient Python programmer. --Millosh (talk) 22:06, 11 February 2015 (UTC)
    • Senka Latinovic is an activist and curator by vocation. She is presently working for my company, supporting our business and non-profit activities (which led to interglider.org as a group and soon as a non-profit organization). --Millosh (talk) 22:06, 11 February 2015 (UTC)
    • Irena Antonijevic is an activist and anthropologist by vocation. She is also working for my company, supporting our business and non-profit activities. It's also relevant that her activist interests follow the same line as Milica's: feminism and technology. --Millosh (talk) 22:06, 11 February 2015 (UTC)

Hi Millosh. Thanks for the updated information. With regard to who the grantee is, it was our understanding that the grant would be to Interglider.org and you would be responsible for paying Matica Srpska and the programmer, as well as for completing the final report. We cannot enter into an agreement separately with Matica Srpska. Please let us know if this will work. Alex Wang (WMF) (talk) 22:19, 11 February 2015 (UTC)

Grant Approved

This grant has been approved. While we are concerned about the engagement of the larger Wiktionary community and the fact that the dictionary content is quite obscure, we do feel there is value in experimenting with this project. We are particularly interested in the software that is to be developed and how it can benefit the larger community. We are also hopeful that this is the start of a longer-term relationship with Matica Srpska. Hopefully the project team is successful in their outreach campaigns and can generate wider interest in the longer-term benefits of the project. Good luck! Alex Wang (WMF) (talk) 01:02, 13 February 2015 (UTC)
