Wikimedia Clinic call #009 - August 21st 2020

(Attendance: ~6 staff; ~52 volunteers)

Topic 1: Introduction


quick principles

  • listen with patience and respect
  • share your experience, but remember others' contexts are very diverse, and may not match yours.
  • be of service to other people on the call

These calls are a Friendly Space.

Purpose of Wikimedia Clinics

  • provide a channel to ask questions and collect feedback on one's own work and context
  • help direct people to appropriate resources across the Foundation and broader Wikimedia movement

If we can't answer your questions during the call, we (WMF) are committed to finding someone who can and connecting you with them (this may happen after the call).

Examples of things the Clinics are not the place for:

  • complaints about interpersonal behavior - there are appropriate channels for this on-wiki, and there is the Trust and Safety team.
  • content or policy disputes on specific wikis. But it is okay to seek advice on how to better present one's positions.

Topic 2: Abstract Wikipedia


Denny introduced Abstract Wikipedia using these slides:

Questions and answers

  • Question: Do you plan an implementation, where somebody can participate without this abstract thinking (or programming background)? Is it easily possible?
  • Question: +1. Is there a kind of dummy [site]?
  • Question: What can we do to help?
    • Answer: There are several things we need your help with:
      • Initially, the name and logo. This will kick off pretty soon; we want a broad range of demographics and communities to discuss the name.
      • You can help by reviewing the design documents. Share ideas that can help the project.
      • We have enough time to work on the natural language generation part, and we would welcome help in this area.
      • More opportunities to help will open with time. The project won't be launched till next year. It's still in design stages, so your help is much appreciated.
      • You can join discussions on the m:Abstract Wikipedia page on Meta, the AW mailing list, IRC and Telegram (channels linked from the Meta page).
  • Question: I am wondering how the global community will write about sensitive (for example historical) topics, such as the history of Israel. (I sense the possibility of edit wars and similar.) This looks more like a linguistic project that needs language expertise, and I believe it won't be easy.
    • Answer: I am convinced that we can reach out and create this team of linguistics experts. We hope we can create workflows - examples by language experts on flow of information. There is a massive need for linguistic knowledge.
  • Question: Will the automatically created articles be placeholders until articles are created by humans?
    • Answer: We don't want to replace any existing content; we just want to provide missing information. Yes, it will be a placeholder.
  • Question: Where can these Z-items be found? How will the passive voice be handled?
    • Answer: Nowhere yet, they will exist in the new "wiki of functions" (codenamed Wikilambda) when it launches next year. These pieces of code will be community-created content. Re: active/passive voice etc, that's also something the language communities will need to create.
  • Question: I wasn't aware that WikiLambda is such an ambitious initiative, aiming not only to provide precoded-but-generated (calculated)-on-demand templates, but also to respond to questions in unencyclopaedic form (like what is the second largest city in ...). It is going to be very difficult, especially without a master planner. You need a clear understanding of the question to provide an answer.
    • Answer: I agree that it won't be easy. I hope we can function without the master planner. We were denied one external grant because the financing body didn't believe that the community could create this project. But I believe the community can create a process that avoids the need for a master planner. I agree it will be challenging, but I hope that my trust in the community will be borne out.
  • Question: How are these composed functions executed? I understand they're semantic, and [some words missed by note-taker] - composability of AI models?
  • Question: You mentioned deterministic functions. Will all the functions be only community-created, hard-coded, `if this then that` kind of models? Or will we have machine learning / NLP kind of non-deterministic stuff in the future?
    • Answer: In theory yes. The community could go this way if they wanted to. My own theory is that for generating the text we should not rely on machine translation, for a few reasons - it is not available across all languages, and the community cannot easily go in to edit and fix the ML models. But there are ways to use ML - e.g. boxes where editors write in natural language, ML gives a few suggestions, and editors then select and refine. This is one area where things like GPT-3 could be useful.
  • Question: Will Abstract Wikipedia be useful for areas beyond the Wikipedias?
    • Answer: Yes. Much like Wikidata, it starts off supporting Wikipedia and will then naturally become useful in other places.
  • Question: Will the wiki of functions work on MediaWiki, or need a lot of extensions? Will there be talk pages and an edit summary line?
    • Answer: We will make a first attempt on MediaWiki while AW is still functionally incomplete; then we will collect feedback and build on the work already done rather than starting from scratch. We want to have the freedom to go back and learn. We are planning for a rich experience. It will be closer to Wikidata than Wikipedia.
  • Question: I've been experimenting with lexicographical data, hit similar problems, and then saw many of the same gaps and limitations that we're discussing in the other thread, especially subtleties in other languages. Subtleties can be encapsulated with logic and reasoning, but there's core knowledge that needs to be stored. I was having trouble with Lexemes and workspace storage. We need to do a better job of documenting that part.
    • Answer: The number of people contributing to Lexicographical data is not high. Far more coordination is needed to have this discussion. For the active discussions, see the lexicographical Telegram group.
  • Comment: Applying semantic similarity to concepts and glosses can also be very useful.
  • Question: How does this compare with the work done so far on automatically generating articles from Wikidata?
    • Answer: There has been ArticlePlaceholder work done by Lucie-Aimée Kaffee, and there have been a number of research efforts to create text from Wikidata, with mixed results. I want to enable all of these patterns within a framework that is helpful to the whole community. We need to learn from previous work and enable the best patterns discovered so far.
  • Question: What is the scope of the functions repository?
    • Answer: We want to provide power to create information on Wikipedia. Wikilambda is a space where you can evaluate those functions. You can create apps in your own space & distribute them in a space where people can share peer to peer for reviews. There are a lot of possible ideas for doing that; we don't plan to provide everyone the editorial model of compute.
  • Question: Wikidata has failed to render date formats correctly in languages other than English for 8 (?) years. I am curious whether it would be possible to render full sentences from statements / triples :)
    • Answer: Painful question, and yes, true. That is my fault for never prioritizing it sufficiently. At some point I actually figured that the new Wiki of functions could maybe provide the solution for this problem, by making it less dependent on the development team and its prioritization, and allowing the community to take ownership of this issue. I hope that we can solve the date format issue in the next year, based on Wikilambda - and then I am curious to see how much of this solution we can integrate back into Wikidata.
  • Question: What skills and knowledge will people need to do this kind of work? What kind and amount of training will they need?
    • Answer: Depends. There will be many different tasks, and people with different skills will be able to work together towards the common goal. This is something the Wikimedia communities have previously demonstrated well: there are people who can run bots, people who can write Lua modules, people who can copyedit. So we will have a similar diversity of tasks: there will be people who can write renderers, who can use functions, who can write text, who can provide lexicographic information, who will build abstract content, and who will integrate the results into their local Wikipedias.
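
The date-format answer above suggests that each language community could own its own rendering function. A minimal Python sketch of that idea follows. Every name in it (`DATE_RENDERERS`, `renderer`, `render_date`) is hypothetical and is not part of Wikilambda or Wikidata; the ISO fallback is an assumption made for the sketch.

```python
from datetime import date
from typing import Callable, Dict

# Hypothetical registry: language code -> date renderer written by that community.
DATE_RENDERERS: Dict[str, Callable[[date], str]] = {}

EN_MONTHS = ["January", "February", "March", "April", "May", "June", "July",
             "August", "September", "October", "November", "December"]
DE_MONTHS = ["Januar", "Februar", "März", "April", "Mai", "Juni", "Juli",
             "August", "September", "Oktober", "November", "Dezember"]
HU_MONTHS = ["január", "február", "március", "április", "május", "június",
             "július", "augusztus", "szeptember", "október", "november",
             "december"]


def renderer(lang: str):
    """Register the decorated function as the date renderer for `lang`."""
    def register(func: Callable[[date], str]) -> Callable[[date], str]:
        DATE_RENDERERS[lang] = func
        return func
    return register


@renderer("en")
def render_en(d: date) -> str:
    return f"{EN_MONTHS[d.month - 1]} {d.day}, {d.year}"


@renderer("de")
def render_de(d: date) -> str:
    return f"{d.day}. {DE_MONTHS[d.month - 1]} {d.year}"


@renderer("hu")
def render_hu(d: date) -> str:
    # Hungarian puts the year first and ends the date with a period.
    return f"{d.year}. {HU_MONTHS[d.month - 1]} {d.day}."


def render_date(d: date, lang: str) -> str:
    """Use the community renderer if one exists, else fall back to ISO 8601."""
    func = DATE_RENDERERS.get(lang)
    return func(d) if func else d.isoformat()


print(render_date(date(2020, 8, 21), "hu"))
```

In this sketch, adding a language means adding one small function, without changes to `render_date` itself, which is the kind of community ownership the answer describes.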

Topic 3: Translation of modules


There are a lot of modules on wikis, and some of them need translations. We are looking for ways to translate modules, similar to translating extensions. We will share more details on this next week. You can find more details on Meta at the Language showcase.

Topic 4: Feedback on Abstract Wikipedia in various existing communities


Asaf (WMF) asked: How much are people in your community, around you, aware of, or interested in, Abstract Wikipedia? Do they understand what it is? Did you understand what it was, before the call? Do you understand it now? :)


  • Turkish Wikipedia has huge interest in bot-created articles. People really appreciate the prospect of filling the gaps using this project. I want to learn about language training & linguistics now.
  • Still don't get it. Sorry.
  • On nl.wp, only a small number of people know about Abstract Wikipedia.
  • Didn't understand it before the call, slightly understand it now, I think. Interest in my communities is likely to be tiny. Until it actually works and changes everyone's life like Wikidata :-D
  • I think some ukwiki folks have interest, but it is hard to gauge how big it is
  • I don't know about any discussions around Abstract in the Arabic community. I believe we need some basic education about it (I myself do).
  • In Bangladesh, no one really understands it; in West Bengal, people seem to be neutral but leaning positive.
  • I saw the first public announcement & I understood the project from this session. I shared information with the Hungarian community on what this project is & how it will work. I am hoping the understanding will increase in future. I heard more sceptical than optimistic opinions, but we can see nothing yet, so this can change once an implementation is available.
  • [On Polish Wikipedia] It was more like pitchfork reactions. They didn't understand the consequences of the project or Wikilambda. We don't have all the answers. It was fear on the part of Wikipedia editors. It was similar to the reaction of ENWP editors to automated endeavours. People are less concerned about the readers' experience, and more concerned about the editors' experience. It is more of a sociological & psychological issue; we need to work with community support people.
  • I was very enthusiastic about the project & tried to explain it as much as I could. They didn't get much, but took my word that it is something we should be excited about. Swedish people are hesitantly hopeful & positive.
    • Asaf (WMF): do you think this is related to Swedish Wikipedia's fairly positive relationship with Wikidata?
      • Yes, I think so.
  • My approach to achieving a large number of Wikipedias would be different from Denny's. I believe that many of the projects didn't succeed much because we ask too much of people, and that's why Wikipedia doesn't work in many smaller languages. We have experience in Klexikon, the children's dictionary, where we have about three thousand entries, but they cover a very large share of searches. We have a rule: not too short and not too long: between 1K and 10K. We don't use Simple Wikipedia as a basis, as that didn't work. [...] If more attention were given to helping smaller wikis cover the most important core...
    • Asaf (WMF): I think what you describe does exist in the form of "suggestions", like the Meta-Wiki page "List of articles every Wikipedia should have". There are also the Small Wiki Toolkits, which were recently created to address technical needs.
    • Asaf also wants to point out that AW is trying to solve a different problem - you're suggesting the problem is "how to get people to edit a wiki"; AW is perhaps trying to solve the problem of "how can we nonetheless provide high-quality up-to-date content in languages where there aren't enough volunteers to produce content".
  • I hope to see WikiTemplate one day for a central template repository... I believe some people were working on that.
  • They may be very synergistic.
  • Yes, this is a long-time wish, I am looking forward to it as well.
  • I am still wrapping my head around this whole project. My first reaction was more about "translation" and how "translating" will mean that some of the articles will transfer the bias they have in one language into another, sometimes with a loss of relevance for a particular context. Granted, I don't understand yet how this all works, so please take this with a grain of salt. Also, I came at the end of this session, so maybe this has been addressed and I'll just have to catch up.
    • (Abstract Wikipedia articles will not be "translated" from a particular language, but generated by code using language-specific templates and rules combined with language-neutral data.--AB)
  • We are raising the threshold for small communities. It can turn some people away but some may find enthusiasm.
  • Interesting take. Certainly it should be easier with a children's encyclopedia, as children and their education are much more homogeneous than adults, their needs, and the articles they search for. I agree, Abstract Wikipedia is about enabling the possibility of having millions of automated responses with a smaller maintenance cost.
  • Asaf (WMF): It would be important to have some ability to pick & choose, when presented with generated content. So that potentially shorter/poorer human-authored content can be enhanced and expanded using some of the generated text.
  • Nick Wilson (WMF): An article could have a section which is machine generated describing hard facts & kept up to date.
  • Yep. Merging abstract output with hand-written text will be important at the end of the day. Otherwise the maintenance cost will be high.
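
To make the distinction between "translated" and "generated" content (noted by AB above) more concrete, here is a minimal Python sketch of language-neutral content rendered by language-specific functions. It is only an illustration under stated assumptions: the constructor name `capital_of`, the identifiers `berlin` and `germany`, and all function and variable names are hypothetical and do not reflect the actual Abstract Wikipedia design.

```python
from typing import Callable, Dict, Optional

# Language-neutral content: a constructor name and arguments, no sentence text.
# All identifiers here are hypothetical placeholders.
ABSTRACT = {"constructor": "capital_of", "city": "berlin", "country": "germany"}

# Per-language labels, standing in for what Wikidata labels or Lexemes would supply.
LABELS: Dict[str, Dict[str, str]] = {
    "en": {"berlin": "Berlin", "germany": "Germany"},
    "de": {"berlin": "Berlin", "germany": "Deutschland"},
}


def capital_of_en(content: dict, labels: Dict[str, str]) -> str:
    return f"{labels[content['city']]} is the capital of {labels[content['country']]}."


def capital_of_de(content: dict, labels: Dict[str, str]) -> str:
    return f"{labels[content['city']]} ist die Hauptstadt von {labels[content['country']]}."


# constructor -> language -> renderer written by that language community
RENDERERS: Dict[str, Dict[str, Callable[[dict, Dict[str, str]], str]]] = {
    "capital_of": {"en": capital_of_en, "de": capital_of_de},
}


def render(content: dict, lang: str) -> Optional[str]:
    """Return a sentence in `lang`, or None if that language has no renderer yet."""
    func = RENDERERS.get(content["constructor"], {}).get(lang)
    if func is None or lang not in LABELS:
        return None
    return func(content, LABELS[lang])


print(render(ABSTRACT, "en"))
print(render(ABSTRACT, "de"))
```

Neither sentence is derived from the other; both come from the same `ABSTRACT` record. Returning `None` for a language without a renderer leaves room for a wiki to show hand-written text instead, in line with the pick-and-choose idea raised above.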