Abstract Wikipedia/Updates/2022-12-19
Evaluation of the project by the Google.org fellows
During the fellowship, the Google.org fellows gained detailed insight into the Wikifunctions and Abstract Wikipedia projects. To point out potential issues and to discuss alternatives to some of the projects' approaches, they wrote a detailed evaluation of both projects.
The team read through the evaluation and wrote a detailed answer. We will take a lot of the suggestions of the fellows to heart and make sure to implement them. The evaluation and the answer also helped the team to gain a better shared understanding of the project.
We invite you to read both documents:
Ariel’s goodbye letter
At the end of the month, Ariel Gutman, who joined the Abstract Wikipedia project as one of the Google.org fellows, will be leaving. He has been contributing to the Natural Language Generation (NLG) workstream. We want to give him the opportunity to say goodbye in his own words. Thank you, Ariel!
Over the last six months, I've been part of the Abstract Wikipedia team as a Google.org fellow. At the Foundation, my aim was to leverage my expertise in Natural Language Generation, which I honed from working on NLG at Google for over six years, to advance the Abstract Wikipedia project.
The first half of the fellowship was mostly dedicated to writing design docs: The architecture of an NLG system and a template language specification (the latter co-authored with Maria Keet, to whom I’m grateful). At the same time I was involved in other discussions, be it the quality of lexical data on Wikidata, or the form Abstract Content should take (many thanks to Kutz Arrieta for leading the latter discussion).
At the midpoint of the fellowship, I felt the urge to create something more concrete. Unfortunately, the Wikifunctions platform was not ready to serve as a solid development platform, so, per the advice of the Google.org Tech Lead Ori Livneh, I set out to create a prototype NLG system on Wikipedia's Scribunto platform, a Lua-based scripting environment embedded within Wikipedia.
To my great pleasure, the Scribunto platform, with its Wikidata API, allowed me to rapidly create a functional NLG system capable of transforming Abstract Content into text (see recorded demo or example output). The system is not yet exhaustive; however, it contains the necessary components outlined in the proposed architecture:
- An Abstract Content repository, allowing the specification of an article outline for individual Wikidata items.
- A Constructors repository, containing logic for auto-creation of abstract content for Wikidata items, depending on their types (people, places etc.).
- Templatic renderers, which are templates specifying how each constructor should be verbalized in the different realization languages.
- Template functions written in Lua or in the template language, to be used within template slots. These in particular allow importing of Wikidata lexemes and their representation in an internal format, using dedicated helper modules.
- Morphosyntactic dependency relations, written in Lua using a limited set of unification operators, which specify the flow of grammatical features between template elements.
- Phonotactic functions, written in Lua, which specify language-specific phonotactic rules (such as the a/an alternation in English).
- A text assembler, which constructs the rendered text while adjusting punctuation, spacing and capitalization.
On top of these there are modules with the necessary logic needed to parse and evaluate templates, represent lexemes and unifiable features and interact with Wikidata. The main module controls the overall flow of the NLG pipeline.
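To make the division of labor between these components more tangible, here is a minimal sketch in Python. It is not the prototype's Lua code: the toy lexicon, the function names (`unify`, `indefinite_article`, `assemble`, `render_is_a`) and the single "X is a Y" template are all assumptions made for illustration. It shows a unification step that lets a grammatical number feature flow from the subject to the predicate noun, a spelling-based approximation of the a/an alternation, and a text assembler that handles spacing, punctuation and capitalization.

```python
from typing import Dict, List, Optional

Features = Dict[str, str]

# Toy lexicon standing in for Wikidata lexemes (hypothetical data).
LEXICON: Dict[str, Dict[str, str]] = {
    "physicist": {"singular": "physicist", "plural": "physicists"},
    "engineer": {"singular": "engineer", "plural": "engineers"},
}
COPULA = {"singular": "is", "plural": "are"}
PUNCTUATION = {".", ",", ";", ":", "!", "?"}
VOWEL_LETTERS = {"a", "e", "i", "o", "u"}


def unify(a: Features, b: Features) -> Optional[Features]:
    """Merge two feature sets; return None if a shared feature disagrees."""
    merged = dict(a)
    for key, value in b.items():
        if merged.get(key, value) != value:
            return None
        merged[key] = value
    return merged


def indefinite_article(next_word: str) -> str:
    """Approximate the a/an alternation from spelling alone.

    Real English depends on pronunciation ("an hour", "a university"),
    so this is only a rough stand-in for a phonotactic rule.
    """
    return "an" if next_word[:1].lower() in VOWEL_LETTERS else "a"


def assemble(tokens: List[str]) -> str:
    """Join tokens, attach punctuation to the left, capitalize the start."""
    text = ""
    for token in tokens:
        if not text or token in PUNCTUATION:
            text += token
        else:
            text += " " + token
    return text[:1].upper() + text[1:]


def render_is_a(
    subject: str,
    noun_lemma: str,
    subject_features: Features,
    noun_features: Optional[Features] = None,
) -> str:
    """Render a hypothetical 'X is a Y' template with number agreement."""
    agreed = unify(subject_features, noun_features or {})
    if agreed is None:
        raise ValueError("feature clash between subject and predicate noun")
    number = agreed.get("number", "singular")
    noun = LEXICON[noun_lemma][number]
    tokens = [subject, COPULA[number]]
    if number == "singular":
        tokens.append(indefinite_article(noun))
    tokens += [noun, "."]
    return assemble(tokens)


print(render_is_a("Marie Curie", "physicist", {"number": "singular"}))
print(render_is_a("they", "engineer", {"number": "plural"}))
```

In a fuller system each of these pieces would be language-specific and user-editable, while the code that parses and evaluates the templates would stay fixed.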
My primary aim in developing this prototype was to substantiate the designs I've proposed, and to provide example code for a similar implementation on Wikifunctions. In fact, if Wikifunctions supports Lua, the code can probably be reused as-is. The modules in the above bulleted list would become user-editable functions, while those mentioned thereafter could be integrated in the backend system of Wikifunctions, as they are expected to be relatively stable.
Yet, there is a second, more subtle aim. During my fellowship, I have grown skeptical of the premise that Wikifunctions is necessary to achieve the vision of Abstract Wikipedia. While user contributions (e.g., functions, renderers, or constructors) are necessary for its success, these should be NLG-oriented, and they do not need a general functional platform such as Wikifunctions. By focusing on building an NLG-oriented system, the vision of Abstract Wikipedia can be attained more rapidly. (Being part of a fellowship, it maybe shouldn't come as a surprise that I'm on the "One Ring" side…). Together with my colleagues Ori Livneh, Ali Assaf, and Mary Yang, I've put my viewpoint in detailed writing. I believe that the template-language proposal, implemented in this prototype, is a good foundation to build upon.
The Scribunto prototype shows that a platform more limited than Wikifunctions can already be used to generate articles from Abstract Content on real Wikipedias. It suffices to copy the necessary modules over to the target wiki and define the language-specific renderers, functions and relations. Whether you agree with me or not, I invite you to play around with the system and edit the relevant modules to add functionality for your favorite language.
As my fellowship is ending, I would like to thank all my colleagues in Abstract Wikipedia's Natural Language Generation workstream for the passionate discussions and ideas. In particular I am thankful to Cory Massaro, the Tech Lead of the workstream, for his guidance and confidence, and to Eunice Moon, my Google.org colleague and Product Manager of the workstream, for her superb organizational skills.
End of year break
We wish everyone a happy holiday and a Happy New Year 2023! We will take a break from writing updates until the week of January 13, 2023.
Updates from Development (as of December 16, 2022)
December 5 to 9 was a 'Fix-it' week for the Abstract Wikipedia team. During this week, the team paused the development of new features and focused on tasks related to technical debt.
The team also made a lot of progress in descoping work planned for before the launch. A lot of items were removed from the scope of the MVP.
In the week of December 11 to 16, the Abstract Wikipedia team participated in a small internal hackathon/collaboration, in order to get to know more areas of the codebase and our colleagues, and to work on some assorted community wishlist entries. The team worked on projects including getting WhatLinksHere's lists in alphabetical order, Wikisource user research to inform the larger suggestion that the platform needs support, enabling negation for tag filters, auto-suggesting linking a Wikidata item after creating an article, and missing LaTeX capabilities for math rendering.