Abstract Wikipedia/Google.org Fellows evaluation

Wikifunctions & Abstract Wikipedia: an Evaluation

Authors: Ori Livneh, Ariel Gutman, Ali Assaf, Mary Yang

This is a sympathetic critique of the technical plan for Abstract Wikipedia. We (the authors) are writing this at the conclusion of a six-month Google.org Fellowship, during which we were embedded with the Abstract Wikipedia team, and assisted with the development of the project. While we firmly believe in the vision of Abstract Wikipedia, we have serious concerns about the design and approach of the project, and think that the project faces a substantial risk. Our goal with this document is to express these concerns, with the sincere hope that it will ultimately help the project in the long run. The opinions and views expressed below only reflect the views of the authors, and do not purport to reflect the opinions and views of Google or the Wikimedia Foundation. Any errors or omissions are our own.

Note: A response to this document is at Abstract Wikipedia/Google.org Fellows evaluation - Answer

Introduction

Abstract Wikipedia is a project of the Wikimedia Foundation to increase the amount of Wikipedia content that is accessible to readers in their own language. The vision of Abstract Wikipedia is a universal wiki in which encyclopedic content is encoded in a formal, machine-readable language, which the Abstract Wikipedia software would be able to translate for readers into their own language.

If Abstract Wikipedia is realized, it would advance Wikimedia’s mission of making the sum of human knowledge accessible to every person on the planet. We find this vision deeply compelling, and we believe that the project, while ambitious, is achievable.

However, we think that the current effort (2020–present) to develop Abstract Wikipedia at the Wikimedia Foundation is at substantial risk of failure, because we have major concerns about the soundness of the technical plan. The core problem is the decision to make Abstract Wikipedia depend on Wikifunctions, a new programming language and runtime environment, invented by the Abstract Wikipedia team, with design goals that exceed the scope of Abstract Wikipedia itself, and architectural issues that are incompatible with the standards of correctness, performance, and usability that Abstract Wikipedia requires.

In the rest of this document, we examine each of these points in detail. The document is divided into two parts:

  • Part one focuses on Wikifunctions in relation to Abstract Wikipedia: it explores the requirements for building Abstract Wikipedia and looks critically at the question of whether Wikifunctions, as currently scoped, is required. This part is more theoretical, because Abstract Wikipedia is not implemented yet (or specified anywhere in great detail). It may be of interest primarily to readers interested in NLG (natural language generation).
  • Part two is a design critique of Wikifunctions. Because large parts of Wikifunctions are already implemented, this part is more like a traditional software design review, and may therefore be of interest to software engineers.

We conclude by proposing changes to the plan that we think would make it more likely to succeed.

Part I: Abstract Wikipedia

In this section we explore the requirements for building Abstract Wikipedia and investigate whether Wikifunctions, as currently scoped, is required for developing Abstract Wikipedia.

What is Wikifunctions, and what is it for?

For Abstract Wikipedia to work, the system must have knowledge of the grammars of the diverse languages that it hopes to support. In order to translate content from a language-independent representation of meaning into natural language, Abstract Wikipedia must implement in some fashion the complex systems of rules that determine things like the plural form of nouns in Lingala or the order of adjectives in Kazakh. These rules need to be specified in a form that the software can evaluate. Thus the Abstract Wikipedia NLG system can be thought of as a collection of algorithms that take different forms of abstract content as input and produce natural language as output.
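
To give a flavor of what such a rule looks like in code, here is a deliberately tiny sketch in Python (the language, function name, and data are ours, purely for illustration; they are not part of any Abstract Wikipedia design): a function encoding a few English pluralization rules. A real system must encode thousands of such rules, and their interactions, for every supported language.

import unicodedata  # not needed here; real rules often require Unicode-aware handling

def english_plural(noun):
    # Illustrative only: a handful of English pluralization rules.
    irregular = {"child": "children", "person": "people", "mouse": "mice"}
    if noun in irregular:
        return irregular[noun]
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    if noun.endswith("y") and len(noun) > 1 and noun[-2] not in "aeiou":
        return noun[:-1] + "ies"
    return noun + "s"

print(english_plural("city"))    # cities
print(english_plural("church"))  # churches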

Abstract Wikipedia needs people to implement these algorithms, so the system can work for their language. To facilitate that, Abstract Wikipedia proposes a collaborative, wiki-based programming environment, called Wikifunctions.

The goal for Wikifunctions is to create “a wiki of functions”: an editable repository of computer functions, where volunteers work together on implementing language functions for Abstract Wikipedia, following the collaborative model of Wikipedia and its sister projects.

That, at least, is one way of telling the story of what Wikifunctions is. There is, however, another story, in which Wikifunctions is understood as an end in itself and not a means. The Abstract Wikipedia team’s plan for Wikifunctions is that it should not be limited to the functions needed by Abstract Wikipedia, but should include many other functions besides, so as to be an encyclopedia of code.

Thus, while Wikifunctions has been presented as an auxiliary to Abstract Wikipedia, necessary for its realization (for example, in Architecture for a multilingual Wikipedia[1]), it is also regarded by the Abstract Wikipedia team as an independent product. Although the two visions are not logically at odds with each other (Wikifunctions could be both a means of achieving Abstract Wikipedia and an end in itself, which would be nice), the different objectives are exerting opposing forces on the design of Abstract Wikipedia.

In this section, we set aside the question of Wikifunctions as an end in itself, and consider it strictly in relation to Abstract Wikipedia. The two projects were bundled together on the premise that Wikifunctions is a necessary building block to achieve the goals of Abstract Wikipedia, and it is worth considering the merits of this argument.

In the following we make several claims:

  1. As currently scoped, Wikifunctions is not a necessary building block of Abstract Wikipedia.
  2. Wikifunctions is not the fastest route to substantially realizing the goals of Abstract Wikipedia.
  3. Wikifunctions has, in our opinion, a number of problematic design decisions, which put the feasibility of the project at high risk, and thereby put the goals of Abstract Wikipedia at risk as well.

From the above we conclude that in order to expedite (indeed, enable) the achievement of the goals of Abstract Wikipedia, a more viable alternative is to create a restricted environment (possibly a wiki environment) dedicated to authoring Abstract Content, grammars, and NLG renderers in a constrained formalism.

Below we elaborate on these points and discuss various specific alternatives which can serve as the above constrained environment.

Is Wikifunctions necessary for Abstract Wikipedia?

According to the Abstract Wikipedia architecture paper, Wikifunctions is meant to serve two main purposes for Abstract Wikipedia:

  1. It should host Constructors, which are the data types filled with the “abstract content” to be realized as articles.
  2. It should host the Renderers, functions to transform abstract content into text (possibly through intermediate representations).

For Constructors, given that these are specifications of data containers, it would be quite natural to host them in Wikidata rather than on a separate platform. Clearly, much of the data that will populate the constructors, forming the actual “abstract content”, will come from Wikidata, so separating the two is not desirable. However, one may argue that Wikidata is not ideally suited for this, as the development of Constructors would require a great deal of experimentation, which is not suitable for a mature project like Wikidata. To allow for such experimentation, it would suffice to set up a fork of Wikidata with the capability to represent new data types, rather than create a completely new software system.
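
To illustrate what a Constructor amounts to, here is a hypothetical fragment of abstract content written as plain Python data (the constructor name and slots are invented for this example; the actual format is still to be specified). The point is that a constructor instance is essentially structured data of the kind Wikidata already models:

# Hypothetical abstract-content fragment; the constructor name and slots are invented.
age_statement = {
    "constructor": "Age",   # the Constructor being instantiated
    "person": "Q937",       # a Wikidata item (here: Albert Einstein)
    "age": 76,              # a literal value filling a slot
}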

What about renderers? Clearly, as Denny pointed out, the bulk of the work of creating NLG renderers would fall on the community of Wikipedia volunteers. This is not only in keeping with the spirit of Wikipedia projects, but is also necessary because it would be impossible even for a modestly sized team of developers to create all the needed renderers, across all the needed languages, to make Abstract Wikipedia a reality. Hence the need for a collaborative development and computation environment such as Wikifunctions.

While the core argument is correct, the fallacy lies in the scope of the Wikifunctions project, which is intended to cover any conceivable computable function (implemented, moreover, in a variety of programming languages; see below). Since NLG renderers are specific types of functions (transforming specific data types into text, possibly via specific intermediate linguistic representations), it would suffice to create a platform that allows creating such functions. There are many extant NLG systems, and some, such as Grammatical Framework, already have a vibrant community of contributors. Instead of creating a novel, general computation system such as Wikifunctions, it would suffice to create a collaborative platform which extends one of these existing approaches (or possibly to create a new NLG system adapted to the scope and contributor profile of Abstract Wikipedia, as suggested by Ariel Gutman).

Would Wikifunctions expedite Abstract Wikipedia?

While a full computational platform may not be a strict necessity for developing Abstract Wikipedia, one may argue that it would provide a quicker and better development environment than any restricted NLG system chosen ex officio. Several reasons are commonly given:

  1. None of the existing NLG systems are suitable for all languages, so one cannot be chosen a priori.
  2. It is unknown in advance which system would work best for the scope of Abstract Wikipedia, so it is better to take an approach of “let a thousand flowers bloom”.
  3. Different developers may feel comfortable with, or be knowledgeable in, different formalisms and programming languages. To reach the widest audience possible, all options have to be kept open.

While point #1 is true, there are sufficiently general NLG systems which could cover, at least in theory (and in practice, with some extra development), all written human languages. Indeed, as mentioned above, one such system, Grammatical Framework, which has been mentioned before as a possible candidate, currently supports (to a certain degree) about 45 languages from various language families. Moreover, the state of NLG technology allows developing an in-house system general enough for maximal linguistic coverage (such as the templatic system proposed by one of the authors).

Points #2 and #3 are more difficult to refute, because doing so requires departing from the very democratic principle of governance of Wikipedia projects. Yet, as is known from various studies in human psychology, an abundance of choice sometimes leads to suboptimal results. Indeed, enabling community contributors to create their own NLG systems would inevitably lead to duplication of effort and waste of resources. First, given the complexity of a full NLG system, some efforts may not come to fruition, or may not be adopted by the wider community. Second, given that different NLG solutions may be developed and used for different natural languages, a contributor who would like to write renderers for several languages might need to ramp up on several formalisms. Third, diverging NLG systems may lead to diverging Abstract Content representations, undermining the cross-lingual nature of that content.

These factors, instead of encouraging contributors to participate in the effort, might deter them. Instead, we believe that selecting or developing in advance one good-enough NLG system would funnel the community’s efforts in an optimal way, leading to quicker tangible results, which in turn would attract more contributors. An ideal system should treat the renderers and grammars (of natural languages) as “data” (e.g. templates), minimizing the need to know programming in order to contribute.
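
As a sketch of what “renderers as data” could mean in practice (the template syntax and engine below are hypothetical, not a proposed design): contributors would author templates such as the one below, while a single generic engine, maintained by engineers, fills in the slots.

import re

def render(template, slots):
    # Replace {slot} placeholders with values; a real engine would also handle
    # agreement, inflection, word order, etc., driven by grammar data.
    return re.sub(r"\{(\w+)\}", lambda m: str(slots[m.group(1)]), template)

english_age_template = "{person} was {age} years old."
print(render(english_age_template, {"person": "Marie Curie", "age": 66}))
# Marie Curie was 66 years old.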

It is important to stress that a sound design of an NLG system should start with the design of the data on which it operates, in our case the specification of the Abstract Content. Indeed, once such a specification has been agreed upon, the design of a compatible NLG system is significantly constrained. Given that the Abstract Content must have a single, cross-linguistic design in order to fulfill the goals of the Abstract Wikipedia vision, setting support for a plurality of NLG systems as a goal is in fact a chimera.

As the contributor base grows, it is likely that contributors will report missing features (or bugs) in the selected NLG system. Depending on the platform on which the base code of the system is developed (e.g. in Gerrit, or in Scribunto), either the Abstract Wikipedia team or volunteer contributors could step in and implement those missing features. Since the base code should be mostly natural-language-agnostic, this should be manageable even by a small team of software engineers.

Is Wikifunctions adequate for developing Abstract Wikipedia?

The Wikifunctions environment, as currently designed, is best adapted to creating small, stand-alone functions. However, a fully-fledged NLG system (as needed for Abstract Wikipedia) is a software suite rather than an aggregation of individual functions (see, as an illustration, the proposed architecture document). Even if it is possible to implement such a system in Wikifunctions, it is clear that Wikifunctions does not provide good enough UI and debugging capabilities for maintaining such a large software system.

A more fundamental problem is the fact that Wikifunctions is intended to use a pure-functional computation model. While technically possible, writing an NLG system as a combination of pure functions is less intuitive for the average programmer (as very few programming languages follow this model) and would moreover incur added complexity and latency. A typical development environment for an NLG system should allow the different components of the system to access a global state (shared memory), for example to allow unification of linguistic feature structures, or to keep track of discourse referents for pronominalization logic. Yet a pure-functional computation model excludes the use of global state.
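
The following sketch (ours, in Python, with invented names) illustrates the kind of plumbing this imposes: to decide between a name and a pronoun, the renderer needs to know which entities have already been mentioned, and without shared mutable state that discourse context must be passed into and returned from every function that might touch it.

def mention(entity, pronoun, context):
    # Return the surface form for an entity plus the updated discourse context.
    if entity in context:
        return pronoun, context
    return entity, context | {entity}

ctx = frozenset()
first, ctx = mention("Marie Curie", "she", ctx)
second, ctx = mention("Marie Curie", "she", ctx)
print(f"{first} won a Nobel Prize in 1903, and {second} won again in 1911.")
# Marie Curie won a Nobel Prize in 1903, and she won again in 1911.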

Another related issue is that of randomness and non-determinism, which are excluded by a pure-functional model. A typical NLG system needs both: randomness is required to allow some variation in the output of the system, while non-determinism is inherent when relying on external data sources, such as the Wikidata lexicographical data, or the system’s time (which can be useful, for example, to calculate and verbalize the age of a person).
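
Both needs are mundane in practice. For example (again a sketch with invented names, not project code), choosing between phrasing variants calls for a random choice, and verbalizing a person’s age depends on the system clock:

import random
from datetime import date

def birth_phrase(person, year):
    # Randomness: vary the phrasing so generated text is less monotonous.
    return random.choice([
        f"{person} was born in {year}.",
        f"In {year}, {person} was born.",
    ])

def age_in_years(birth_year):
    # Non-determinism: the result depends on the current date, not just the argument.
    return date.today().year - birth_year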

Outlook

As we shall see below, the design of Wikifunctions suffers from several problems which make its usability (for NLG and in general) questionable, and put at risk its success as a viable platform. Given the above points, Abstract Wikipedia’s reliance on Wikifunctions creates a compound risk which, rather than expediting the attainment of Abstract Wikipedia’s goals, puts them in peril.

Part II: Technical design of Wikifunctions

In this section, we look at some core design decisions of Wikifunctions, and how they affect the viability of the system.

Support for multiple programming languages

Wikifunctions currently allows users to implement functions in Python, JavaScript, or Lua, and the project has a goal of supporting more languages in the future. As we understand it, the motivation for supporting multiple language runtimes is to allow contributors to contribute in the programming languages that are most familiar to them, including (at some future date) some that are not based on English.

In addition to these implementation languages, Wikifunctions also allows users to define functions by composing other functions together. While this may seem like a simple nice-to-have feature, this extra composition layer constitutes a distinct programming language in its own right. It is Turing-complete and can be used to define any function without ever resorting to the above-mentioned implementation languages.

The top-level composition layer is handled by a different service called the orchestrator. It is essentially a hand-written interpreter for the Wikifunctions programming language that can also make calls to the evaluator to execute the parts that are written in the other languages.

In this section we examine the different problems caused by supporting multiple programming languages.

Security

Container escape vulnerabilities are common. By themselves, containers do not provide adequate sandboxing for running untrusted user code. Relying on containers as the sole security boundary means the team will need to invest significant effort in securing and maintaining a sandboxed execution environment. Each language runtime running in such an environment is liable to contain security vulnerabilities that could be exploited as part of a container escape. The team will therefore need to ensure that each language runtime is sufficiently hardened for running untrusted user code, and will need to keep up with security updates for all the different software components that comprise each language’s runtime.

Nondeterminism

Wikifunctions is meant to support pure functions: calling a function multiple times with the same arguments should always return the same result. However, it is currently quite easy to write functions on Wikifunctions whose output is influenced by external state. For example, the Python built-in id() returns the “identity” of an object as an integer. The CPython implementation simply returns the object’s address in memory. Since the precise location of objects in memory changes across invocations of the Python interpreter, this is a source of nondeterminism.
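
A minimal demonstration: running the following script twice will, in general, print two different numbers, even though it takes no input at all.

# Run this twice: the printed value will generally differ between runs,
# because CPython's id() exposes the object's address in memory.
x = object()
print(id(x))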

To guarantee that Wikifunctions implementations are pure, it will be necessary to carefully audit each supported language for sources of nondeterminism, and either turn off the associated language features, or find some way of making them deterministic. This requires substantial effort.

The alternative is to make Wikifunctions handle nondeterminism. This would significantly complicate the current system. Wikifunctions would not be able to trust that a function implementation is correct just because it passes its test cases, even in the narrow sense of being correct for the specific inputs being tested. Function implementations may be flaky, test runs would have to be repeated, and so on.

More bug-prone, harder to debug, and harder to optimize

Debugging and optimizing a system that spans multiple implementation languages is much harder than a system that uses a single runtime.

Each language requires supporting code to integrate with Wikifunctions. This means that the system as a whole must contain more code, and has a larger surface for bugs.

Support for multiple languages also makes code harder to debug. If different components of Abstract Wikipedia are implemented in different languages, debugging an error requires competence in every programming language involved in the functionality that manifests the bug. This runs counter to the goal of making Abstract Wikipedia accessible to non-experts.

Support for multiple languages also makes it harder to optimize performance. Different languages have different data structures with different performance characteristics, use different algorithms for various operations, and have different idioms for accomplishing common tasks in an efficient way. Fragmenting the implementation over multiple languages makes it harder to accumulate the necessary expertise to optimize the system.

Additionally, different language runtimes provide different debugging and profiling facilities:

  • Facilities for single-stepping through code, setting breakpoints, or using a REPL differ from runtime to runtime.
  • Different languages produce different kinds of debugging output (e.g. different stack-trace formats).
  • Profilers, and the output they produce, differ as well.
  • Providing unified abstractions and tooling across these different environments is a large task.
  • Profiling or tracing the execution of an invocation that spans multiple instances, containers, and languages is very difficult.

Supporting multiple languages also carries a performance cost:

  • Data has to be serialized and deserialized at every function-call boundary, which is bad for performance (see the sketch after this list).
  • Implementations cannot be inlined to reduce call overhead.
  • Startup overhead is high, since each function call requires spawning a new interpreter.
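
To make the call-boundary cost concrete, the following sketch (with JSON standing in for whatever wire format is actually used) contrasts an in-process call with the encode/decode round trip that a cross-runtime call requires, before any network latency or interpreter start-up cost is even counted:

import json

def add(a, b):
    return a + b

# In-process call: arguments and result stay native Python objects.
result = add(2, 3)

# Cross-runtime call: arguments are serialized by the caller, deserialized by
# the callee, and the result makes the same round trip in reverse.
request = json.dumps({"function": "add", "args": [2, 3]})
args = json.loads(request)["args"]
response = json.dumps({"result": add(*args)})
result = json.loads(response)["result"]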

Managing change is onerous and time-consuming

  • Programming languages and core libraries change frequently. Some changes are critical security updates; some introduce breaking changes.
  • If the goal is to provide users with the programming facilities they are already accustomed to, users will reasonably expect the software to be kept up to date.
  • Users and stakeholders will want a lifecycle management policy so they can understand the frequency and cadence of breaking changes.
  • Language runtimes and library ecosystems follow different release schedules, use different versioning schemes, and provide different support and backward-compatibility commitments.
    • Maintainers of the system will need to keep up with all of them and have sufficient competence to decide how and when updates should be rolled out.
    • The Wikimedia Foundation deploys software as Debian Linux packages, so maintainers will also have to keep up with the pace of change of the Debian distribution.
    • Upgrades can introduce new versions of dependencies that may need to go through a review process to determine licensing compliance, maintainability, security, etc.
      • This is work that cannot be farmed out to contributors.
      • Managing these upgrades will be a drain on engineering resources.
  • If core components of Abstract Wikipedia use different runtimes, a breaking change to any of them can cause the system to malfunction.
    • Counterpoint [argument made by Denny]: by separating function specification from implementation, Wikifunctions encourages multiple implementations, so the system can be resilient to breaking changes by automatically falling back to alternative implementations (there is “referential transparency” across different implementations).
      • This, however, presupposes a massive duplication of effort: to achieve the necessary level of redundancy, every piece of the system would have to be implemented multiple times.
  • There will be constant churn and frequent update windows, because many more updates will need to be released.

Fragmentation

Not standardizing leads to fragmentation and duplication of effort. Standardizing on a single language would be a much more efficient way of growing a community of developers.

  • As we saw with Scribunto, when there is a common framework, people produce common documentation, tooling, etc. Expertise accumulates on how to write and debug code for the domain. Common idioms emerge and are turned into libraries.
  • Common functionality needs to be implemented in all languages. It is desirable to provide a library of helper methods appropriate for the domain of natural language generation from Wikidata data. Implementing these as Wikifunctions functions is not an option because of speed: Wikifunctions calls occur across a network boundary, and incur all the associated costs, such as network latency and the cost of data serialization and deserialization.

Questionable benefits

Aside from the various problems it introduces, we find that the case for the value added by supporting multiple languages is weak. If the goal of supporting multiple languages is to provide for users a programming environment that feels familiar and allows them to leverage their existing programming skills, we find that this goal is not fulfilled by the current design.

The design of Wikifunctions forces users to adapt to the Wikifunctions programming model. The purely-functional model means that parts of the standard library or common third-party libraries cannot be exposed (or would not work as expected). Users cannot use global state, and (given the security constraints placed on the function evaluator) cannot make network calls, etc.

Programming for Wikifunctions means having to work with the Z-Object system. Code that deals with Z-Objects looks unnatural and unidiomatic in every implementation language.

For example, consider the following function implementation, which increments a number:

def Z10001(Z10001K1):
    return ZObject(Z10001K1.Z1K1, Z10000K1=str(int(Z10001K1.Z10000K1) + 1))

Code like this is hard to read or write in any language. The fact that you can write it in Python (or JavaScript, or Lua, etc.) does not make the system more approachable. On the contrary: it makes the learning curve for using Wikifunctions steeper. We think most programmers would find it easier to adapt to a common programming language than to learn to write Z-Object code in a language they already know.
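
For contrast, here is the same operation written directly in the host language, leaving aside the Z-Object wrapping that Wikifunctions requires:

def increment(n):
    return n + 1

The gap between these two versions is accidental complexity introduced by the Z-Object representation, and it is paid by every implementation in every language.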

In discussions about this subject, we have sometimes heard it said that the complexity of the Z-Object system, while unfortunate, is not so bad, because the majority of Wikifunctions users will never see it, but will instead view and write code via a set of graphical interfaces that use representations that are easy to understand. We find this unconvincing for two reasons:

  1. Abstractions leak. While graphical interfaces can cover many use cases, users will encounter bugs in the system, and will have use cases that require them to become familiar with the underlying implementation. They will need some understanding of the underlying system, so it is important that it be comprehensible.
  2. If the focus of Wikifunctions is on the usability of the system for non-programmers who will interact with the system via graphical interfaces, that is all the more reason to choose an efficient, standardized language and type system for the backend, rather than inventing a new one.

In sum: support for multiple languages imposes an uncontrollable future maintenance and complexity burden on the project, which could easily be avoided by standardizing on one programming language.

In the next section, we consider the Wikifunctions function model.

Function model

By “function model” we mean the programming language of Wikifunctions that allows users to construct objects, express functions, and evaluate them. It is at once a new way to represent data in a language-independent manner and an entirely new programming language, designed both to run on its own and to interface with various existing programming languages.

Designing a new programming language from scratch is a huge endeavor that is not to be taken lightly, let alone with ambitious interoperability objectives such as these. Having a good initial design is crucial as it is very hard to change once the project is underway.

  • Once there is a body of programs written, changing the function model introduces breakage.
  • A recent example is the typed-list migration. It was much harder than expected, even though it was a relatively minor change to the model, and there were not yet even any existing users whose code could break.

Moreover, the function model does not build on established systems from existing languages and programming-language theory, but instead defines a completely new system with odd features that are not commonly found in other programming languages. While the system attempts to represent data and computation in a novel and universal way, the semantics of the language are unclear and hard to define. The documentation covers the syntax well but does not go into sufficient detail about how the various constructs are expected to behave. The implementation also has problems and cannot serve as a good reference. Initial design choices make it difficult to reconcile the various features, and attempts to give the language a well-defined meaning have encountered various obstacles.

Here are some of the problems that the function model suffers from:

  • The language mixes types and objects in an unrestricted way and allows arbitrary computations in types. This causes several problems, from circular definitions that are hard to untangle, to undecidable typing. While languages that mix types and terms do exist, they usually do so in a very careful and controlled manner, with extensive theory built on decades of research to show that they are well-behaved, which is not the case here.
  • The language confuses syntactic and semantic notions. Function calls and variables have their own types and those types drive the operational semantics of the language. This defeats the purpose of types and makes the definition of the semantics harder. It also causes problems of confluence in the language (the result of evaluation greatly depends on the order of evaluation and can give surprising results).
  • Other notions that are seemingly important for the semantics of the language, such as identity fields, are also represented as ordinary fields in ZObjects. Given that users can write whatever they want in those fields, this makes reasoning about the language hard.
  • The notion of validation is not well-defined, and attempts to define it encounter obstacles due to the evaluation strategy and the recursive definition of objects and types.
  • Types are bundled as part of the object, and not only at the top-level but in every sub-object as well. This leads to both validation and performance issues.

The problems above pertain to the design of the function model. However, there are also problems coming from the current implementation (less important than the above but still worth mentioning):

  • The current implementation is in JavaScript, a language poorly suited to writing an interpreter due to its dynamic nature and its inexpressive data type representations. This makes extensibility, refactoring, and maintenance hard.
  • The current implementation uses textual JSON to exchange data, which is inefficient. Even basic objects have huge representations (compare this, for example, to the very efficient representation of serialized protocol buffers); see the illustration below.
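
As an illustration (simplified, and not necessarily the exact wire format), even a single small value does not travel as itself but inside a typed envelope, with field names following the convention of the earlier code example:

# Illustrative only: a plain value versus the kind of typed envelope
# a Z-Object serialization wraps around it.
plain = "120"
wrapped = {
    "Z1K1": "Z6",   # type reference for the object (here: a string type)
    "Z6K1": "120",  # the actual value
}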

These problems are covered in great detail in the document Semantics of Wikifunctions, which attempts to define the semantics of the language, highlights the problems encountered, and gives some recommendations.

To summarize, creating a good programming language is hard, and having a good, clear initial design is crucial. The Wikifunctions model ignores decades of programming-language research and existing technology. Instead, it invents a completely new ad-hoc system that unfortunately does not seem to have good properties, and it is questionable whether it will be able to support a large, complex software system such as Abstract Wikipedia.

Developing a large software system collaboratively is challenging even with a good programming environment. If, in addition to the inherent complexity of the problem, contributors face usability and correctness issues and lack of stability in the language implementation, they are more likely to become frustrated and abandon the project. The task of implementing NLG components should be as hard as necessary, but not harder — contributors should not have to fight with the language to express things that are straightforward to do in existing programming environments.

Speed

In the previous sections, we have alluded to the performance problems that are incurred by the architecture of Wikifunctions. In this section, we want to briefly state why the performance of the system is important. At a basic level, Wikifunctions must make efficient use of hardware resources, and the latency of simple function calls should be low enough to avoid frustrating users and driving them away from the platform. More critically, however, if Wikifunctions is to serve as the execution environment for parts of Abstract Wikipedia, and if the goal for Abstract Wikipedia is to generate Wikipedia articles (or fragments of articles), the system must be able to keep pace with the rate at which Wikipedia content changes:

  • Users expect to be able to get information on major current events in near real-time (see for example the traffic spike immediately following the announcement of the death of Queen Elizabeth II).
  • Editors and administrators expect to be able to revert bad edits and correct erroneous information quickly, and for such changes to propagate quickly. (This has a legal dimension, too.)

In discussions with the team about the performance of Wikifunctions, two solutions are often mentioned: adding a caching layer for function calls, and rendering content in an asynchronous fashion (phab:T282585), outside of the request-serving path. Neither of these is a silver bullet.

Rendering content asynchronously is a complex change, and does not relieve the need to be able to generate content quickly. Any kind of delay in processing edits is unfortunate. It is tolerable for things like template updates because templates are not generally used to convey information on current events, and most templates are protected from vandalism, so there is less of a need for updates to propagate quickly. But Abstract Wikipedia is expected to generate article bodies (whole or in part), and changes to articles need to be processed quickly — in the order of milliseconds.

Caching could help make efficient use of computing resources, but it is not a silver bullet either. Because new content is (by definition) not yet cached, the system must be fast enough on uncached requests to meet the requirements above.

Scribunto

The plan for Abstract Wikipedia features a list of strong requirements for the project. We want to call out two items from that list in particular:

  • The setup for Abstract Wikipedia and Wikifunctions should blend into the current Wikimedia infrastructure as easily as possible. This means that we should fit into the same deployment, maintenance, and operations infrastructure as far as possible.
  • The primary goal is supporting local Wikipedias, Wikidata, and other Wikimedia projects, in this order. The secondary goal is to grow our own communities. Tertiary goals are to support the rest of the world.

In light of these requirements, it is important to remember that MediaWiki already provides a wiki-based programming environment: Scribunto. Scribunto replaced the ad-hoc and poorly-specified template programming facilities in MediaWiki with Lua, an easy-to-learn and general purpose programming language, originally developed in Brazil. With Scribunto, Wikimedia volunteers can write and interact with Lua code directly on wiki.

Scribunto is a success story. Since its initial deployment in 2013, thousands of Wikimedians have picked up Lua and have written code for many of the different language editions. Scribunto has also proven to be highly efficient and secure. It is also highly extensible. Notable among the different ways Scribunto has been extended over the years is the addition of interfaces for querying Wikidata.

Instead of attempting to invent so much from scratch, Wikifunctions could have been based on Scribunto. We think this is a huge missed opportunity. One of the longest-standing requests from the Wikimedia community has been to provide a means of reusing code across different projects by centralizing common code on a global wiki. This is a tractable, well-scoped problem (at least in comparison with Wikifunctions as currently conceived).

“V1” of Wikifunctions could have been a new, central wiki for common Lua code. Structuring the development plan for Wikifunctions in this way would have meant a much shorter, surer path to delivering value to Wikimedians. It would have satisfied a long-standing community request, and (we suspect) earned the team quite a lot of goodwill. The team could then innovate on top of this initial version by building new graphical interfaces for producing and interacting with code, and by extending the capabilities of querying Wikidata.

We are not sure why this was overlooked, particularly in light of the project requirements described above. In conversations with the Abstract Wikipedia team, team members suggested to us that this option was considered and rejected because Scribunto is an expert system and Lua is based on English, and that these properties make them incompatible with the Wikifunctions goal of broad accessibility. We think this is a bad argument, for the following reasons:

  • Lua has been successfully adopted by many projects whose primary language (as in, the language spoken by the users and developers) is not English. It is used successfully in hundreds of different language editions of Wikipedia.
  • Wikifunctions can provide localized, graphical user interfaces that generate Lua code behind the scenes (see Blockly, for example). Lua can also be used as a “transpilation” target for text-based programming languages (there is precedent for this on Wikimedia wikis). As an intermediate representation, Lua would be far more efficient and robust than an ad-hoc function composition language.

Recommendations

We would like Wikifunctions and Abstract Wikipedia to succeed. Both Abstract Wikipedia and Wikifunctions are ideas with legs, and could each deliver substantial value to the Wikimedia movement if realized successfully. They are also the first new wiki projects in ten years, and if they were to fail it could have a lasting chilling effect on innovation in the Wikimedia community.

We believe that in order to succeed, Abstract Wikipedia should be decoupled from Wikifunctions. The current tight coupling of the two projects has a multiplicative effect on risk and substantially increases the likelihood of failure. To that end, we present separate recommendations for the two projects.

Wikifunctions

  • Wikifunctions should extend, augment, and refine the existing programming facilities in MediaWiki. The initial version should be a central wiki for common Lua code. The team can then focus on improving accessibility by lowering the bar for contributions. If a strong case emerges for supporting additional languages, that support can be added later.
  • Don’t invent a new programming language. The cost of developing the function composition language to the required standard of stability, performance, and correctness is large, which diverts scarce resources from the project goals. It is better to base the system on an existing, proven language.
  • The Z-Object system as currently conceived introduces a vast amount of complexity to the system. If Wikifunctions consolidates on a single implementation language (as we believe it should), much of the need for Z-Objects goes away. If there is a need to extend the native type system provided by the chosen implementation language, it should be with a minimal set of types, which should be specified in native code. They likely do not need to be modifiable on wiki.

Abstract Wikipedia

  • The design of the NLG system should start with a specification of the Abstract Content, since it significantly constrains the design of the rest of the system.
  • Rather than present to users a general-purpose computation system and programming environment, provide an environment specifically dedicated to authoring abstract content, grammars, and NLG renderers in a constrained formalism.
  • Converge on a single, coherent approach to NLG.
  • If possible, adopt an extant NLG system and build on it. One such alternative, mentioned above, is Grammatical Framework, which already has a vibrant community of contributors.
  • Alternatively, create a new NLG system adapted to the scope and contributor-profile of Abstract Wikipedia, as previously suggested by Ariel Gutman.

We remain inspired by the visions of Abstract Wikipedia and Wikifunctions and wish success for these projects.

Note: A response to this document is at Abstract Wikipedia/Google.org Fellows evaluation - Answer

References

  1. Vrandečić, Denny (2020). "Architecture for a Multilingual Wikipedia". arXiv:2004.04733.