Abstract Wikipedia/Semantics of Wikifunctions

This document will provide a specification of the evaluation and validation rules of Wikifunctions.


The system behind Wikifunctions, known internally as “the function model”, is already described in this page. However, the description does not go into enough detail about the way objects are evaluated and validated, and it is not easy to guess the missing parts, as the model is quite atypical. Therefore it is important to clarify these ambiguities for the development of Wikifunctions. The end-result should hopefully be a shared understanding of the meaning of all the functions that our users will create.


  • Clarify the semantics of Wikifunctions and understand what properties the model has.
  • Have something to which we can refer to for what our implementation should be doing (instead of relying on “the implementation as the definition”).
  • Uncover potential problems in the model, and propose potential solutions.
  • Determine whether the model fits the goals of Wikifunctions and Abstract Wikipedia.

In order to achieve this, we need to:

  1. Understand what the original design had in mind and what the current implementation is doing.
  2. Identify cases that are conflicting and cases that are underspecified.
  3. Reach a common agreement on the correct intended behavior for those cases.

The result is a description of the evaluation and validation rules of Wikifunctions, which this document attempts to provide.

Intended audienceEdit

This document is intended for the Wikifunctions developers, programming language enthusiasts who are curious about the inner workings of Wikifunctions, and potentially power users who need to understand some edge cases. Normal users do not need to understand the function model to be able to intuitively use Wikifunctions through the UI. Some use-cases like higher-order functions are also currently (as of September 2022) not possible through the UI and can only be achieved by direct interaction with the backend, which requires a good understanding of the model.

The document assumes familiarity with the function model and its use of ZObjects (syntax, normal vs canonical form, built-in types, ...). These are covered in some detail in the Abstract Wikipedia pages:

We will be using a mixture of natural language and formal notations to describe the semantics of the model. Although not required, it would be useful to have some familiarity with operational semantics, the branch of programming language theory that describes the steps in the evaluation of programs. The basics are well documented online; see for example this comprehensive introduction: CSCI 131 Fall '17.


Wikifunctions allows users to define and run functions in various supported programming languages (currently Python and JavaScript). These are run by the evaluator service. On top of that, the system allows the user to define functions by composing other functions together in a language agnostic way. These are run by the orchestrator service.

This extra composition layer is in fact an entirely distinct language on its own. It is a complex system, invented for the purpose of Wikifunctions. As such, it is important to understand this language and give it a proper definition.

The syntax of the language is given by ZObjects. These represent pieces of data as well as programs (function calls) that should be evaluated to produce a result. The meaning of what these objects represent and what values they produce is the semantics of the language. Together, the syntax and the semantics define the system as a whole, which is called “the function model”.

While the syntax is covered to a good degree in Abstract Wikipedia pages, the semantics is much less clear. The function model has atypical features that are not commonly found in other languages, and it is not easy to guess how they work. In this document, we attempt to clarify what various constructs of the system mean and how they are evaluated.


The model was originally designed to satisfy the following requirements:

  • Purely functional: Functions should have no side-effects and should always return the same value. They should be referentially transparent.
  • First-class functions: It should be able to take functions as arguments, define local/anonymous functions and return functions as results. In particular, it should be possible to do currying.
  • Lexical scoping: A variable (called argument reference in the model) should evaluate to the original value it had when it was first defined. There should be no additional capture through dynamic scoping.
  • Turing complete: While this is easy to achieve (we allow global functions to call themselves), in particular it means that we should not restrict the computational power of the language through typing.

Quick introduction to operational semanticsEdit

Operational semantics describe the sequence of steps in the evaluation of a program. In this section we give a brief overview of how operational semantics look like for the lambda calculus. Readers familiar with it can skip it and jump to the Evaluation section.

Small-step rulesEdit

Small-step semantics define the evaluation in terms of single step relations  , meaning that the term   reduces in a single step to  . Full evaluation is the result of a sequence of such steps   where   is a value which cannot be evaluated further.

In  -calculus, evaluation rules are usually defined in terms of substitution:



The first is a rule without any assumption, and the second is a rule with one assumption, written above the horizontal line. These read as:

  • The application of an abstraction   to an argument   reduces in a single step to the body   where   is replaced by  .
  • If the function part   of an application   reduces to   then the application reduces to  ." In other words, we first evaluate the left subterm of an application recursively. This is necessary in sequences of applications (i.e. currying), where the function part must be evaluated first before we can evaluate the function call.

Big-step rulesEdit

The previous rules are so called “small-step” semantics because they define evaluation one step at a time. Evaluation can also be described with big-step semantics, which define evaluation all the way down to a value in one relation.


This reads as: “if   evaluates to the function   and if  , where   is replaced by  , evaluates to the value   then the application   evaluates to  ”.

Alternatively, the rule can be seen as a relation between input and output of an interpreter: “to evaluate the application  , first evaluate  , which gives  , then evaluate   where   is replaced by  , which gives the result  ”.

The two concepts of small-step and big-step semantics are linked together as one would expect: big-step evaluation is the result of small-step evaluation repeated until we get a final value.


(where   is a sequence of zero or more   steps)




(where   means there are no more   steps that can be applied)


Alternatively, rules can be equivalently defined in terms of evaluation environments (a.k.a. scopes) instead of substitutions: we evaluate a term   in an environment   that maps free variables to their value. The evaluation step becomes a relation on pairs of terms and their environment,  , meaning that the term   in the scope   reduces in a single step to the term   in the scope  . A pair of a term and its scope is called a closure, and   is itself a mapping from variables to closures. The rules for the   calculus can thus be written as:




(where   means   is extended with   mapping to the closure  , with newer additions of the same variable shadowing older ones)

Note that the results of evaluation here are closures. If the result of an evaluation is a  -abstraction, we do not evaluate inside the function, but instead keep the environment with it.

Notice also that, in an application, we add to the scope the argument   together with its original scope  , and that we restore that scope when retrieving the argument  . The precise bookkeeping of scopes is crucial to get correct results and avoid issues like dynamic scoping. An easy way to think about this is that each term should always be attached to its own environment, which captures the values of all of its free variables.

Environments can be shared but are immutable and they can be thought of as a persistent list. They can be extended to bind more variables for new terms but that does not modify the environment of terms that have been previously attached. Implementations can still use interior mutability in practice, for example to cache evaluation results, but the exterior immutability provides invariants that simplify reasoning about environments and their implementation.

Advantages of using environmentsEdit

While this definition might appear more complex, it is closer to how an interpreter for the lambda calculus would be implemented, as using environments is more efficient than substitutions. Evaluation using substitution requires traversing the same term multiple times, while evaluation using environments doesn’t (and in fact it can be shown that it only ever needs to operate on terms directly written by the user).

Another advantage is that there is no need to worry about alpha-renaming and free variable capture because there is no substitution or evaluation inside the body of functions.


All these different approaches to defining the operational semantics of the lambda calculus are equivalent; it can be proved that if a term   evaluates to a value   according to one of the semantics above then it also evaluates to   according to the others.[1] This is useful because it shows for example that the implementation of an interpreter with evaluation environments gives the correct result and does not change the meaning of programs.

Using those definitions, one can go further and show various properties of the language. For example, they can be used to prove progress (valid terms do not get stuck during evaluation) and type preservation (valid terms evaluate to a value of the same type), which together provide type safety. While we will not go as far with the function model, it goes to show the potential of taking a rigorous approach.


The concepts presented in this section are very useful for describing the semantics of programming languages in a precise and succinct way. They are standard in programming language theory and are well documented online. See for example https://cs.pomona.edu/~michael/courses/csci131f17/lec/Lec16.html for a comprehensive introduction.


We now try to define similar evaluation rules for Wikifunctions. The implementation mixes small-step (e.g. in the resolve() method) and big-step (e.g. in the execute() method) rules, so we will freely mix and match both styles. We also sometimes use normal form instead of canonical form as a short-hand in certain parts to save space, but unless explicitly stated otherwise, the rules are defined on objects in canonical form.

Use of types in evaluationEdit

The main thing to note about evaluation in Wikifunctions is that it is guided by the Z1K1/type field of ZObjects. These types are different from the traditional standard notion of types. In particular there are three special types – Z7/Function call, Z9/Reference, and Z18/Argument reference – that guide the evaluation. For example, if a ZObject has type Z7/Function call then it represents a function call, and will evaluate to the result of that function call (which might have a different type).

This is fundamentally different from traditional programming languages, where function calls and variables are syntactic constructs. They do not usually have their own type; a function call takes a function of type   and an argument of type   but the whole construct has type  ; a reference to an object of type   has type  , etc. This is the entire premise behind type theory and type checking.

The current representation as ZObjects leads to some weird situations and questions: can a function expect an argument of type Z7/Function call or of type Z18/Argument reference? What would that even mean? Doesn’t that break confluence?

One could say that we forbid such objects. In fact it is easier to think of these as special types that are only allowed to appear as literal references in the Z1K1/type field of ZObjects.


Evaluation takes place in a global scope   mapping Zids to ZObjects and a local scope   mapping keys to ZObjects. The global scope corresponds to the objects stored on MediaWiki and the local scope corresponds to the value of function arguments. The global scope is constant throughout the evaluation of a single object, while the local scope is extended as function calls are evaluated.


The first thing we need to answer before writing down the evaluation rules is, "What are the values of the language?" What do objects look like after they have been evaluated? These are called normal forms in programming language theory, but to avoid confusion with the notion of normal form syntax of ZObjects, we will just call these values.

As with lambda-calculus, we have to differentiate between different forms of values corresponding to the different stages of evaluation.

The first form we define corresponds to reduction at the outermost layer of a ZObject, and we will call this head-value. A ZObject is a head-value when it is either:

  1. A string literal, or
    where   is not Z9/Reference, Z18/Argument reference, or Z7/Function call

The second form we define corresponds to reduction in the subobjects, to ensure that all fields have been evaluated before we return the result to the user, and we call this full-value. An object is in full value form when it is either:

  1. A string literal, or
    a)   is not Z9/Reference, Z18/Argument reference, or Z7/Function call
    b) each other field   is a full-value, except if:
    i)   is an identity field (e.g. Z4K1/identity (Type), Z8K5/identity (Function), Z40K1 (Boolean))

Note that we do not require the type field to be itself evaluated. Otherwise this can lead to a lot of expansions, resulting in very big objects that are not very readable. Therefore, we only require that they do not trigger reduction of the outer object. This allows us to return objects where the type field is not fully evaluated, while retaining flexibility as to the degree of how evaluated it is (from partially to none at all).

The 2.b.i. condition is needed currently for technical reasons, but we might get rid of it in the future if we end up not relying on identity fields.

For example, the following is ZObject is a full-value representing the list containing the element “42”:

  Z1K1/type: {
    Z1K1/type: Z7/function call
    Z7K1/function: Z881/Typed list
    Z881K1/type: Z6/String
  K1: "42"
  K2: {
    Z1K1/type: {
      Z1K1/type: Z7/function call
      Z7K1/function: Z881/Typed list
      Z881K1/type: Z6/String

The following is a head-value but not a full-value because the inner field K1 is not a full-value as it is still a function that needs to be evaluated:

  Z1K1/type: {
    Z1K1/type: Z7/function call
    Z7K1/function: Z881/Typed list
    Z881K1/type: Z6/String
  K1: {
    Z1K1/type: {
      Z1K1/type: Z7/function call
      Z7K1/function: Z10000/Add
      Z10000/first number: "13"
      Z10000/second number: "29"
  K2: {
    Z1K1/type: {
      Z1K1/type: Z7/function call
      Z7K1/function: Z881/Typed list
      Z881K1/type: Z6/String

Evaluation rulesEdit

References and argument referencesEdit

We can describe the rules for references and argument references are as follows:



(Here Z18/Argument reference, in canonical notation to save space, stands for Z1K1/type : Z9/Reference, reference id : Z18/Argument reference)

These read as:

  • If the global environment   maps the ZID   to the zobject  , then evaluating the Z9/Reference   in a local environment   reduces to evaluating   in the empty local environment”.
    • Alternatively: “to evaluate a Z9/Reference   in a local environment  , lookup the ZObjects   that   maps to in the global environment, and proceed to evaluating   in the empty local environment”.
  • If the local environment   maps the argument key ID   to the closure  , then evaluating the argument reference   in the local environment   reduces to evaluating   in the local environment  .
    • Alternatively, “to evaluate a Z18/Argument reference  , lookup the closure   that   maps to in the local environment, and proceed to evaluating   in the local environment  ”.

Using the correct environment in the right place is crucial to getting the correct result. Notice how we clear the local environment when we dereference a global reference and restore the original environment when we dereference an argument reference. This has been the source of bugs in the past, and these rules help us make sense of the problem and fix and prevent such bugs (see task T309195 and its subtasks for example).

Literal reference types vs expanded typesEdit

See how we use the type field to determine that an object is a reference or an argument reference. Note that currently only literal ZIDs are accepted as the type in these rules. For example, Z18/Argument reference, which is itself a reference, must not be expanded to its definition. Otherwise the argument will not be resolved and the evaluation would get stuck. This might also have an impact on generics.

This also means that we cannot evaluate arbitrarily inside objects; the following simple property does not hold, as it would break confluence:


Would it be possible to have different rules such as the above holds? This might be possible by defining an equivalence relation on objects:



Because continuously expanding references would not terminate, this relation would need to be defined in a recursive way, comparing the left and right side, expanding them as needed if they don’t match and recursively comparing each field. This problem might not be decidable (Post correspondence problem?), in which case this strategy would only be an incomplete heuristic, but it would be similar to how types are compared in systems that allow computations in types.

Function callsEdit

The rule for function calls can roughly be described as


We first evaluate the Z7K1/function field, down to a Z8/Function. This is needed because it can be a reference or even itself a function application in the case of currying. We then resolve the Z8K/implementations field down to a list of Z14/Implementation. This requires in practice chaining multiple evaluations as lists are represented as a linked list of heads and tails but we do not need to go over that in detail here. Finally, an implementation is chosen and executed with the arguments of the function call added to the scope. We write this   and explain it in the next section.

The way the implementation   is chosen is not defined for now. Eventually it will take into account the results of testers as well as performance metrics.

As noted previously, there is no need to worry about alpha-renaming or free variable capture because we are using environments so there is no substitution or evaluation inside the body of functions.

Once again, it is important to use the correct scope in the correct context. Notice that evaluating the Z7K1/function field results in a modified environment  , and it is in that environment that the list of implementations must be resolved, as they may refer to free variables that have been added there. Similarly, resolving the list of implementations results in each implementation having its own scope, and it is to that scope that we should add the arguments of the function.


A Z14/implementation can be defined in one of three ways:

  • By composition (the Z14K2/composition field is set)
  • In a supported programming language (the Z14K3/code field is set)
  • As a builtin (the Z14K4/built field is set)

A composition implementation defines the body of a function as the ZObject that it should return, possibly referring to the arguments of the function through Z18/function argument.

This is where local scopes come into play. To evaluate such an implementation, we just add the arguments to the current scope, and evaluate the body.



Code implementations are defined in a supported programming language such as Python or Javascript. The evaluation of such an implementation is done by delegating to the interpreter or compiler of that language.

Arguments are serialized to the target language (an operation we write as  ), the code is executed, and the result is deserialized back to a ZObject in Wikifunctions (an operation we write as  ). As such, it is a form of compilation or denotational semantics.


For this, arguments need to be fully evaluated. No closures can remain, and as such, some types cannot be represented, such as functions. We can fail if we ever encounter such an argument.

The serialization of basic types such as strings is straightforward, but more complex types are currently just translated as records representing the ZObjects. This is probably not very usable in its current form and there is probably a lot of room for improvements there (libraries? custom serialization/deserialization?).


Built-in implementations are just entry points for the interpreter to do arbitrary computations. They allow the definition of basic building blocks that would be hard, impossible, or too inefficient to define using composition alone.

We can define rules for each of the builtin implementations (for example, the builtin implementation for the Z802/if function evaluates the first argument, and if that argument is Z41/true, reduces to the second argument, otherwise, it reduces to the third argument), but we will not go into details here.

Generic typesEdit

A generic type is represented by a function that returns a type. For example, the type of a list of strings is given by the application of a function Z881/Typed list to the type Z6/String, which returns the definition of the type of lists of strings.

This raises the question: what happens if the returned type object is Z7/Function call, Z9/References, or Z18/Argument reference? Should it trigger the corresponding evaluation rule? The same question comes up if the type is a Z9/Reference to one of these special types, or a Z18/Argument reference that resolves to one of these special types.

An easy answer to this would be no, that the special type should be there from the start. However, this breaks confluence. While one could say that this is an improper use of ZObjects, the result does come off as surprising. In particular, this effect is used in one of the propositions for generic function types.

An alternative could be to resolve the type field and trigger the corresponding rule if it resolves to Z7/Function call, Z9/References, or Z18/Argument reference.


This can become a bit tricky. In particular:

  • We need to modify the definition of values to disallow non-resolved Z1K1/type fields.
  • Z7/Function call, Z9/References, or Z18/Argument reference are themselves references that could be expanded, so we need to prevent that, for example by not expanding builtin types.
  • Z1K1/type field might now be more expanded than we would have liked.
  • We have also deliberately left local scopes out for readability but they should also be carefully handled.

For example, here is an alternate definition of head values that could work for this purpose:

  1. A string literal, or
  2.   where
    a)   does not reduce to Z9/Reference, Z18/Argument reference, or Z7/Function call

And here is the alternate definition of full values:

  1. A string literal, or
  2.   where
    a)   does not reduce to Z9/Reference, Z18/Argument reference, or Z7/Function call, and
    b) each other field   is a full-value, except if:
    i)   is an identity field (e.g. Z4K1/identity (Type), Z8K5/identity (Function), Z40K1 (Boolean)).

Head-reduction and full evaluationEdit

We define head-reduction as the union of the rules for references, argument references, and function calls:  

We repeatedly apply head-reduction for as long as possible. Once there are no head-reduction rules that can apply, we have a head value. We then need to recursively evaluate each field to obtain a full value:



Notice that we do not evaluate the Z1K1/type field. Doing so naively would lead to infinite recursion, as there will always be a reference to expand (e.g. Z4/Type). Alternatively, we could define a different evaluation rule for types, which would not expand references to builtin types, but that remains to be done. Moreover, it might be more readable to return types in an unevaluated form.

Evaluation strategyEdit

The evaluation strategy corresponds roughly to normal-order reduction to weak head-normal form, which informally means argument evaluation is call-by-name (except for native code and some builtins). Optimizations can be done in the implementation as long as it gives the same result. Currently the orchestrator uses caching to avoid evaluating the same argument twice (similar to Haskell thunks); this makes the strategy call-by-need (a.k.a. lazy).

Laziness is necessary because “if” expressions are represented as function calls. Some fields are also not meant to ever be reduced, such as identity fields (Z4K1, Z8K5, etc.), and Z1K1/type fields also benefit from not being fully resolved to avoid huge objects.

An eager/strict call-by-value evaluation strategy could also be done. We would have to be careful how to handle fields that need to be lazy; for example, by hiding them behind lambdas (thunks) or introducing some native support for laziness.

A strict evaluation strategy would have some consequences on the result because of errors (see next section); if evaluating an argument fails we fail the entire computation, but in call-by-name we would not fail if the argument is not used.


Errors in Wikifunctions are themselves structured ZObjects. Errors are represented as Z5/Error and a success result or error is represented as Z22/Evaluation result.

If at any point we encounter an error (e.g. an expected field is missing, an object fails validation, or even some internal problem such as the evaluator backend not being reachable!), we produce an error and propagate it all the way to the top, possibly chaining it with other errors. This makes errors similar to exceptions.

Note however that currently it is not possible to intercept errors (i.e. exception handling). A function that expects a Z22/Evaluation result does not intercept errors.

It is possible to give precise semantics for errors but we feel it is not worth going into the details at this point and leave this implicit.


Confluence describes whether evaluating terms in different ways always yields the same result. It is a fundamental property of lambda calculus, and gives the freedom to choose any evaluation strategy we want. In particular it allows us to evaluate any subterm without worrying about the evaluation order.

While we do think that the Wikifunctions model is mostly confluent, there are multiple reasons why confluence does not always hold:

  • If evaluating a function argument yields an error, the error is propagated. However with lazy evaluation that argument might not be evaluated, therefore missing the error entirely.
  • Expanding the Z1K1 field of a reference, argument reference, or function call might prevent that object from further evaluating because we do not recognize the type anymore, as described previously.
  • Using weird argument types in functions such as Z9/Reference, Z18/ArgumentReference, and Z7/Function call would cause validation to fail if we evaluate the argument before validating it, as described previously.

Non-confluence can be partly addressed by specifying the evaluation order, and in some sense this is what we have been doing so far in this document. For example, currently the system does lazy evaluation of function arguments. However it does not address all problems, and whether this evaluation strategy is the right one remains to be decided.

In particular lazy evaluation could lead to surprising results in the presence of side effects (such as errors) or non-confluence, as a slight change of the evaluation order (due e.g. to refactoring of a term) could lead to different results. A strict call–by-value strategy might be more appropriate.


Use of types in validationEdit

Most compiled languages (e.g. C++, Java, Haskell) have statically-checked types (i.e. checked ahead of evaluation). These provide type safety. Most interpreted languages (e.g. Python, Javascript) only check types at runtime (if at all). These provide very little guarantees; nothing prevents you from eventually finding a string as an argument to your number addition during evaluation.

In Wikifunctions, ZObjects are validated against Z4/Type objects to make sure they are well-formed. A ZObject is valid with respect to a type when:

  1. It has the fields described in the Z4K2/keys of the type, with those fields being recursively valid.
    • It might not be necessary to have all the fields described in the type; they could be optional, in particular to accommodate union types. For example, this is currently used in Lists, where a list object either has both K1/head and K2/tail fields, or neither, to indicate that it is empty.
    • It should not have fields that are not described in the type though, unless it is a Z7/Function call. That is because function arguments are supplied as fields in the object.
  2. It passes the Z4K3/validator of that type (for example the type of positive integers would verify that the value corresponds indeed to a positive integer).

The first case can be seen as a special builtin version of the second case (and is in fact how it is currently implemented).

Since a validator can be any arbitrary function, validation has to be performed dynamically at runtime and is in general undecidable.

In addition, types are also useful for other reasons:

  • They help users when writing and debugging implementations.
  • They guide the UI in how to display ZObjects to the user.
  • They can theoretically be used by the system itself, e.g. for optimizations.

Validation rulesEdit

NOTE: Validation is currently underspecified and ambiguous. It is not clear what it means for an object to be valid and how validation should be done in a lazy evaluation setting.

There are four different situations where an object can be validated:

  • Against its own Z1K1/type, to make sure it has the type that it says that it has.
  • Against the declared type of the field in the parent object.
  • In a function call, to make sure that arguments have the type that the function expects in its signature.
  • When a function returns, to make sure that the returned object has the type that the function declares in its signature.

In each of these, we validate one ZObject against one type. We represent this by the following relation:  , noting that the type can have its own scope  , which could be different from the scope of the object  .


The main rule for default validation is:


That is, we evaluate the type   into a type definition   to extract the list of key declarations   (notice that each key declaration has its own scope), and we recursively check that each field in the object validates against the type in the key declaration.

Notice that we completely ignore the Z1K1/type field here. That is because we can validate the ZObject against different types, for example when passing the object to a function. The type that is checked against therefore depends on the context.

Non-resolved objectsEdit

Because the evaluation strategy is call-by-name (a.k.a. lazy), objects might not be fully resolved during validation. It is not clear how to handle this.

This problem might be avoidable in a call-by-value (a.k.a. eager, strict) evaluation strategy, because we could then be sure that we always run validation on fully evaluated objects.


Notice that when we validate objects against a type, we do not validate that the type is itself a valid Z4/Type. In systems with higher-order/dependent types, this is addressed by either adding a check on the type (e.g. when it comes from the user), or showing via meta-theoretic argument that the type being used must necessarily be valid (e.g. if   then it must be the case that  ), thereby avoiding an infinite validation problem.

Whether a similar result can be achieved in the Wikifunctions model remains to be determined. We could add constraints that validate the type against Z4, but we need to be careful to do it in a constrained enough way to avoid infinite validation.

Compositional validationEdit

The validation rules can lead to a lot of re-validation of the same object, which leads to performance issues. It is not clear how to do validation in a compositional way, that is where the validation of an object in one context can be re-used in another context. This is something we are currently working on.

One way to do this is to have a notion of type equivalence: we only validate an object once, when we evaluate it, for example against its declared type, then whenever we need to validate it against another type (for example when we pass the ZObject as an argument to a function), we just check that the types are equivalent and avoid validating the object again. This is similar to the common typing rule in dependently-typed lambda calculus:


(If   is a function of type   and the argument   has type   then if   is equivalent to   then the function application   is well-typed and has type  )

In its simplest form, equivalence could be checking that the types are equal, field by field, however this would probably be too restrictive. Another way could be to check the identity field of the types, but not all types have a well-defined identity.

An alternative is to recursively define an equivalence relation that potentially evaluates either side when needed (this is similar to what is done in dependently-typed languages). It would only be an approximation as the general problem is probably undecidable, but it might be good enough in practice. This is still work in progress.

Relation to other type systemsEdit

Even though there is no static type checking, and despite the types in the function model not corresponding to the usual notion of types in type theory, it is still interesting to ask what types we can express and what other type systems are similar, in particular those of the lambda calculus.

Given that the type of Z4/Type is Z4/Type and that we allow arbitrary computations in types, this corresponds roughly to the lambda-star system (the universal pure type system; see also system U). This means that type-checking is in general undecidable. Other type systems (e.g. system F, which is the basis for functional programming languages such as Haskell and OCaml, or systems with dependent types such as the Coq theorem prover, etc.) avoid this by restricting the types that can be formed in a way that ensures that computation in types always terminates.

It is interesting to note that all forms of type dependencies (i.e. dimensions of the lambda cube) are actually in use:

  • Types that depend on types, to form generic types such as TypedList,
  • Terms that depend on types, to form polymorphic functions such as those that operate on TypedLists[2],
  • Types that depend on terms, to form the type of error values from Error Types.

It is not clear what properties we can derive from the types. For example, type preservation clearly does not hold, as mentioned in a previous section. It would be interesting to see if one can define an alternative system on top of ZObjects to do proper type checking, and see what properties it would have.

Observations and recommendationsEdit

We attempted to provide a specification for the semantics of the function model by studying the implementation of the orchestrator and bringing little modification to it. A lot of difficulties were encountered along the way, some stemming from fundamental design decisions.

There are various changes one could bring to the design of the function model and its implementation, varying in scope and impact. The present section attempts to summarize them. It is the sole opinion of the author.

The ZObjects systemEdit

A lot of complexity comes from using ZObject as the basis for everything. Mixing types for syntactic and semantic purposes can be confusing and introduces a lot of complexity. The questions and problems that arise from the Z7/Function call, Z9/Reference, and Z18/Argument reference are difficult to resolve. Having types be just like any other ZObject requires arbitrary computation for any form of type checking. Despite the drive for simplicity, even type uniqueness is not guaranteed (hence the Z1 type). All of this adds a lot of complexity for little apparent benefit.

It would be much simpler to take a standard and proven system rather than invent a new esoteric one with little guarantees. Various systems based on the lambda calculus exist. For the purposes of Wikifunctions, if polymorphism is needed then something like system F could be sufficient (more precisely system Fω to also include type operators, or a weaker variant such as Hindley–Milner although type inference is not needed). Moreover, it has been successfully extended with various mechanisms such as algebraic datatypes (e.g. lists), records, exceptions for handling errors, etc. that would fit the needs of Wikifunctions.

Separating objects and types into different layers could greatly simplify the system. As tempting as it is to represent objects and types using the same concepts, it introduces a lot of complexity, and systems with dependent types have extensive metatheoretic results and smart implementations developed to make them work.

Using different syntactic constructs than ZObjects to represent function calls, references, and argument references would also be beneficial. It would help dealing with the confluence problems and give less surprising results.

At the very least, I think the following points should be considered and consciously decided upon:

  • Non-confluent cases: are we ok with having them? What should the behavior be?
  • Evaluation strategy: is the lazy strategy the one we want? Is it part of the specification? If not, how do we reconcile and deal with the non-confluent cases?
  • How should lazy fields be treated? Are identity fields really necessary or would it be better to let that be a meta-property that is computed and kept track of by the orchestrator?
  • The intended semantics of validation: what does validation mean? What should the behavior be on partially evaluated objects? Would a strict evaluation strategy be more well-suited? Should types themselves be validated?

Choice of programming languageEdit

Javascript is not well-suited at all for writing production-quality interpreters (especially for complex systems such as this one): it has dynamic type checking (if at all); there is very little code hygiene and invariants we can depend on; it makes refactoring very difficult to get right, at least without overly extensive testing.

Typescript would be an obvious first alternative and one that I would strongly encourage the team to try. While I do not have experience with it, the similarity with Javascript would obviously help, and having stronger type guarantees would go a very long way in providing a maintainable implementation. It might even be possible to adopt it gradually (e.g. by incrementally refining the type annotations), which would also smooth the migration process.

No builtin sum/union types also make it difficult to represent ASTs. Functional programming languages are much better suited for that (ML was even invented with that purpose in mind), and even modern imperative languages such as Rust support that. However, not enough experience in the team or support in the Wikimedia ecosystem could be a big hurdle in adopting them.

A possible alternative I came across could be Reason, which appears to be a javascript-like OCaml dialect. From its documentation, “Reason is a programming language powered by OCaml's strong type system, and has a syntax designed to feel familiar to people coming from JavaScript or C-family languages. [...] The powerful type system underlying Reason will reduce bugs, increase maintainability, and improve the refactorabilty of your code.” Again, not prior experience here, but something that might be worth looking into.

Structure of the interpreterEdit

The orchestrator is currently written in a top-down recursive style. This is natural and intuitive as it is close to what we think of when we evaluate a program. However it is important to note that this might lead to stack overflows in long evaluations.

There are various techniques that can be used to avoid this. In particular converting the orchestrator code to continuation-passing style would make the code tail-recursive, which would already be enough in languages with tail-recursion optimization. Defunctionalization would go one step further and get rid of higher-order functions. The result would essentially be the implementation of an abstract machine for the language, which can be implemented efficiently in an imperative paradigm without using recursion.

Note that the resulting code might be significantly less intuitive than the original and harder to modify (without going through the transformation again). Therefore I would only resort to these techniques if addressing performance issues such as stack overflows becomes a priority.


  1. We did not go into all the details, but if the body is a function, obviously the results will not be exactly the same when using environments since with environments it results in a closure. In practice, there are two main ways this is dealt with: either the result is opaque to the user (e.g. in Python it is displayed as <function <lambda> at 0x7f6b3cfa1ea0>), or the body of the function is opened and recursively evaluated through a process called reification but which needs careful alpha-renaming to avoid variable capture (e.g. in Coq and other systems with dependent types).
  2. Although in order for this to truly be the case we are still missing generic function types.