Grants:IEG/Lua libs for acceptance test-driven development

IEG/Lua libs for acceptance test-driven development
summary: Create acceptance tests as Gherkin-snippets on wikipages.
target: All Wikimedia-projects that use (testing of) Lua modules.
strategic priority: increase participation (in Lua code quality)
amount: Total amount requested is USD 12,400
contact: jeblad(_AT_)
created on: 22:06, 5 April 2016 (UTC)
This proposal depends on the outcome of Grants:IEG/Lua libs for behavior-driven development, and the uptake of proper testing by the communities. It is likely that there should be some time for the communities to adapt their behaviour before this project is started, as this extends testing so users can make inferences about the libs.

Project idea


What is the problem you're trying to solve?


To let the developer and the community communicate in an efficient, clear and distinct way, some kind of acceptance test should be used. Raw spec code is something for the developer; we need something simpler, something that a non-developer can fiddle with and that gives understandable results. The system used should also give the impression that "you can fiddle with me, it won't break anything!" This can be done with Gherkin, which is a language made for this kind of testing.

(Note that when we move from tests into the realm of the user community, users will be more interested in results from templates than from Lua modules. That implies that we need methods to facilitate testing of content after template expansion, but how to do that is slightly outside the scope of this project.)

What is your solution?


Because Gherkin is an extremely simple Domain Specific Language (also called "Darn Simple Language") we should be able to implement something similar as a tag-function which then builds a JSON-package and feeds that into some Lua-function, possibly the BDD-module. Out of this we would get a short report from the module, possibly with an option to expand it into a larger report.
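As a rough illustration of how little parsing is needed, the sketch below reads a snippet line by line and builds a JSON package. It is written in Python purely for illustration; the real tag function would live in a PHP extension or in Lua. The function name `parse_snippet`, the layout of the package, and the sample scenario are assumptions made for this example, and only a handful of Gherkin keywords are handled.

```python
import json

# Hypothetical keyword set; real Gherkin defines more (Background, Examples, ...).
STEP_KEYWORDS = ("Given", "When", "Then", "And", "But")


def parse_snippet(text):
    """Parse a small Gherkin snippet line by line into a plain dict."""
    package = {"feature": None, "scenario": None, "steps": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("Feature:"):
            package["feature"] = line[len("Feature:"):].strip()
        elif line.startswith("Scenario:"):
            package["scenario"] = line[len("Scenario:"):].strip()
        else:
            keyword, _, rest = line.partition(" ")
            if keyword in STEP_KEYWORDS:
                package["steps"].append({"keyword": keyword, "text": rest.strip()})
            else:
                # Unknown lines are kept, but without a keyword.
                package["steps"].append({"keyword": None, "text": line})
    return package


# Made-up scenario, used only to exercise the parser.
SNIPPET = """
Feature: Date formatting
  Scenario: Format an ISO date
    Given the input "2016-04-05"
    When the module formats the date
    Then the output is "5 April 2016"
"""

# The JSON string is what would be handed over to the Lua side.
package_json = json.dumps(parse_snippet(SNIPPET), ensure_ascii=False)
```

Keeping the package a flat list of keyword and text pairs means the receiving module needs no knowledge of Gherkin syntax, only of the step texts it supports.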

The idea is to have a tag function that goes on any wikipage and holds a short Gherkin snippet describing a test run by a specific test fixture. The test fixture is named through an attribute and will typically invoke a Lua module to do the actual testing. When the Gherkin code is shown by the tag function, parts of the code will go green or red, or use other visual cues to show what works and what does not. (The most interesting solution seems to be to color code the keywords.)
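Color coding the keywords could look roughly like the following sketch, again in Python for illustration only. The status names, the color mapping and the inline `style` attributes are assumptions; an actual extension would more likely emit CSS classes.

```python
from html import escape

# Hypothetical mapping from step status to a display color.
STATUS_COLORS = {"passed": "green", "failed": "red", "unsupported": "gray"}


def render_step(keyword, text, status):
    """Return one HTML line in which only the keyword is colored."""
    color = STATUS_COLORS.get(status, "gray")
    return '<span style="color:{}">{}</span> {}'.format(
        color, escape(keyword), escape(text)
    )


def render_snippet(steps):
    """Render (keyword, text, status) triples as line-separated HTML."""
    return "<br>\n".join(render_step(k, t, s) for k, t, s in steps)


html_out = render_snippet([
    ("Given", 'the input "2016-04-05"', "passed"),
    ("When", "the module formats the date", "passed"),
    ("Then", 'the output is "5 April 2016"', "failed"),
])
```

Coloring only the keyword leaves the step text itself readable and unchanged, so the snippet still looks like what the user wrote.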

As the Wikimedia-universe contains lots of languages it is important that the chosen solution can be internationalized. The Gherkin language is defined for a lot of languages, and we can reuse work already done.[1]
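A minimal sketch of how localized keywords could be mapped to canonical step types is given below, in Python for illustration. The table is an illustrative subset written for this example; the Norwegian entries are meant to mirror the existing Gherkin translation and should be checked against the Gherkin language definitions before use. The function name `canonical_keyword` is hypothetical.

```python
# Illustrative subset only; the authoritative lists are maintained by the Gherkin project.
KEYWORDS = {
    "en": {"Given": "given", "When": "when", "Then": "then", "And": "and", "But": "but"},
    "no": {"Gitt": "given", "Når": "when", "Så": "then", "Og": "and", "Men": "but"},
}


def canonical_keyword(line, lang):
    """Map the leading localized keyword of a step line to a canonical name.

    Returns (canonical name, rest of line), or (None, whole line) when the
    first word is not a known keyword in the given language.
    """
    stripped = line.strip()
    word, _, rest = stripped.partition(" ")
    kind = KEYWORDS.get(lang, {}).get(word)
    if kind is None:
        return (None, stripped)
    return (kind, rest)
```

With such a table the test fixture only ever sees the canonical names, so the same Lua module can serve snippets written in any supported language.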

Project goals


Improve participation in defining goals (defining acceptance tests) for the modules and templates, increase the overall code quality, and make it possible to maintain a minimum code quality for our modules and templates. As such this project is about making a testing environment for the community.

Note that code quality is more than mere testing, but this project is limited to testing alone and then with a focus on testing within the community realm.

The primary goal and outcome of the project would be a very simple method that makes it possible for users in the community to fiddle with tests and the outcome of tests. The purpose is to enable the community to define what is acceptable behavior of a system, and the criteria for that acceptable behavior. If they write supported tests, they should immediately get a response that is either red or green; if the tests are not supported, they should go gray. If the tests go gray, the users can ask a developer to help them set up the test, or even make a claim that the system is defunct in some way.
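The three outcomes can be sketched as follows, in Python for illustration. The fixture interface (a mapping from step text to a zero-argument check) is a hypothetical simplification, and treating a crashing check as red is a design choice made for this example, not something the proposal specifies.

```python
def classify(step_text, fixture_steps):
    """Return 'green', 'red' or 'gray' for a single step.

    fixture_steps maps supported step texts to zero-argument callables
    that return True on success (a hypothetical fixture interface).
    """
    check = fixture_steps.get(step_text)
    if check is None:
        return "gray"  # the fixture does not support this step
    try:
        return "green" if check() else "red"
    except Exception:
        return "red"  # a crashing check is counted as a failure


# Made-up fixture with one passing, one failing and one crashing step.
fixture = {
    "the sum is 4": lambda: 2 + 2 == 4,
    "the sum is 5": lambda: 2 + 2 == 5,
    "the check crashes": lambda: 1 / 0,
}
```

A step the fixture has never heard of comes back gray rather than red, which is what lets a user tell "this failed" apart from "nobody has set this up yet".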

Even if it would be necessary to do some adjustments in the outcome of Grants:IEG/Lua libs for behavior-driven development, those code changes are only a consequence of our attempts to make a working test environment.

The author is not aware of any alternative libs for acceptance testing within the Wikimedia-projects.


Users in the community might not accept new solutions
This is probably my biggest concern. It is often rephrased as "Not invented here". To avoid this, it could be an idea to make some showcases that directly compare different ways to write tests. One thing in favor of the proposed solution is that the number of such tests the users can fiddle with is close to zero, if any exist at all.
The new code might not be finished within time
Always a possibility, and I'm very good at overestimating my own work progress. It seems to me that most of this should be doable within the given time.
The new code might pose a security risk
Always a possibility, but the real security risk is within the PHP extension code, which will be limited. The Lua code itself should pose no more risk than existing code; note that Lua modules can already be edited by all. There will also be a limited security risk within the JavaScript code.
The new code might create too high a load on the system
Testing can be quite heavy, even if it happens on a limited number of pages. The tag function can, however, be included on transcludable pages and then spread out over a large number of pages, running the same heavy computation on all of them. To avoid this scenario, the content of the tag could be used as an id for a caching mechanism, and previous results reused if possible.
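A minimal sketch of such content-based caching follows, in Python for illustration. The in-memory dictionary stands in for whatever cache the wiki would provide, and the function names, the whitespace normalization and the choice of SHA-256 are assumptions made for this example.

```python
import hashlib

_cache = {}  # in-memory stand-in for the wiki's real cache


def cache_key(fixture_name, snippet):
    """Derive a stable id from the fixture name and the tag content."""
    normalized = "\n".join(
        line.strip() for line in snippet.strip().splitlines()
    )
    data = (fixture_name + "\0" + normalized).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def cached_run(fixture_name, snippet, run):
    """Run the test only if no result is stored for this snippet."""
    key = cache_key(fixture_name, snippet)
    if key not in _cache:
        _cache[key] = run(snippet)
    return _cache[key]
```

A key built from the tag content alone would keep serving an old result after the module under test has changed, so a real implementation would probably also need to include something like the module's revision in the key, or purge entries when the module is edited.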

Project plan



Note that this project depends on completion of Grants:IEG/Lua libs for behavior-driven development, and would probably be a renewal as described on Grants:IEG/Procedures#Renewals.
  1. Refactor BDD-code to return an executable test fixture (now it returns the result)
  2. Add the tag function for Gherkin
  3. Adapt the language primitives from Gherkin to the localization files (avoid duplicate work by the translators)
  4. Add caching for the Gherkin snippet and the result
  5. Build the line-wise parser
  6. Export parsed JSON-data and run it through Lua
  7. Present evaluated result (loop over test data)
  8. Categorize result (use tracking categories)

An alternative to doing steps 5 – 7 in PHP is to strip out the Gherkin code and feed it into Lua as it is. That would be slightly slower, but would open for more reuse within Wikimedia-projects.
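The refactoring in step 1 can be pictured as the difference between the two functions below, sketched in Python for illustration. Both function names and the shape of the result are hypothetical and do not describe the actual interface of the BDD module.

```python
def describe_returning_result(name, check):
    """Current style as described in step 1: run at once, return the outcome."""
    return {"name": name, "passed": bool(check())}


def describe_returning_fixture(name, check):
    """Refactored style: return a callable that runs the check on demand."""
    def fixture():
        return {"name": name, "passed": bool(check())}
    return fixture


# Record each execution so the deferred behavior is visible.
runs = []


def sample_check():
    runs.append("ran")
    return True


fixture = describe_returning_fixture("adds numbers", sample_check)
executed_before_call = len(runs)  # nothing has run yet at this point
result = fixture()                # the check runs only now
```

Returning a fixture instead of a result lets the tag function decide when, and how often, a test is actually executed, which is also what makes caching of results possible.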



The estimated workload is about 3 full-time person-months for an experienced developer; or 6 person-months at 50 %. This workload estimation is based on the main developer's previous experience with similar projects.

Budget breakdown

Item | Description | Commitment | Person-months | Cost
Main developer | Developing and releasing proposed code | Part time (50 %) | 6 | USD 12,400
There is no co-funding
Total | | | | USD 12,400

The item cost is computed as follows: the main developer's gross salary (including 35 % Norwegian income tax) is estimated from the pay given to similar projects using Norwegian standard salaries,[2] the current exchange rate of 1 NOK = 0.120649 USD, and a quarter of a year's full-time work.
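For readers who want to check the figures, the small Python calculation below converts the requested total back into NOK using the quoted exchange rate. The NOK amounts are derived here from the numbers above and are not figures stated in the proposal.

```python
USD_PER_NOK = 0.120649   # exchange rate quoted in the proposal
TOTAL_USD = 12_400       # total amount requested
FULL_TIME_MONTHS = 3     # a quarter of a year of full-time work

# Implied gross amounts in NOK, in total and per full-time month.
total_nok = TOTAL_USD / USD_PER_NOK
monthly_nok = total_nok / FULL_TIME_MONTHS

print(round(total_nok), round(monthly_nok))
```

This puts the request at roughly NOK 102,800 in total for the three full-time months.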

Community engagement


Other than code review it is not expected that the community in general will participate very much in the development phase.

Feedback on the very limited UI is expected to be necessary, as is help with translation of system messages. The messages will be quite simple, and many of them can probably be reused from other open projects. (I.e. we will use many of the same messages as in Gherkin.[1])

If time permits, some tests could be set up as examples on a few core modules. That would be very interesting, as it would be an enabler for the community in further testing of other modules.



The code will be available on-wiki, and possibly also in the code repo if an extension is necessary. Because of this the code is assumed to be maintained by the community.

Code will be developed with re-usability and maintainability criteria in mind. Thus the code will be documented and manuals and initial tutorials and examples will be made.

Measures of success

  • An initial, functional version of a usable testing environment, run from a discussion page (this indicates a working solution)
  • The Lua module is self-testing with a reasonable coverage (this indicates completion of development, aka "done")
  • A count of how many acceptance tests are in use (this indicates usage after development)

Get involved



  • Jeblad – I'm a wikipedian with a cand.sci. in mathematical modeling, and started editing on Wikipedia during the summer of 2005.

Community notification


Please paste links below to where relevant communities have been notified of your proposal, and to any other relevant community discussions.



Do you think this project should be selected for an Individual Engagement Grant? Please add your name and rationale for endorsing this project below! (Other constructive feedback is welcome on the discussion page).