Grants:Project/Experimental endpoints for wikistats

Experimental endpoints for wikistats
summary: A clearly defined configuration for setting up a development environment for new endpoints, in particular endpoints that publish data from Druid for reuse in Wikistats2. The development environment should be designed so that it is easy to integrate into the production environment.
target: All wiki projects that need additional metrics not provided by the current setup.
type of grant: tools and software
amount: USD 15,000
type of applicant: individual
contact: jeblad(_AT_)
created on: 12:59, 30 November 2018 (UTC)

Note: this project is written in response to difficulties observed with Grants:IdeaLab/Measure replacement rate among the admins, and could be incomplete or even completely wrong. I am, however, fairly confident that the description is correct.

And so it goes: see wikitech:Nova Resource:Analytics and OpenStack browser: Project: Analytics. Part of it is already done; only the write-up and the Vagrant instance are missing.

Project idea


What is the problem you're trying to solve?


What problem are you trying to solve by doing this project? This problem should be small enough that you expect it to be completely or mostly resolved by the end of this project. Remember to review the tutorial for tips on how to answer this question.

This project started as an attempt to build an additional metric for replacement rate among the admins. (It is described at Grants:IdeaLab/Measure replacement rate among the admins.) During work on that proposal it became clear that the WMF server used for endpoints is closed to other contributors. Development of new metrics is thus effectively blocked unless it is done in a separate environment, but there is no proper description of such an environment.

This project idea is about how to set up such an environment, and to make a few endpoints as test cases.

What is your solution?


For the problem you identified in the previous section, briefly describe how you would like to address this problem. We recognize that there are many ways to solve a problem. We’d like to understand why you chose this particular solution, and why you think it is worth pursuing. Remember to review the tutorial for tips on how to answer this question.

Apart from simply dumping a dev folder somewhere on the machine, which is quite common, perhaps the most common way to make a development environment is to set up a Vagrant environment and expose the necessary folders as NFS mounts.

There should be a complete and exhaustive description of how to set up such a local development environment, and the environment should be portable to Toolforge and/or Cloud VPS as a puppetized instance. (See wikitech:Help:At a glance: Cloud VPS and Toolforge; most likely it will be a Cloud VPS instance.) In order of priority, the focus is first the Vagrant development instance for private use, then a Toolforge/Cloud VPS instance for lightweight public use, and finally, as an option, use on the WMF-private Druid server.

It might be necessary to limit the Vagrant development instance, or to provide additional tweaks, as some of the services are quite resource-demanding. A working development environment could, for example, run natively for the duration of heavy tests. Such an environment would probably include Druid, Hadoop, and optionally Kafka and/or Hive.

There must be some kind of mechanism to harvest the configuration and produce a single configuration for the WMF-private server; otherwise it would be necessary to manually adapt the code before it can be reused.
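One way to picture such a harvesting mechanism is as layered configuration: the development defaults are kept in one place, and a small production layer overrides only what differs. The following is a minimal sketch under that assumption, not a description of any existing WMF tooling. All keys, URLs, and names (`merge_config`, `harvest`, `DEFAULTS`, `PRODUCTION`) are hypothetical.

```python
from copy import deepcopy


def merge_config(base, override):
    """Return a new dict with `override` layered recursively on top of `base`."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists, and new keys simply replace the base value.
            merged[key] = deepcopy(value)
    return merged


def harvest(defaults, *layers):
    """Fold any number of override layers into a single configuration."""
    result = deepcopy(defaults)
    for layer in layers:
        result = merge_config(result, layer)
    return result


# Hypothetical development defaults, e.g. as used inside the Vagrant instance.
DEFAULTS = {
    "druid": {"broker_url": "http://localhost:8082", "datasource": "example_edits"},
    "endpoint": {"port": 8080},
}

# Hypothetical production layer: only the values that differ are listed.
PRODUCTION = {
    "druid": {"broker_url": "http://druid.example.invalid:8082"},
}

config = harvest(DEFAULTS, PRODUCTION)
```

With this layout the endpoint code reads only from `config`, so moving from the development instance to another server would mean swapping the override layer rather than editing the code.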

Project goals


What are your goals for this project? Your goals should describe the top two or three benefits that will come out of your project. These should be benefits to the Wikimedia projects or Wikimedia communities. They should not be benefits to you individually. Remember to review the tutorial for tips on how to answer this question.

  • Make a sufficient development environment for providing new endpoints for community metrics, and
    • provide a few basic health metrics that can be used as good examples
    • make it feasible for contributors to test and develop new metrics

Project impact


How will you know if you have met your goals?


For each of your goals, we’d like you to answer the following questions:

  1. During your project, what will you do to achieve this goal? (These are your outputs.)
  2. Once your project is over, how will it continue to positively impact the Wikimedia community or projects? (These are your outcomes.)

For each of your answers, think about how you will capture this information. Will you capture it with a survey? With a story? Will you measure it with a number? Remember, if you plan to measure a number, you will need to set a numeric target in your proposal (e.g. 45 people, 10 articles, 100 scanned documents). Remember to review the tutorial for tips on how to answer this question.

  1. A page exists that describes clearly and concisely how to create a basic development environment
    1. I will write this page
    2. Other users will use this page when they start developing a new metric
  2. An environment exists that implements at least one of the algorithms from admin replacement rate
    1. I will code this
    2. Other users will use this code during initial work on algorithms for other metrics
  3. An environment exists that implements at least one of the endpoints from admin replacement rate
    1. I will code this
    2. Other users will use this code during initial work on endpoints for other metrics
  4. A portable instance is provided for Toolforge/Cloud VPS
    1. I will code this
    2. Other users will use this code during initial work on portable instances for other metrics
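To give an idea of what the endpoint work in item 3 could involve, an endpoint backed by Druid would at some point have to build a native Druid query. The sketch below only constructs the JSON body of a timeseries query and does not contact any server. The datasource name `example_admin_events` and the helper `build_timeseries_query` are hypothetical, and the actual admin replacement rate calculation is not shown.

```python
import json
from datetime import date


def build_timeseries_query(datasource, start, end, granularity="month"):
    """Build the body of a Druid native timeseries query as a plain dict."""
    return {
        "queryType": "timeseries",
        "dataSource": datasource,
        "granularity": granularity,
        # Druid expects ISO 8601 intervals of the form "start/end".
        "intervals": [f"{start.isoformat()}/{end.isoformat()}"],
        # The "count" aggregator counts Druid rows, which may be rolled up
        # and therefore differ from the number of raw events ingested.
        "aggregations": [{"type": "count", "name": "events"}],
    }


# Hypothetical datasource name, for illustration only.
query = build_timeseries_query(
    "example_admin_events", date(2018, 1, 1), date(2019, 1, 1)
)
payload = json.dumps(query)
```

In a running environment the payload would be sent as an HTTP POST with content type `application/json` to the `/druid/v2/` path of a Druid broker; which broker that is would depend on the environment the endpoint runs in.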

Do you have any goals around participation or content?


Are any of your goals related to increasing participation within the Wikimedia movement, or increasing/improving the content on Wikimedia projects? If so, we ask that you look through these three metrics, and include any that are relevant to your project. Please set a numeric target against the metrics, if applicable. Remember to review the tutorial for tips on how to answer this question.

No, not at this point.

Project plan




Tell us how you'll carry out your project. What will you and other organizers spend your time doing? What will you have done at the end of your project? How will you follow-up with people that are involved with your project?

This project is strictly limited to building and testing a development environment. Building specific metrics is slightly outside the scope, but will be done to a very limited degree as examples. Any spare time will be used to implement health metrics.



How will you use the funds you are requesting? List bullet points for each expense. (You can create a table later if needed.) Don’t forget to include a total amount, and update this amount in the Probox at the top of your page too!

Developer: 15,000 USD – this will cover three months full time or six months half time

Community engagement


Community input and participation helps make projects successful. How will you let others in your community know about your project? Why are you targeting a specific audience? How will you engage the community you’re aiming to serve during your project?

I am not sure there is any specific audience other than individuals who are interested in statistics and specific metrics. Participation will be easier when there is a well-defined development environment, and especially if there are specific endpoints with the necessary metrics.

Get involved




Please use this section to tell us more about who is working on this project. For each member of the team, please describe any project-related skills, experience, or other background you have that might help contribute to making this idea a success.

Some users expressed interest in the health metrics proposal Grants:IdeaLab/Measure replacement rate among the admins, but this project is not the same, and whether they are still interested is unknown.

Community notification


Please paste links below to where relevant communities have been notified of your proposal, and to any other relevant community discussions. You are responsible for notifying relevant communities of your proposal, so that they can help you! Depending on your project, notification may be most appropriate on a Village Pump, talk page, mailing list, etc. Need notification tips?



Do you think this project should be selected for a Project Grant? Please add your name and rationale for endorsing this project below! (Other constructive feedback is welcome on the discussion page).