Grants:Project/WCDO/Culture Gap Monthly Monitoring/Timeline/Extension

1. Centralizing Diversity Projects and Resources in One Page

For the project to grow, I propose to widen the scope and aim at analyzing and studying diversity in all the senses of the term, not only cultural diversity.

Therefore, the project would change its name to Wikipedia Diversity Observatory.

It would also include other projects, and it would focus more on communicating the tools and putting people in touch than on presenting research and explaining the methods.

This implies creating a new page on Meta for the project and moving the previous contents there. The new page would include a directory of projects working on understanding diversity in people, content, and Wikimedia spaces and channels in general. These could be research projects, surveys, WikiProjects, etc. The research carried out by this project would be only one part of the page.

2. Extending Database/Datasets with New Types of Diversity

We propose to collect data on all types of diversity that have been identified by the Wikimedia 2030 Diversity WG but are not included in the current Wikimedia Diversity datasets, analyses, and visualizations. We consider five of them especially important:

  • LGTB,
  • Contemporary Ethnic groups (e.g. Romani),
  • Indigenous peoples,
  • Religious groups,
  • Time (as in century/age).

All of these types of diversity are currently gaining attention and have dedicated events in the Movement (e.g. the Wikimedia LGTB+ conference and International Roma Day, among others).

Updating the current databases and datasets with these new types of diversity (and with gender, which has already been partially incorporated) would make new analyses possible. For example, it would allow creating more data visualizations that explain the gaps, as well as tools that offer points of action to address them.

3. New Tools and Dashboards (Visualize & Bridge the Gaps)

3.1 Wikipedia content gaps (article level)

  • LGTB Gap and Top LGTB Articles dashboards. One dashboard would show the percentage of biographies that can be considered LGTB. Another would show the most relevant LGTB articles by local context, in line with and extending the Top CCC articles lists, which show relevant articles from every local context.

The purpose of these dashboards would be to extend the framework already used for cultural articles and to provide statistics and tools to bridge the gaps. It would also be the first analysis of LGTB-related articles, and there is interest in carrying it out across languages.

  • Ethnic Groups and Indigenous People dashboards. The previous analyses could be applied to these two categories of human groups to show their extent, coverage, and spread across languages. These dashboards would also provide lists of relevant articles on these topics, for every geographical context and globally, that could be created or improved in every Wikipedia. This would be an important point of action for any offline or online event working on these topics.
  • Recent changes and recently created articles related to diversity groups dashboard. One dashboard would list articles with recent changes (edits) and recently created articles, with their potential membership in diversity groups (cultures, countries, gender, etc.) mapped to different colors so that they are easy to recognize.
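As a rough illustration of the gap measurement behind these dashboards, the share of biographies belonging to a diversity group could be computed per language edition along the following lines. This is only a sketch under stated assumptions: the record layout (`lang`, `is_biography`, `groups`), the function name `group_share_by_language`, and the toy data are hypothetical and do not reflect the project's actual database schema.

```python
from collections import defaultdict


def group_share_by_language(articles, group):
    """Return {lang: percentage of biographies tagged with `group`}.

    `articles` is an iterable of dicts with the hypothetical keys
    'lang', 'is_biography' and 'groups' (a set of group labels).
    """
    totals = defaultdict(int)  # biographies per language
    tagged = defaultdict(int)  # biographies in `group` per language
    for article in articles:
        if not article["is_biography"]:
            continue  # only biographies count towards this gap
        totals[article["lang"]] += 1
        if group in article["groups"]:
            tagged[article["lang"]] += 1
    return {lang: 100.0 * tagged[lang] / totals[lang] for lang in totals}


# Toy data, invented for illustration only.
sample = [
    {"lang": "ca", "is_biography": True, "groups": {"lgtb"}},
    {"lang": "ca", "is_biography": True, "groups": set()},
    {"lang": "ca", "is_biography": False, "groups": {"lgtb"}},
    {"lang": "en", "is_biography": True, "groups": set()},
]
shares = group_share_by_language(sample, "lgtb")
print(shares)
```

The same function could be reused for the other groups (ethnic groups, indigenous peoples, religious groups) by changing the `group` label, assuming the dataset tags articles in a comparable way.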

3.2 Point of view gaps (in-article level)

It would be valuable to create a series of dashboards that give information on gaps at the in-article level, i.e. the perspectives that are missing from articles.

  • Gender and LGTB article content biases dashboards. One dashboard would compare the percentage of outlinks to Gender and LGTB topics for any group of articles, in order to reveal biases or, in other words, to detect missing perspectives.

The purpose of this dashboard would be to allow comparing these percentages across languages and to see which language editions' articles are more biased and miss more perspectives because they have fewer links to Gender and LGTB topics (i.e. are less diverse).
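The outlink comparison described above could be sketched as follows. The function names `outlink_share` and `compare_languages`, the input layout (a list of outlink targets per language and a set of topic article titles per language), and the toy data are all assumptions made for illustration, not the project's actual implementation.

```python
def outlink_share(outlinks, topic_articles):
    """Percentage of `outlinks` that point to an article in `topic_articles`."""
    if not outlinks:
        return 0.0  # no outlinks at all: report 0 rather than divide by zero
    hits = sum(1 for target in outlinks if target in topic_articles)
    return 100.0 * hits / len(outlinks)


def compare_languages(outlinks_by_lang, topic_by_lang):
    """Return [(lang, share)] sorted from the lowest share to the highest."""
    shares = {
        lang: outlink_share(links, topic_by_lang.get(lang, set()))
        for lang, links in outlinks_by_lang.items()
    }
    return sorted(shares.items(), key=lambda item: item[1])


# Toy data, invented for illustration only.
outlinks_by_lang = {
    "en": ["A", "B", "C", "D"],  # outlinks of the selected group of articles
    "es": ["A", "B"],
}
topic_by_lang = {
    "en": {"A"},       # articles considered Gender/LGTB topics in that language
    "es": {"A", "B"},
}
ranking = compare_languages(outlinks_by_lang, topic_by_lang)
print(ranking)
```

Sorting from the lowest share upwards puts the language editions with the fewest links to these topics first, which is where the dashboard would point to the largest missing-perspective gap.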

4. Maintenance and Optimization

It is necessary to allocate some time to optimizing parts of the code in order to improve both data acquisition and the functioning of the visualizations and tools. It is also important to consider that some technical issues may be outside our control, as they relate to the WMF data resources (e.g. database connections, reading dump input/output, etc.).

Budget

1. Centralizing Diversity Projects and Resources

(approx. 40 hours)

2. Data for New Types of Diversity

(approx. 250 hours)

3. New Tools and Dashboards

3.1 Wikipedia content gaps (article level)

* LGTB Gap and Top LGTB Articles measurements and dashboards.

(approx. 120 hours)

* Ethnic Groups and Indigenous People measurements and dashboards.

(approx. 100 hours)

3.2 Point of view gaps (in-article level)

* Gender and LGTB article content biases measurements and dashboards.

(approx. 120 hours)

4. Maintenance, Optimization, Upgrading Tools (Performance to Usability)

(approx. 200 hours)

Total: Estimated 830 hours


To be carried out over five months, from June to October.