Research:Prioritization of Wikipedia Articles/Recommendation/Suggested Edits

Suggested Edits is a module on the Android app that recommends images to which a caption or tags can be added, or Wikidata items to which a description can be added. Each type of recommendation has two main pipelines: 1) add, where images/items that are relevant to a given language edition but lack the caption/tags/description are recommended at random, and 2) translate, where the same process occurs but a caption/description in another language serves as the source to translate. The focus below is on the add pipeline: the translate pipeline has less usage and depends on the source and target languages, so it is not as succinctly evaluated (though my assumption is that it follows very similar patterns to the add pipeline).

Python scripts for simulating this pipeline to generate the statistics below and Jupyter notebooks for evaluating the actual edit data can be found here: https://github.com/geohci/wiki-prioritization/tree/master/recommendation_evaluation

A summary of the analyses described below is that relying largely on a random selection of content for recommendation reinforces the status quo around gender and geography -- i.e. a heavy imbalance towards men, the United States, and the United Kingdom -- and therefore the net effect of the recommender is to improve content about men more than content about women or other gender identities, and content about the US/UK more than content about other regions. The exact regions improved depend heavily on language -- i.e. US/UK for English Wikipedia but Japan for Japanese Wikipedia or Germany for German Wikipedia. There is a little evidence that editors also exert geographic selection bias -- i.e. slightly preferring to edit content about some regions over others -- but the effect is not large. For gender, there is no indication that the gender identity associated with the recommended content affects whether editors choose to make an edit.

Add Wikidata Description

AWD Candidate Generation

The add-a-Wikidata-description recommendation pipeline is as follows (percentages are the likelihood that a given input at that stage passes that stage's filter criteria and are based on an analysis of 2500 random enwiki candidates):

Across the whole pipeline, 2500 candidate articles would yield ~450 Wikidata items (18% retained) that can be recommended; most of the filtering happens because the Wikidata item or Wikipedia article already has a description (e.g., via shortdesc or something similar). Of the 2500 Wikidata description candidates, ~720 (29%) would be items about people and ~82% of those people would be men. Of the ~450 items that can be recommended, 45 (10%) would be about people (Wikidata items about people are more likely to have existing descriptions, at least on enwiki) and the proportion of men does not seem to shift. Of the 2500 candidates, 1400 (56%) would have region information (25% United States; 9% United Kingdom; 5% India; 4% France; 4% Australia; 4% Canada; 3% Germany; 3% Japan; 3% Iran; 2% Italy; 2% Poland; and a continuing long tail). Of the ~450 that can be recommended, we see no large shift in the proportion with region information nor in the regions that make up the list. Translation of the Wikidata description has the same parameters as above plus the requirement that the article/description exist in the source language.

AWD Edits

Between 18 May 2020 (roll-out of V4 of Suggested Edits) and 30 November 2020 (latest MediaWiki History snapshot), 4,545 editors made 42,533 edits to 40,260 different items on Wikidata for English-language descriptions. Of these 40,260 items, 40,172 (99.8%) are still in use on at least one Wikipedia language edition. Of those 40,260 different items, only 6,025 (15.0%) are biographies, which indicates a bias by editors against adding descriptions to articles about people. Of those 40,260 different items, 20,842 (52%) have associated regions and they received 21,945 edits.

The 6,025 items associated with biographies and English Wikipedia edited via Suggested Edits received a total of 6,352 edits. This is very close to 1 edit per item and there is no clear bias with regard to gender or geography, so the rest of this analysis will focus just on items and not on how many times they were edited. 4,831 (80.2%) of the items were about men, 1,195 (19.8%) of the items were about women, and 1 (0.0%) item was about a transgender woman. These percentages are in line with the percentage of recommendations associated with each gender identity, suggesting that there is no editor bias towards taking recommendations associated with particular gender identities.

Of the 20,842 items associated with regions, 5,520 (26%) were associated with the United States, 1,862 (9%) with the United Kingdom, 1,647 (8%) with India, 765 (4%) with Canada, 658 (3%) with Australia, 556 (3%) with France, 464 (2%) with Japan, 464 (2%) with Italy, 463 (2%) with Germany, and a long tail from there. This tracks pretty closely with the percentage of recommendations associated with each region (with perhaps a slight bias towards India, though more data is needed), suggesting that editors in general do not selectively skip recommendations based on the associated region.
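The comparison between edited and recommended regions can be made explicit with a short calculation. The sketch below is illustrative and is not the code from the linked repository: it takes the edited item counts reported above and the approximate region shares from the 2500-candidate enwiki sample, and prints the gap in percentage points. The names `edited_counts`, `recommended_pct`, and `share_gap` are hypothetical.

```python
# Illustrative sketch; all names are hypothetical and not taken from the project repository.
TOTAL_EDITED_WITH_REGION = 20842  # edited items with an associated region

# Edited item counts per region, as reported above.
edited_counts = {
    "United States": 5520,
    "United Kingdom": 1862,
    "India": 1647,
    "Canada": 765,
    "Australia": 658,
    "France": 556,
    "Japan": 464,
    "Italy": 464,
    "Germany": 463,
}

# Approximate share (%) among candidates with region information,
# from the 2500-candidate enwiki sample (rounded values).
recommended_pct = {
    "United States": 25, "United Kingdom": 9, "India": 5, "Canada": 4,
    "Australia": 4, "France": 4, "Japan": 3, "Italy": 2, "Germany": 3,
}

def share_gap(counts, total, baseline_pct):
    """Return {region: edited share minus recommended share, in percentage points}."""
    return {
        region: 100 * n / total - baseline_pct[region]
        for region, n in counts.items()
    }

gaps = share_gap(edited_counts, TOTAL_EDITED_WITH_REGION, recommended_pct)
for region, gap in sorted(gaps.items(), key=lambda kv: -kv[1]):
    print(f"{region:15s} {gap:+.1f} pp")
```

Because the baseline shares are rounded percentages from a small sample, gaps of a point or so are within noise; with these inputs only India stands out, at roughly three percentage points above its recommended share.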

The revert rate showed no significant difference between items about men (245 of 5091 edits; 4.8%) and items about women (64 of 1260 edits; 5.1%). The single edit to an item about a transgender woman was not reverted. The revert rate for regions was even lower (around 3% depending on the region, with only items related to Japan standing out with an 8% revert rate: 38 of 500 edits). Notably, the revert rate in languages outside of English was much lower (generally around 1%, except for items associated with Turkey at 5%: 212 of 3990).
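One way to check a claim like "no significant difference" is a two-proportion z-test on the reverted/total counts. The source does not say which test was used, so the sketch below is an assumption about method, written with the standard library only; the function name `two_proportion_z_test` is hypothetical.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Reverted edits / total edits for items about men and about women, as reported above.
z, p_value = two_proportion_z_test(245, 5091, 64, 1260)
print(f"z = {z:.2f}, p = {p_value:.2f}")
```

With these counts the p-value comes out well above 0.05, which is consistent with reading the 4.8% and 5.1% revert rates as statistically indistinguishable.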

The English-language descriptions make up just one-fifth of all the data (40,172 of 197,540 items; 42,533 of 226,047 edits; 6,025 of 45,324 biographies). Notably, biographies are more likely to be edited in other languages (presumably because they are more likely to lack a description there than in English). The data on gender of biography and revert rate for other languages show the same trends. For geography, the language of the description greatly affects the results -- e.g., descriptions added in Japanese are most likely to be about items associated with Japan because Japanese Wikipedia has more content about Japan than about the United States.

Add Image Caption

AIC Candidate Generation

The add-an-image-caption recommendation pipeline is as follows (percentages are the likelihood that a given input at that stage passes that stage's filter criteria and are based on an analysis of 2500 random enwiki candidates):

Across the whole pipeline, 2500 random candidate images would be filtered down to ~2375 (95%) images that can be recommended for captions. Of the 5% that are filtered out, about half are non-images (e.g., audio files) and half already have captions. However, for enwiki, only 150 (6%) of these recommended images were currently transcluded in an English Wikipedia article. Of those 150 articles that do transclude one of the images, 35 would be about people and 80% of those are men. Of those 150 articles, 89 would have associated country information (sometimes multiple regions -- e.g., for people born in one country but who lived elsewhere). While the small numbers mean that these proportions could shift, ~20% would be about the United States and ~10% about the United Kingdom. Then less than 5% each would be about Australia, Japan, France, Germany, Canada, Italy, Spain, Norway, and a long tail of others. Translation of image captions has the same parameters/process as above plus the requirement that a caption exists in the source language.

AIC Edits

Between 18 May 2020 (roll-out of V4 of Suggested Edits) and 30 November 2020 (latest MediaWiki History snapshot), 14,116 editors made 86,397 edits to 68,987 different images on Commons. Of these 68,987 images, 20,717 (30.0%) are in use on English Wikipedia, which is far greater than the 6% in use among images served by the recommender (suggesting that users do bias heavily towards taking recommendations for images that are in use on the wikis, even without that information being explicitly available to them in the app). Of those 20,717 in-use images, 7,478 (36.1%) are associated with biographies. This is substantially higher than the 23.3% of in-use images that are associated with biographies in the recommendations, suggesting a further bias towards images related to people.
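The size of these two selection effects can be expressed as a ratio of the observed share to the share expected from the recommender. The sketch below is illustrative: the baseline uses 150 of 2500 simulated candidates (matching the stated 6%) and 35 of those 150 as biographies (23.3%), and the helper name `ratio_of_shares` is hypothetical.

```python
# Illustrative sketch; variable and function names are hypothetical.
edited_images = 68987
edited_in_use = 20717       # edited images in use on English Wikipedia
edited_in_use_bios = 7478   # of those, images associated with biographies

# Baseline from the 2500-candidate simulation (assumed denominators:
# 150 of 2500 in use, 35 of those 150 associated with biographies).
sim_candidates, sim_in_use, sim_in_use_bios = 2500, 150, 35

def ratio_of_shares(observed_part, observed_total, baseline_part, baseline_total):
    """How many times larger the observed share is than the baseline share."""
    return (observed_part / observed_total) / (baseline_part / baseline_total)

in_use_lift = ratio_of_shares(edited_in_use, edited_images, sim_in_use, sim_candidates)
bio_lift = ratio_of_shares(edited_in_use_bios, edited_in_use, sim_in_use_bios, sim_in_use)
print(f"in-use lift: {in_use_lift:.1f}x, biography lift: {bio_lift:.2f}x")
```

Under these assumptions, edited images are about five times as likely to be in use as recommended images, and in-use edited images are about one and a half times as likely to be associated with biographies. The baseline rests on only 150 simulated in-use images, so the second ratio in particular is a rough estimate.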

The 7,478 images associated with biographies edited via Suggested Edits received a total of 10,640 edits. The images were associated with 6,312 (84.4%) articles about men, 1,158 (15.5%) articles about women, 4 (0.1%) articles about non-binary individuals, and 4 (0.1%) articles about transgender women. Of the 10,640 edits, 8,972 (84.3%) went to images associated with articles about men, 1,659 (15.6%) went to images associated with articles about women, 4 (0.0%) went to images associated with articles about non-binary individuals, and 5 (0.0%) went to images associated with articles about transgender women. These percentages are in line with the percentage of recommendations associated with each gender identity, suggesting that there is no editor bias towards taking recommendations associated with particular gender identities.

Additionally, 13,894 images associated with articles with region information received a total of 19,507 edits. The images were associated with the following regions: United States (30% of images; 30% of edits), India (12% of images; 14% of edits), United Kingdom (12% of images; 13% of edits), France (10% of images; 11% of edits), Germany (8% of images; 9% of edits), Italy (7% of images; 8% of edits), Canada (5% of images; 6% of edits), Russia (5% of images; 6% of edits), Spain (5% of images; 6% of edits), Japan (4% of images; 5% of edits), and a long tail thereafter. The consistently slightly higher percentages for edits than for images seem to come from images that are associated with more regions being edited more often. The general ordering and proportions for each region are roughly in line with the likely recommendations, with perhaps a slight bias towards the United States and India, but more data would be required to establish that.

The revert rate showed a small but statistically insignificant difference (at the 95% confidence level) between images associated with articles about men (555 of 8972; 6.2%) and images associated with articles about women (117 of 1659; 7.1%). None of the 9 edits to images associated with articles about non-binary or transgender individuals were reverted. There was no clear pattern in revert rate by region, but it does range from 5% (Italy; France) to 10% (South Korea; Argentina), with the United States at 9%.
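A confidence interval for the difference in revert rates shows why the gap between 6.2% and 7.1% is not treated as meaningful. The source does not state how its interval was computed, so the sketch below assumes a simple Wald interval with an unpooled standard error; the function name `diff_ci_95` is hypothetical.

```python
import math

def diff_ci_95(x1, n1, x2, n2):
    """Wald 95% confidence interval for p2 - p1 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p2 - p1
    margin = 1.96 * se  # 1.96 is the two-sided 95% normal quantile
    return diff - margin, diff + margin

# Reverted / total edits: images on articles about men, then about women.
low, high = diff_ci_95(555, 8972, 117, 1659)
print(f"95% CI for the difference: [{100 * low:.1f}, {100 * high:.1f}] percentage points")
```

The interval spans zero, so under this approximation the data do not rule out equal revert rates for the two groups.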