User:Ijon/Content gap

I was curious about the content gap in articles about men vs. articles about women, so I pulled up some data through Wikidata.

Big caveat: only entities that (1) have Wikidata items, (2) are known to Wikidata as humans, and (3) have either 'male' or 'female' as their gender are counted here. This undoubtedly leaves out quite a few people (especially people with no interwikis, for whom a Wikidata item may never have been created), so read each count as a lower bound ("at least X"). The proportions, however, should still be statistically valid. (I guesstimate the margin of error introduced by these shortcomings to be generally under 10%, and lower on the larger wikis.)
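To make the three criteria concrete, here is a minimal sketch of a query that applies them. It is not necessarily how the numbers on this page were produced: the helper name `build_gender_count_query` is hypothetical, and the use of SPARQL against the Wikidata Query Service is an assumption. The identifiers themselves are the standard Wikidata ones (P31 "instance of", Q5 "human", P21 "sex or gender", Q6581097 "male", Q6581072 "female").

```python
# Wikidata identifiers used below:
#   P31 = instance of, Q5 = human, P21 = sex or gender,
#   Q6581097 = male, Q6581072 = female


def build_gender_count_query(lang_code: str) -> str:
    """Build a SPARQL query counting biographies of men and women on one Wikipedia.

    Hypothetical helper for illustration; the counts on this page may have
    been gathered with a different tool.
    """
    site = f"https://{lang_code}.wikipedia.org/"
    return f"""
SELECT ?gender (COUNT(DISTINCT ?item) AS ?count) WHERE {{
  ?article schema:about ?item ;
           schema:isPartOf <{site}> .
  ?item wdt:P31 wd:Q5 ;
        wdt:P21 ?gender .
  VALUES ?gender {{ wd:Q6581097 wd:Q6581072 }}
}}
GROUP BY ?gender
""".strip()


# Print the query for Hebrew Wikipedia; it can be pasted into a SPARQL client.
print(build_gender_count_query("he"))
```

The sketch only builds the query text. On the larger wikis a query of this shape may exceed the public query service's time limit, so a dump-based count could be needed instead.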

Data as of June 15th 2016

Language | # of articles about men | # of articles about women | % of articles about women out of all humans
ne | 1372 | 1306 | ~48.8%
ur | 5026 | 2689 | ~34.9%
ko | 44062 | 13321 | ~23.2%
te | 2720 | 793 | ~22.6%
ja | 181416 | 50902 | ~21.9%
pa | 3186 | 874 | ~21.5%
no | 102911 | 27090 | ~20.8%
or | 759 | 197 | ~20.6%
sv | 149318 | 37349 | ~20.0%
ml | 6643 | 1653 | ~19.9%
vi | 17640 | 3854 | ~17.9%
ms | 9470 | 2023 | ~17.6%
sr | 25446 | 5488 | ~17.7%
fi | 103570 | 22017 | ~17.5%
es | 234635 | 49551 | ~17.4%
zh | 99140 | 20832 | ~17.3%
hi | 8534 | 1756 | ~17.1%
et | 30567 | 6297 | ~17.1%
id | 35432 | 7283 | ~17.0%
he | 47988 | 9727 | ~16.8%
kn | 1975 | 397 | ~16.7%
pt | 151340 | 30058 | ~16.5%
en | 1147981 | 224258 | ~16.3%
nl | 154224 | 29931 | ~16.2%
bg | 46162 | 8682 | ~15.8%
fr | 398382 | 74074 | ~15.6%
tr | 51804 | 9539 | ~15.6%
pl | 243590 | 44358 | ~15.4%
bn | 7880 | 1420 | ~15.2%
ar | 76379 | 13617 | ~15.1%
de | 520916 | 93043 | ~15.1%
it | 259088 | 44159 | ~14.6%
lv | 13719 | 2243 | ~14.1%
mr | 8784 | 1413 | ~13.9%
gu | 1059 | 171 | ~13.9%
ru | 278018 | 44477 | ~13.8%
hy | 20644 | 3271 | ~13.6%
uk | 97021 | 15076 | ~13.4%
hu | 69788 | 10658 | ~13.2%
ca | 97236 | 14511 | ~13.0%
ta | 13668 | 1997 | ~12.7%
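The last column of the table above follows from the two count columns: women divided by men plus women. A minimal sketch of that arithmetic, with a hypothetical function name and three rows copied from the table:

```python
def share_of_women(men: int, women: int) -> float:
    """Percentage of counted biographies that are about women.

    The denominator is men + women, i.e. only the people Wikidata records
    as 'male' or 'female', matching the caveat at the top of the page.
    """
    return 100 * women / (men + women)


# Rows copied from the June 2016 table: language code -> (men, women)
rows = {
    "ne": (1372, 1306),
    "en": (1147981, 224258),
    "ta": (13668, 1997),
}

for code, (men, women) in rows.items():
    print(f"{code}: ~{share_of_women(men, women):.1f}%")
# ne: ~48.8%
# en: ~16.3%
# ta: ~12.7%
```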

Data as of February 2020

Language | # of articles about men | # of articles about women | % of articles about women out of all humans | Change in percentage points since June 2016
Estonian | 38316 | 9252 | 19.45% | 2.35
Nepali | 2618 | 1818 | 40.98% | -7.82
Hebrew | 63286 | 15581 | 19.76% | 2.96
Urdu | 9906 | 3762 | 27.52% | -7.38
Korean | 72484 | 23476 | 24.46% | 1.26
Telugu | 4204 | 1463 | 25.82% | 3.22
Japanese | 242323 | 68042 | 21.92% | 0.02
Punjabi | 5604 | 3509 | 38.51% | 17.01
Norwegian | 125785 | 38715 | 23.53% | 2.73
Odia | 1780 | 944 | 34.65% | 14.05
Swedish | 180137 | 48563 | 21.23% | 1.23
Malayalam | 9901 | 4506 | 31.28% | 11.38
Vietnamese | 42694 | 14597 | 25.48% | 7.58
Serbian | 35410 | 8303 | 18.99% | 1.29
Malay | 21495 | 4302 | 16.68% | -0.92
Finnish | 120535 | 29537 | 19.68% | 2.18
Spanish | 306173 | 80719 | 20.86% | 3.46
Chinese | 138903 | 32283 | 18.86% | 1.56
Hindi | 14120 | 4133 | 22.64% | 5.54
Indonesian | 60219 | 13785 | 18.63% | 1.63
Kannada | 2955 | 924 | 23.82% | 7.12
Portuguese | 185018 | 41030 | 18.15% | 1.65
Dutch | 180179 | 38152 | 17.47% | 1.27
Bulgarian | 61227 | 12263 | 16.69% | 0.89
French | 481537 | 106250 | 18.08% | 2.48
Turkish | 68678 | 15812 | 18.71% | 3.11
Polish | 293359 | 56357 | 16.12% | 0.72
Arabic | 378628 | 70299 | 15.66% | 0.56
Bangla | 17660 | 4847 | 21.54% | 6.34
German | 628444 | 119123 | 15.93% | 0.83
Italian | 329316 | 61352 | 15.70% | 1.10
Marathi | 10410 | 2220 | 17.58% | 3.68
Gujarati | 1470 | 289 | 16.43% | 2.53
Armenian | 34109 | 8676 | 20.28% | 6.68
Russian | 369326 | 63749 | 14.72% | 0.92
Ukrainian | 150223 | 28070 | 15.74% | 2.34
Hungarian | 91180 | 15933 | 14.87% | 1.67
Catalan | 129873 | 26986 | 17.20% | 4.20
Tamil | 19105 | 3832 | 16.71% | 4.01
Georgian | 8266 | 1815 | 18.00% | no data for 2016
Latvian | 19175 | 4161 | 17.83% | 3.73
English | 1376012 | 306677 | 18.23% | 1.93
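The change column is a difference in percentage points: the February 2020 share minus the rounded June 2016 share, not a relative growth rate. A minimal sketch of that arithmetic, with hypothetical function names and values copied from the two tables:

```python
def share_of_women(men: int, women: int) -> float:
    """Percentage of counted biographies that are about women."""
    return 100 * women / (men + women)


def change_in_points(share_2020: float, share_2016: float) -> float:
    """Difference between two percentages, in percentage points."""
    return round(share_2020 - share_2016, 2)


# Estonian: 2020 counts from the table above, 2016 share as published (~17.1%).
estonian_2020 = round(share_of_women(38316, 9252), 2)
print(estonian_2020)                          # 19.45
print(change_in_points(estonian_2020, 17.1))  # 2.35

# Nepali: the share fell, so the change is negative.
nepali_2020 = round(share_of_women(2618, 1818), 2)
print(change_in_points(nepali_2020, 48.8))    # -7.82
```

Because the 2016 shares were published to one decimal place, a change recomputed from the raw 2016 counts can differ slightly from the figure in the table.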

See Also