Community Wishlist Survey 2019/Categories/Improvements of Categories in Chinese Wikipedia

Improvements of Categories in Chinese Wikipedia

  • Problem: Categories are sorted by Unicode code point. This makes it difficult for readers to locate a page in a large category, because there are tens of thousands of Chinese characters to sort through rather than 26 letters. When people want to find a word, they use pinyin (romanized Chinese), zhuyin, radicals, or stroke counts.
  • Who would benefit: Users of the Chinese Wikipedias (zhwp, wuuwp, etc.) and of wikis in other logographic languages.
  • Proposed solution: I would like category pages to offer several sorting options (default, pinyin, zhuyin, radicals, strokes, etc.) for users to choose from.
  • More comments: Japanese Wikipedia users may also benefit from this proposal since kanji (characters) can be sorted by their pronunciation.
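
To illustrate the problem and the proposed options, here is a minimal Python sketch. It is not MediaWiki code: the `SORT_DATA` table, the `sort_titles` function, and the method names are hypothetical and cover only six characters. A real implementation would rely on a full dictionary or on Unicode collation data.

```python
# Hypothetical lookup table for illustration only (toneless pinyin and
# stroke counts for six common characters).
SORT_DATA = {
    "一": {"pinyin": "yi", "strokes": 1},
    "人": {"pinyin": "ren", "strokes": 2},
    "八": {"pinyin": "ba", "strokes": 2},
    "大": {"pinyin": "da", "strokes": 3},
    "山": {"pinyin": "shan", "strokes": 3},
    "中": {"pinyin": "zhong", "strokes": 4},
}


def sort_titles(titles, method="default"):
    """Sort page titles by the chosen method (keyed on the first character)."""
    if method == "default":
        # Python compares strings by code point, like the current behaviour
        # described in the proposal.
        return sorted(titles)

    def key(title):
        entry = SORT_DATA.get(title[0])
        if entry is None:
            # Characters missing from the table sort last, by code point.
            return (1, title)
        # Ties (same pinyin or stroke count) fall back to code point order.
        return (0, entry[method], title)

    return sorted(titles, key=key)


titles = ["山", "一", "中", "八", "人", "大"]
for method in ("default", "pinyin", "strokes"):
    print(method, sort_titles(titles, method))
```

The same six titles come out in three different orders, which is why a single fixed order cannot suit every reader.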


  • Well, I somewhat support sorting by pronunciation, because some Chinese characters are simplified in Mainland China, Malaysia and Singapore but traditional in Taiwan, Hong Kong and Macau. In my opinion, we should sort by Hanyu Pinyin and Zhuyin (Bopomofo) on Mandarin Wikipedia (w:zh:) and by Jyutping on Cantonese Wikipedia (w:zh-yue:). We can discuss other varieties later. For Japanese Wikipedia, there seems to be only one sorting method: Gojūon. --Super Wang on zhwiki (Share your opinions) 05:48, 1 November 2018 (UTC)
  • On the one hand, sorting by words is not what everybody wants, and the classification is simply confusing. --夢蝶葬花@生涯不敗 06:04, 1 November 2018 (UTC)
  • Is this T46667? Anomie (talk) 12:16, 1 November 2018 (UTC)
  • It would probably need to support multiple different sorting methods, as some of the most popular methods of sorting Chinese characters like pinyin and zhuyin are limited to particular regions and are almost useless to me. C933103 (talk) 17:57, 1 November 2018 (UTC)
    • It can be something like the language variant converter we had on zhwiki. --Cohaf (talk) 18:11, 1 November 2018 (UTC)
    • Zhwp is written in Standard Chinese. The converter can convert text from one standard to another (zh-hans, zh-hant, etc.), so the pronunciation should follow a standard too, though there may be several standards. We can refer to a dictionary for each specific standard, such as the Xinhua Zidian (Xinhua dictionary) for pinyin. Cantonese romanization may be a better fit for zh-yue.wp. --Leiem (talk) 01:55, 2 November 2018 (UTC)
      • I am not familiar with any pronunciation-based sorting, including sorting based on Cantonese pronunciation, even though Cantonese is my native language. I would prefer a stroke-based or radical-based sorting method. C933103 (talk) 17:48, 2 November 2018 (UTC)
        • I made an illustration. Please see the image on the right. --Leiem (talk) 05:22, 3 November 2018 (UTC)
          • It looks good, although there is probably little need to merge pinyin and zhuyin as in this image. Instead, people could pick a/b/c/d or b/p/m/f order as they like, irrespective of their geographic origin. If geographical variants are considered, however, the categorization system would need to cater for characters that are read differently in Taiwan Mandarin and mainland China Mandarin. C933103 (talk) 11:19, 4 November 2018 (UTC)
  • see phab:T46667 --Shizhao (talk) 08:36, 5 November 2018 (UTC)
  • Note this is similar to Multiple collations per site, but not the same. That one is about specifying the collation per category, while this is about all categories on a site having multiple collations. Anomie (talk) 16:14, 11 November 2018 (UTC)