Community Wishlist Survey 2023/Larger suggestions/Create a large language model that aligns with the Wikimedia movement

Create a large language model that aligns with the Wikimedia movement

  • Problem: ChatGPT and other large language models (popularly known as AI chatbots) are being developed and could either disrupt or benefit Wikimedia projects in unexpected ways.
  • Proposed solution: Create our own open-source large language models that serve Wikimedia's mission.
  • Who would benefit: Mainly editors, by making it easier to fight vandalism and write content.
  • More comments: A related proposal is Create Wikipedia article stub from Wikidata using ChatGPT. Some of the potential uses for such a model include:
    • Brainstorming, identifying possible missing information
    • Detecting more subtle context-dependent vandalism
    • Generating SQL queries against Wikipedia's database without the user needing to know SQL
    • Making templates without needing to know complex wikitext and Lua (probably the easiest to do)
    • Copyediting, identifying prose issues
    • Recommending known sources and further reading on a topic (this can be done right now with ChatGPT, but the recommendations would be much more effective if the AI were trained on article references)
    • Quickly making stubs on a large scale (Abstract Wikipedia wink wink)
    • and more...
  • Phabricator tickets:
  • Proposer: CactiStaccingCrane (talk) 11:32, 2 February 2023 (UTC)

Discussion

We can either use existing language models (e.g. GPT) or develop our own model based on an existing one. But creating a whole new one sounds very complex and difficult. Thanks. SCP-2000 07:50, 3 February 2023 (UTC)
Have you seen phab:T328494? --Strainu (talk) 20:17, 10 February 2023 (UTC)

Voting