GLAM CSI/Midpoint report

For 2024, the Wikimedia Foundation has funded the GLAM Wiki CSI (Contribution Study Initiative), a project to assess the contribution pipeline in the Wikimedia technical infrastructure for supporting cultural and heritage partnerships and projects. It is being administered by the Smithsonian Institution to analyze not just its own Wikimedia efforts, but those of GLAM partnerships around the world. This page contains the midpoint report of this grant, for July 2024.

GLAM Wiki CSI - Contribution Study Initiative

Announcements and results have been relayed in the GLAM Newsletter.

Achievements

Susanna Ånäs of AvoinGLAM (Finland) discussing GLAM issues at the Wikimedia Hackathon session for GLAM CSI (May 5, 2024).

  • Q1 2024: GLAM CSI engaged WMF and community stakeholders in the survey design, using LimeSurvey on the Wikimedia Foundation platform. LimeSurvey was chosen to address concerns the community has had with commercial products such as Google Forms, which may not protect the privacy of respondents. Outreach was done through multiple channels, including Telegram channels for GLAM and technical contributors (GLAM-Wiki Global, Wikimedia Tool Sustainability, Wikimedia Hackathon, Wikimedia Commons Community, and Wikidata), mailing lists for relevant GLAM and cultural partnerships, and individual outreach to groups such as the European GLAM Wiki Coordinators. Regional outreach was done via WALRUS, ESEAP, WikiInAfrica, CEE, and South American groups such as Wiki Movimento Brasil. User groups such as the Wikimedians in Residence Exchange Network (WREN) and the Wiki World Heritage User Group were instrumental in identifying key partners and survey respondents. Internal Smithsonian Institution constituents were reached via affinity groups: Wikibase, Digital Allies, and Wiki Allies.
  • Q2 2024: Survey results were received and, after an additional push for respondents, the number of responses exceeded 50. These informed the creation of a priority list of user stories spanning a variety of contribution types and geographies. Andrew Lih traveled to the Wikimedia Hackathon with Olga Tichonova of the WMF to conduct user interviews with key GLAM wiki contributors and stakeholders, including those from Brazil, Finland, Sweden, Estonia, France, Germany, Mexico, Nigeria, Singapore, Spain, United Kingdom, and Ukraine. First drafts of the user stories identified so far are available for public review.

Survey results


The initial analysis of the survey results has focused on identifying the key Wikimedia projects and tools that contributors rely on for their work.

Additionally, we homed in on two free-response questions and analyzed the motivations for contribution and barriers to participation. The free-response fields were analyzed using ChatGPT-4o to produce a rough set of aggregated observations, which were then edited and refined.

Projects for contribution


Which projects does your organization contribute to? Select all that apply.

Respondents (n=53) were able to select any number of projects from the list supplied, resulting in 186 responses.

  • On average, respondents specified 3 to 4 different projects, showing the diverse nature of contributions.
  • Wikimedia Commons was mentioned the most, followed by Wikidata.
  • Wikipedia, despite being the most popular project by traffic, ranked third.
  • Wikisource was mentioned by more than a dozen respondents.
 
GLAM CSI survey respondents - Which projects does your organization contribute to? Select all that apply.

Tools


Which tool(s) (or scripts) do you use in contributing to Wikimedia projects?

A reference page of tools can be found at: Wikidata:Linked_open_data_workflow

Respondents (n=53) were able to select any number of tools from the list supplied, resulting in 431 responses.

An average of 8.1 tools were mentioned for each respondent, showing a significant range of capabilities.

  • The most used tools include a mix of officially supported capabilities (Upload Wizard, Wikidata Query Service (WDQS)) as well as community-developed tools (QuickStatements, Cat-a-lot).
  • OpenRefine, despite being mentioned as complex, is used by more than half of the respondents (56.6%).
  • As an external dedicated tool, the Programs & Events Dashboard had a significant response (45.3%).
  • The Wikimedia Commons Query Service was mentioned by roughly a quarter of respondents.
  • A mobile-oriented tool, WikiShootMe, had prominent mention (17%).
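
The per-respondent averages quoted for projects and tools follow directly from the totals reported above. The short Python sketch below reproduces that arithmetic and back-calculates the approximate number of respondents behind each reported percentage. The function names are illustrative, and the implied counts are inferred from the rounded percentages; they are not figures taken from the survey data itself.

```python
# Totals as quoted in this report.
RESPONDENTS = 53
PROJECT_SELECTIONS = 186
TOOL_SELECTIONS = 431


def mean_per_respondent(total, respondents=RESPONDENTS):
    """Average number of selections made by each respondent."""
    return total / respondents


def implied_count(percent, respondents=RESPONDENTS):
    """Nearest whole number of respondents matching a reported percentage.

    This is an inference from a rounded percentage, not a survey figure.
    """
    return round(percent / 100 * respondents)


# Percentages as quoted in the bullet list above.
reported_shares = {
    "OpenRefine": 56.6,
    "Programs & Events Dashboard": 45.3,
    "WikiShootMe": 17.0,
}

print(f"Projects per respondent: {mean_per_respondent(PROJECT_SELECTIONS):.1f}")
print(f"Tools per respondent: {mean_per_respondent(TOOL_SELECTIONS):.1f}")
for tool, share in reported_shares.items():
    print(f"{tool}: about {implied_count(share)} of {RESPONDENTS} respondents")
```

The project average falls between 3 and 4 and the tool average rounds to 8.1, in line with the figures stated in the text.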
 
GLAM CSI survey respondents - Which tool(s) (or scripts) do you use in contributing to Wikimedia projects?

Motivations for contribution


Why does your organization contribute to Wikimedia projects? Please describe your organization’s desired outcomes by contributing to Wikimedia projects.

Key Themes

  1. Outreach and Public Access: Organizations aim to reach a broader audience by leveraging Wikimedia's extensive user base. The Smithsonian Institution and USDA use Wikipedia to extend access to their collections and research, aiming for greater public engagement and visibility.
  2. Content Accessibility and Preservation: Organizations like the BHL-Wiki Working Group and Science History Institute focus on converting legacy data into usable formats and expanding global access to knowledge. Wikimedia projects serve as platforms for preserving and disseminating valuable information.
  3. Educational and Cultural Impact: Entities like the Smithsonian Institution and Auckland Museum use Wikimedia to educate the public and enhance understanding of local and minority histories. They support initiatives like Women in Red to address gender gaps and promote underrepresented narratives that align with their strategic or short-term project goals.
  4. Collaboration and Community Building: Groups such as Wikimedia affiliates in New Zealand and Canada emphasize building supportive relationships with local institutions and fostering community engagement. They provide support to contributors and encourage collaboration across sectors.
  5. Open Science and Data Sharing: Organizations engage in projects related to open citizen science and data sharing. Examples include Open Citizen Science, iNaturalist, and OpenStreetMap projects, as well as various efforts to integrate open data into Wikidata, facilitating research and reuse of information.
  6. Skill Development and Institutional Benefits: Contributions to Wikimedia projects help staff and volunteers gain useful skills and knowledge. For instance, contributions from GLAM institutions can lead to improved data practices and institutional workflows.
  7. Promoting Diversity and Inclusion: Organizations work to promote diversity and inclusion, for example by acknowledging indigenous knowledge in New Zealand or improving the representation of minority figures. These initiatives aim to create a more equitable and comprehensive repository of knowledge.
  8. Longevity of Wikimedia as a Platform: Wikimedia's infrastructure is seen as a valuable tool for ensuring the longevity and findability of content. Organizations like the New York Public Library use Wikidata and Wikimedia Commons to enhance the discoverability and interconnectedness of their collections.

Barriers to participation


What obstacles do you face when contributing to Wikimedia projects with existing tooling?

A. Technical and tooling issues

  • Tool Reliability and Maintenance: Many contributors face challenges with tools that are not well-maintained or supported, such as Pattypan, BaGLAMa, and ISA Tool. These tools can break without notice, affecting the ability to contribute reliably and effectively.
  • Complexity and Documentation: Tools like OpenRefine and Commons Upload Wizard are complex, with inadequate documentation and tutorials. This creates a steep learning curve for new users and can discourage contributions.
  • Metrics and Analytics: Unreliable metrics tools impact the ability to track the impact of contributions. This is critical for justifying funding and demonstrating the value of Wikimedia projects to stakeholders for long-term engagement.
  • Lack of Standardization: Tools often lack uniform interfaces and connectivity, making them difficult to use consistently. There is also a need for better integration and user experience enhancements across different tools.

B. Data and metadata challenges

  • Structured Data and API Issues: Contributors face obstacles in adding structured data during uploads, managing data quality, and handling large datasets. Requests for API improvements and better support for structured data are common.
  • Metadata Management: Mapping fields between GLAM catalog/collections data and Wikimedia Commons categories is challenging. Contributors often need to develop custom solutions to handle metadata and categorization. Templates for Commons uploads can be confusing to work with.

C. Resource and capacity constraints

  • Time and Skills: Many respondents mention a lack of time to explore and learn new tools. There is a desire for simpler tools that do not require advanced technical skills or programming knowledge.
  • Training and Support: The absence of reliable training resources and support mechanisms makes it difficult for organizations to train staff and maintain contributions effectively.
  • Durable and Cooperative Relationships: Suggestions include reviving previous structures such as the GLAM Wiki US Consortium, or learning from how consortia like IIIF operate.

D. Community and cultural barriers

  • Notability and Standards: Understanding and adhering to Wikimedia community standards for notability can be challenging. Wikipedia articles may be deleted if they do not meet these standards, discouraging contributors.
  • Cultural Sensitivity: There is a need for better systems to acknowledge and manage indigenous knowledge and decolonization efforts. This includes respecting the source and consent of indigenous groups.
  • Ethical and Institutional Collaboration: There are gaps in ethical guidance, project registration, and pathways for institutional collaboration. Contributors seek more structured and reliable support systems.

E. Specific tool-related issues

  • Pattypan and OpenRefine: Both of these tools face reliability issues, with users reporting difficulties in using them for uploading data and images. Pattypan's discontinuation has left a gap in easy-to-use bulk upload tools. OpenRefine is an essential tool for many, though there is a steep learning curve.
  • Wikidata and Query Services: Contributors report timeouts and performance issues with query services. The unclear future of the Wikimedia Commons Query Service hinders further tool building for structured data on Commons.
  • Mobile and Usability Issues: Tools like the Commons mobile app are not intuitive for newcomers, limiting their usefulness in training new contributors.

F. Project-specific challenges

  • Integration and Visibility: Finding and using tools that track the use and impact of contributed content across multiple language versions of Wikipedia is difficult. Custom coding is often required for specific tracking needs.
  • Workflow and Process Improvements: The process for bulk uploads and data synchronization is cumbersome, requiring better support and tools to manage contributions efficiently. There are also few tools of this nature that are accessible to non-programmers.

Prototype


The GLAM CSI project originally specified a Wikimedia upload facility prototype based on Smithsonian Institution object data that could feed a system such as DPLA's upload pipeline. While this is still possible, we have redirected efforts into upgrading the existing Wiki API Connector prototype (GitHub repository). A user interface built on the Wiki API Connector could also feed a DPLA pipeline, but given DPLA's current status, it is best to focus on an upload facility that is currently operating.

Future work


The following activities are being planned for the second part of the grant period:

  • Engagement and consultation with the community regarding the results, through online sessions, in-person engagement, and email/chat channels.
  • Further development of user stories related to Wikibase, ISA tool, and Wikidata usage.
  • Conferences
    • Wikimania – Accepted session for Wikimania related to GLAM CSI
    • WikiConference North America – Accepted session for the conference to discuss GLAM CSI results and follow-up actions
    • Museum Computer Network – Session submitted for this museum technology conference related to Smithsonian wiki work and GLAM CSI
  • Further prototype development, including a user interface for the Wiki API Connector