GLAM CSI/Wikimania 2024
Global GLAM Meetup
At the Global GLAM Meetup on August 12, 2024, more than 30 GLAM professionals and community members gathered to discuss working with cultural and heritage content on Wikimedia projects. The afternoon session was led by Andrew Lih, Jamie Flood, Katarzyna Makowska, and Angie Cervellera at the Silesian Museum in Katowice, Poland.
The afternoon consisted of two activities that engaged participants in generating feedback on the current state of Wikimedia contribution, and brainstorming possible ways of overcoming obstacles.
Presentation of GLAM CSI results
The afternoon started with Andrew Lih giving a quick look at the preliminary results of the GLAM CSI survey, which was conducted in the first half of 2024. (Google Slides link)
Some key takeaways that were presented included:
- For survey respondents, Wikimedia Commons and Wikidata were the most engaged-with Wikimedia projects, with Wikipedia being third.
- When asked "Which tool(s) (or scripts) do you use in contributing to Wikimedia projects?", respondents named a mix of officially supported tools and community-built tools (e.g. those by Magnus Manske). The top-ranked tools included:
- Wikidata Query Service
- QuickStatements
- Commons Upload Wizard
- OpenRefine
- Program and Events Dashboard
- Cat-a-lot gadget
Feedback on linked open data workflow tools
We introduced the Linked Open Data workflow (Wikidata:Linked open data workflow) document to the attendees, which was new to many folks. After quickly explaining the various phases of this classic "Extract-Transform-Load" type workflow, we challenged the participants to respond to the chart with experiences and recommendations.
Participants were asked to individually provide feedback on each of the six phases of the workflow using sticky notes. People were given 20-30 minutes to walk among the posters in the room and were encouraged to also read the sticky notes left by others. They were asked to leave notes in three areas for each column:
- Edits or additions to the list of tools
- Positive experiences
- Challenges
An extra poster to hold any "Other workflow ideas" was also provided.
Some rough overall observations could be made right away:
- Ingestion (3) had the most varied and numerous responses, including many additions of specialized tools for ingestion, such as those related to iNaturalist, BHL, or SourceMD.
- There seems to be good capacity for creating specialized or custom ingestion (3) tools, whether scripts or custom code.
- A number of feedback notes pertained to using tools such as VisualFileChange.js, QuickStatements, or Cat-a-lot to revise, adjust, or clean up data after an initial load of content. This reflects comments that the ingestion/upload tools may not always do everything the user wants, so multiple tools are needed to add all the relevant metadata. In practice this is more of an iterative ELT (extract-load-transform) process than a traditional ETL process.
- The most feedback was on Reporting (6), with many concerns about the reliability and capability of tools for measuring the impact of contributions, both within the Wikimedia ecosystem and especially externally.
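The ETL versus ELT distinction noted above can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the record fields, the `extract` and `transform` helpers, and the in-memory list that stands in for Commons or Wikidata. No real upload tool or API is used.

```python
# Hypothetical records and helpers, for illustration only.
def extract():
    """Pretend to read two records from a collection export."""
    return [
        {"title": " Vase ", "creator": "unknown"},
        {"title": "Map of Katowice", "creator": " A. Cartographer "},
    ]


def transform(record):
    """Normalize whitespace in every text field."""
    return {key: value.strip() for key, value in record.items()}


def etl(store):
    """ETL: clean every record first, then load it once."""
    for record in extract():
        store.append(transform(record))


def elt(store):
    """ELT: load the raw records first, then fix them in place afterwards."""
    store.extend(extract())
    for index, record in enumerate(store):
        store[index] = transform(record)


etl_store, elt_store = [], []
etl(etl_store)
elt(elt_store)
```

Both paths end with the same cleaned records. The difference is that in the ELT path the uncleaned records sit in the target store for a while, which is the stage where participants reported reaching for cleanup tools after a bulk load.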
Detailed notes
Transcribed from the sticky notes in the images above.
Feedback type | PREPARE | RECONCILE | INGEST | ANALYZE | RE-USE | REPORT |
---|---|---|---|---|---|---|
Edits/additions | video2commons
Maybe add GenAI for ChatGPT, like for SPARQL queries Opportunity to use AI to make process easier |
Mix'n'Match, I use this to add GLAM identifiers to items all the time
Graph Builder but for properties Opportunity to invest in/use AI to make this process easier Mix'n'match gadget - Magnus mix and match gadget which adds ALL the possible matches onto one of them |
iNat2commons
iNat2wiki SourceMD/Source Metadata BHL2Wikidata Commons mobile app for Android Petscan, ACDC, Cat-a-lot good tools to update institutional upload |
Scholia
Amazing visualizations - also of WikiProjects author_strings (Magnus script) to change author name string to author while looking at the publication item |
WLM and WLE tools | PetScan - Can tell you which of your images have FA Awards, featured images.
Please make reporting super simple so a non-Wikimedian can casually check the stats Reporting impact, for Wikiprojects, not just GLAMs |
Positives | OpenRefine provides easy experience of mass editing/normalization of data (using Python?)
OpenRefine is a very appreciated data tool I LOVE OpenRefine - Works very well with librarians, not so well with museums, WHY? Flickr2commons flickypedia flickr is a platform GLAMs know and license change is relatively simple! I found this useful with small scale project OPENREFINE is introduced at my organization as a wiki-tool but became a very appreciated data-tool It's great for very data literate people... (a minority) |
Petscan helped to find new articles in important themes
I Love OpenRefine, it's not just for Wikimedia projects. Reconcile and interact with other websites, seen as pro-level Positive OpenRefine The entire NZ Thesis project runs off OpenRefine it's amazing! More identifiers gadget - empowers missing identifiers to be added to items via adding just the VIAF ID Another disambiguator - amazing tool to ensure publication Wikidata items are linked to their author Wikidata items (yes!) |
WikiShootMe
Great for motivating newbies to add images Reveals incorrect Wikidata coordinates Pattypan Very customizable once I had experimented with it Pattypan Great because it offers a lot of guidance Pattypan Still the only bulk-upload tool that's easy to use (compared to OpenRefine) OpenRefine also great because of excellent support through the forum OpenRefine We're starting to use it in GLAM Wiki collabs OpenRefine For uploading in general |
Visual File Change
Saved my life by fixing an error after a bulk upload Visual File Change for fixing my inevitable typos in Commons bulk upload Reasonator Impresses management! Looks much nicer than Wikidata.org Cat-a-lot Very useful and <unreadable> time Wikidata Graph Tool Useful visualization for non Wikidata people (Angryloki) VisualFileChange.js So easy to use and clean things up Integraality is amazing. How else can I see how my Wikidata project contributed over time? Integraality Helped start Wikiproject Manuscripts MediaWiki API To track quality over time of target articles Listeria great for building work lists. I use it all the time to draw together my Wikidata work and find notable people for WP ISA <unreadable> early 2024 with the help of new and developing with Wiki<unreadable> Africa |
Template:Art Photo
Multilingual Metadata on Commons Autopopulating Commons templates using SDC/Wikidata is great! Specialized templates in general Love infobox templates and all templates, really Wikidata templates in en.wp infoboxes - they don't always get deleted now! :) I LOVE Wikidata infobox. Especially for visibility in non-English languages |
Image views are a very good way of selling WM engagement to a GLAM
Scholia Tool - This is amazing for disambiguation also for generating visualization and engagement Metrics to illustrate reach and impact are essential to getting GLAM leaders to support the work Every use in a wiki article is a win! We want to count them! |
Challenges | Format conversion
Opportunity to use AI to make the process easier Getting rights cleared OpenRefine is daunting for newbies and many GLAM staff OpenRefine would be more usable if web based |
Please create a better tutorial for mix'n'match, technically as a manual instruction
Petscan - documentation?? If I get what I need out it's a miracle! Challenges/reconcile - Petscan is great but it is very hard for me to learn to use it OpenRefine - Sometimes reconciliation doesn't work and we have no idea why (because it's an external service) |
WikiShootMe
needs upgrades, like choosing "instance of" item created. Could be useful for GLAMS! We've used it with popular libraries Wikidata and Commons data models are complex it would be good to have tools with easy to use "wizards" WikiShootMe +options +connect with events? QuickStatements Adding refs or qualifiers to existing items is hard and unpredictable Pattypan Persistent problems with post settings not being cleared when new batches started Commons Upload Wizard Modern version too slow and need long time to upload. <unreadable> is better I wish there was an easy tool to pull images from our museum's API/collections online into Commons. Or any museum/collection QuickStatements how can we have a tool used for 50% of Wikidata edits break so often?! I enrich NZThesis metadata on Wikidata by connecting authors to publications – I use Magnus's ORCIDator + SourceMD tools but they often break :( (Tamsin) BHL to Wikidata Is limited to 3-4 pubs at a time if they cite a lot of papers. But is a good alternative to SourceMD to get papers in. It's a Magnus tool so it breaks often ISA Tool is proven difficult to use for new-comers because they need an account and know what SD is SDC not complete in Pywikibot Lots of SDC models are incomplete/missing community consensus SDC is not added at upload (but as a subsequent edit) |
Distributed game limitations
Obtaining a bot flag is complicated (Pywikibot) |
Graph Commons
I don't know how it works Wikipedia editors unfriendly to Wikidata Translating labels (WD) and captions (Com) Not easy to find things Would like to know more about Wikidata Graph Builder Commons - More clearer license for GLAMs, especially uploading process which is not planned for them (options are confusing) Wikidata is hard to teach! GLAMs are often interested in data and enrichments added to their data and files, but not easy to retrieve this. |
How to know if a tool works or not?
I want to be told when one of my images is on the front page of Wikipedia Measuring how integrated an uploaded data set is with rest of the platform (images or data) Summarize how a data set has been enriched WikiEdu dashboard - still has lots of problems reports everything not just event related There are so many statistics tools! They do slightly different things and it's impossible to remember their names. They're also hard to learn for GLAM partners (Who really need them) Need for normal simple to use tool for cultural institutions to evaluate their event impact Anecdotes and examples are also needed for leadership reports, to illustrate a point made in quantitative metrics Linked Data Impact - needs to be defined before it can be measured GLAMorous gives inconsistent results Need for some tools to evaluate extra-wiki use of images on Commons GLAM wiki dashboard - GLAMS have to wait very long time to be added Report on impact of re-use outside Wikimedia projects GLAMorgan doesn't notice when an image was added? Assumes it was always in the article? WMF itself must also be aware of the impact of Wikidata and Commons outside Wikipedia Impact and reliability Stats for the community engagement with the new data is missing Extracting/reporting on data enrichment beyond the initial platform is hard How can the new Commons impact metrics dataset and APIs be integrated into existing dashboards and tools so they are more stable? Improve dashboards - create subject specified dashboard |
Other workflow comments
Transcription of the final poster with sticky notes.
- Rights clearance triage helper
- I've done two professional residencies and have never seen the LOD workflow before! so frustrating and doubles the need for clearer routes to GLAM wiki
- Program and Events Dashboard - I would like to learn how to use it more
- Extract enrichment to allow for roundtripping - is roundtripping beyond re-use and reporting?
- Multilingual – Wikimedia is very good but we have the potential to be great
- What about adding any other resources for communication to this page? e.g. telegram groups, meetups, etc?
- I didn't know about the wikidata LOD workflow page and I don't know a lot of these tools so THANK YOU for sharing this info!!!
- I'd love to see the Commons app used more - lots of potential as it's easy and app format familiar
- What about adding a section for online training/resources for these tools?
- An index of tools and purposes and platforms would be useful, including documentation
- Some tools are very common or part of a platform; others are more hidden (I often only learned of them in real life meetings) Live meetings are the best way to get to know new tools, then good online documentation to use them is necessary
- Wikibase cloud and specialist wikis
- High/steep learning curve for many tools
- A way to see types of licenses at a glance on a category page (we're actively looking for this! details even)
Ideating solutions to obstacles to contribution
Overview: This late afternoon session focused on ways to solve challenges that were identified in the GLAM CSI survey's "Barriers to Participation." Six major areas were highlighted and explained to the attendees. Six posters were spread around the room with the prompts, from 1 to 6.
Participants were asked to choose one of the six areas, and to self-organize into groups and to ideate solutions around the question: "What obstacles do you face when contributing to Wikimedia projects with existing tooling?" The six areas from the GLAM CSI survey results that were presented:
1. Technical and tooling issues | 2. Data modeling and metadata challenges | 3. Resource and capacity constraints | 4. Community issues and cultural barriers | 5. Content integration and re-use | 6. Sustainable future |
---|---|---|---|---|---|
Tool reliability; Lack of good documentation; Metrics tools; Lack of standardization of tools and connections | Structured Data uncertainty; API usability; Metadata management and mapping of data sets | Time and skill needed to participate; Training and support; Lack of documentation and learning resources | Notability and standards and community norms; Clash with community practices, undoing edits, nominations for deletion; Cultural sensitivity and ethical guidance | Use of content across Wikimedia; Workflow process improvements; "Round trip" integration | Support for future tool development; Road map unclear, hard to plan for GLAM wiki partnerships; Relationship of GLAM wiki communities with Wikimedia Foundation |
Six posters with these prompts were set up around the room. People self-selected into one of the poster groups to work on answering the question:
How might we address and improve ... (an area listed above)
The IDEO brainstorming method was employed: people wrote their ideas on a sticky note, raised their hand with it, and announced it to the group. A local facilitator placed it on the paper chart. Groups were encouraged to go for quantity, and to generate as many ideas as possible in 15 minutes. Afterwards, the groups were asked to cluster the raw ideas into thematic areas, and to combine overlapping ideas if appropriate. Then people were asked to add a dot to up to FIVE ideas in order to see where the interest levels were.
We then went around the room to share what each group found.
After that, each group was asked to select at least three major areas that stood out and add them to a priority matrix: high or low priority, versus hard or easy to do.
Solutions priority matrix
This is a summary of the stickies in the diagram above.
Outcomes
Hardest | Harder | Easier | Easiest |
---|---|---|---|
GLAM council or consortium (6); Language help: more tools for non-Roman alphabet (4); An easy to handle surface for Wikidata queries (1); Automated API refresh of GLAM-submitted data on Wikidata (2); More technical support for Wikibase integrations (2) | Prioritize those tools that are easiest to use (3); Impactful improvements to the platforms to make tooling easier (1); Consistent branding (3); Charge GLAMs for tech services (6); Wiki community training in local context (4), truly multilingual (5); Better onboarding, learning, training for partners to help them get started mapping data (2); Layering/weighting data "official" vs contributed (5) | Start from project -> search tool (3); Institutional "buddying up" for better understanding (4); Training folks in tech documentation (6); International GLAM events (3, 4); Recognize low hanging fruit (tools and projects) (3); Support tool developers when there are major changes (1); Step by step recipes for institutions with entry level tasks, 'cookbook' (3); case studies (3) | Examples (case studies) of impactful reuse (5); Ways to encourage reuse (5); GLAM Wish List (6); Systematic tools assessment system by quality (status) and importance (usefulness) (1); Understand the future of Structured Data on Commons in the movement (2) |
Legend: Languages and i18n • Training and documentation • Tools and development
Ideas in the upper right-hand (EASY) quadrant are ideal for implementation, as they are easy, high-priority tasks:
- Examples (case studies) of impactful reuse (5)
- Ways to encourage reuse (5)
- GLAM Wish List (6)
- Systematic tools assessment system by quality (status) and importance (usefulness) (1)
- Training in tech documentation (6)
- Understand the future of Structured Data on Commons in the movement (2)
In the middle-right area:
- Start from project -> search tool (3)
- Institutional "buddying up" for better understanding (4)
- Training folks in tech documentation (including reporting bugs, submitting improvement requests, <unknown>, <unknown>, <unknown>, etc) (6)
- International GLAM events (3, 4)
- Recognize low hanging fruit (tools and projects) (3)
- Support tool developers when there are major changes (1)
- Step by step recipes for institutions with entry level tasks, 'cookbook' (3)
In the middle area:
- Build tech capacity, not tools (6)
- Case studies - What tool is most relevant to reach my goal (3)
- Truly multi-lingual (5)
In the middle-left area:
- Prioritize those tools that are easiest to use (3)
- Impactful improvements to the platforms to make tooling easier (1)
- Consistent branding (3)
- Wider definition of GLAM and image flagging for use (5)
- Charge GLAMs for tech services (6)
- Wiki community training in local context (4)
- Better onboarding, learning, training for partners to help them get started mapping data (2)
- Layering/weighting data "official" vs contributed (5)
In the left (HARD) area:
- GLAM council or consortium (6)
- Language help: more tools for non-Roman alphabet (4)
- An easy to handle surface for Wikidata queries (1)
- Automated API refresh of GLAM-submitted data on Wikidata (2)
- More technical support for Wikibase integrations (2)
Main session - Workshop on Documenting Wiki User Stories
The main Wikimania session had roughly 20-25 attendees, and challenged audience members to write their own user stories.
Story templates were provided in Google Docs format for people to customize. Stories started in the session include:
- Kamila Neuman - User story for starting GLAM-Wiki Collaboration
- Heidi Meudt - User story for writing one plant species Wikipedia article per week this year
- Siobhan Leachman - User story of researching women scientific illustrators
- Jamie Flood - User story for Wikimedian in Residence at the U.S. National Agricultural Library
- User story for creating a GLAM resources landing page
- Sabine von Mering - User story to create an index of collection agents affiliated with a natural history museum [needs some more work]
Session: https://wikimania.eventyay.com/2024/talk/UBRGFS/