OpenRefine
OpenRefine is a free data wrangling tool that can be used to process, manipulate and clean tabular (spreadsheet) data and connect it with knowledge bases ("spreadsheets on steroids" / "a Swiss Army knife for data"). It is widely used by librarians, in the cultural sector, by journalists and scientists, and is taught in many curricula and workshops around the world.
OpenRefine has been a popular tool for Wikidata batch editing since 2018. Thanks to a Wikimedia grant (2021-22), it also supports batch editing and uploading on Wikimedia Commons. In addition, OpenRefine can be used to batch import and edit data items and media files in Wikibases.
OpenRefine is a community-supported open source project, licensed under the BSD license. It has a graphical user interface in more than 15 languages.
- General links
  - OpenRefine's website: https://www.openrefine.org/
  - Download OpenRefine: https://openrefine.org/download.html
  - General OpenRefine documentation: https://docs.openrefine.org/
- Talk about OpenRefine with its community and with Wikimedia users
  - OpenRefine's user mailing list: https://groups.google.com/g/openrefine
  - Telegram group for Wikimedians who use OpenRefine: https://t.me/+Qc23Jlay6f4wOGQ0
- Bug reports and feature requests
  - On GitHub (for OpenRefine in general): https://github.com/OpenRefine/OpenRefine/issues
  - On Wikimedia Phabricator (mainly for Wikimedia Commons reconciliation): https://phabricator.wikimedia.org/tag/openrefine/
OpenRefine for Wikimedians
Cloud version of OpenRefine (on PAWS) for Wikimedians
Is it difficult for you to run OpenRefine on your own computer?
Run OpenRefine in PAWS on Wikimedia's Cloud Services (you need a Wikimedia account and an internet connection): https://hub.paws.wmcloud.org/
OpenRefine for Wikidata
- OpenRefine for Wikidata editing: info, tutorials... https://www.wikidata.org/wiki/Wikidata:Tools/OpenRefine
OpenRefine for Wikimedia Commons
- OpenRefine for Wikimedia Commons editing: info, how-tos... https://commons.wikimedia.org/wiki/Commons:OpenRefine
- How-to: add structured data to files on Commons with OpenRefine 3.6+ https://commons.wikimedia.org/wiki/Commons:OpenRefine/Adding_structured_data_with_OpenRefine or as a video: https://www.youtube.com/watch?v=kv8bDtO4cq8
- How-to: upload files to Commons with OpenRefine 3.7+ (under construction / experimental) https://docs.google.com/document/d/19eiMeq3XssiPrT9b04E-8XyE-desBEzYNgygLDYKP4o/edit# or as a video: https://www.youtube.com/watch?v=sc6aNNmsNCI
Related tools
OpenRefine edits can be undone with the EditGroups tool.
- EditGroups on Wikidata: https://editgroups.toolforge.org/
- EditGroups on Wikimedia Commons: https://editgroups-commons.toolforge.org/ (note: it cannot undo uploads, only edits to existing files)
OpenRefine for Wikibase
(links to documentation here)
Reconciliation
- Wikidata's reconciliation service: https://wikidata.reconci.link/
- Wikimedia Commons' reconciliation service: https://commonsreconcile.toolforge.org/
- Setting up reconciliation services for Wikibases:
- Various services via the Reconciliation Testbench: https://reconciliation-api.github.io/testbench/
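For readers curious about what OpenRefine sends to services like the ones listed above, here is a minimal Python sketch of a batch reconciliation request. It assumes the service follows the common reconciliation API convention of a form-encoded `queries` parameter holding a JSON object, and that Wikidata's endpoint lives at `https://wikidata.reconci.link/en/api`; both are assumptions, not something this page states. The helper names (`build_queries`, `encode_form_body`, `best_matches`, `reconcile`) are hypothetical and chosen only for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: English-language endpoint of the Wikidata reconciliation service.
WIKIDATA_ENDPOINT = "https://wikidata.reconci.link/en/api"


def build_queries(names, type_id=None, limit=3):
    """Build a batch of reconciliation queries keyed q0, q1, ..."""
    queries = {}
    for i, name in enumerate(names):
        query = {"query": name, "limit": limit}
        if type_id is not None:
            # Restrict candidates to one type, e.g. "Q5" (human) on Wikidata.
            query["type"] = type_id
        queries[f"q{i}"] = query
    return queries


def encode_form_body(queries):
    """Encode the batch as a form body with a single 'queries' field."""
    return urlencode({"queries": json.dumps(queries)}).encode("utf-8")


def best_matches(response, min_score=0.0):
    """Pick the highest-scoring candidate per query from a decoded response."""
    best = {}
    for key, payload in response.items():
        candidates = [
            c for c in payload.get("result", [])
            if c.get("score", 0) >= min_score
        ]
        if candidates:
            best[key] = max(candidates, key=lambda c: c.get("score", 0))
        else:
            best[key] = None
    return best


def reconcile(names, endpoint=WIKIDATA_ENDPOINT, type_id=None):
    """Send the batch with HTTP POST and return the best match per query.

    Needs network access; the response shape is assumed, not guaranteed.
    """
    body = encode_form_body(build_queries(names, type_id=type_id))
    with urlopen(Request(endpoint, data=body)) as resp:
        return best_matches(json.load(resp))
```

Calling `reconcile(["Douglas Adams"], type_id="Q5")` would query the live service, so results depend on the service's current behaviour. Inside OpenRefine itself none of this is needed: you add the service URL in the reconciliation dialog and the tool handles the requests.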
TIPS!
- Also search Google / the web for examples, GREL syntax and recipes, etc. whenever you want to do something less straightforward! There are A LOT of tutorials and help forums out there.