Talk:Learning patterns/Digitizing archival records

Some suggestions

Hi Harej, I think it might be worthwhile to rework some of the recommendations.

  • The quality of the scan depends a lot on the type of document you're scanning and on the type of device you're scanning with. For simple, plain text, 300 DPI is more than enough for the needs you have inside Wikipedia (such as getting an acceptable OCR). Anything higher is not necessary and might make the whole process heavier and slower without creating any extra value. 600 DPI might be reasonable when you're scanning a rare book with illustrations, a document that carries a lot of information, a poster, or another type of document, but definitely not for text. It would also be interesting to create some sort of table on how to calculate DPI when you are not using a flatbed scanner and are using other types of devices, such as the info that is here on how to calculate DPI when using a different technology. In any case, it might be a good idea to establish "minimum quality settings" so that there are different options for different users, based on the resources they have at that moment.
  • This beginning might be a little bit intimidating for projects that aren't as big as the English Wikipedia: "For instance, a collection of over 120,000 records from the U.S. National Archives and Records Administration was uploaded to Wikimedia Commons, where they received over 1 billion views in 2013." I completely understand the point you're trying to make, but smaller institutions might feel that their project is not worth including here because it is not as big as that (which is at odds with the example given: the Allende Archives, a collection of 40 documents, a small collection but a big one at the same time). It might be a good idea to keep that example but also include examples of smaller institutions / smaller collections / smaller countries / smaller groups of people!
  • Ensure acceptable copyright status: it might be a good idea to use some of the existing resources to help determine this in a better way. It might even be a good idea to split it off into a separate learning pattern, or to redirect to existing Wikipedia pages that have at least minimal information on how to get a rights clearance. Just a suggestion!
  • Upload to Commons... and then categorize it right and link it ;) . This is a necessary step because even when the user is recording document-level metadata correctly, if the file is not linked properly it's difficult for people who are not used to Commons to find the document. Again, just a suggestion.

I'm not editing this directly because I think some of these recommendations might be worth discussing a little bit further, with a view to extending this pattern (and translating and internationalizing it). Astinson_(WMF) suggested I start this conversation with you to see if we can work together to expand this pattern. --Scann (talk) 21:36, 29 September 2016 (UTC)
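
On the DPI point in the first suggestion above: for a camera or phone capture, the effective resolution can be estimated by dividing the pixel count along one edge of the image by the physical length of the document edge that fills it. The following is a minimal sketch of that arithmetic, not an agreed standard. The function names are hypothetical, and the 300 DPI default is simply the plain-text figure mentioned in the comment above. It assumes the document fills the frame along the measured edge.

```python
CM_PER_INCH = 2.54


def effective_dpi(pixels: int, inches: float) -> float:
    """Estimate DPI as pixels along an edge divided by that edge's length in inches.

    Assumes the document edge fills the image edge being measured.
    """
    if pixels <= 0 or inches <= 0:
        raise ValueError("pixels and inches must be positive")
    return pixels / inches


def effective_dpi_from_cm(pixels: int, centimetres: float) -> float:
    """Same estimate, for a document edge measured in centimetres."""
    return effective_dpi(pixels, centimetres / CM_PER_INCH)


def meets_minimum(pixels: int, inches: float, minimum_dpi: float = 300.0) -> bool:
    """Check a capture against a minimum DPI (300 is the plain-text figure
    suggested in this discussion, used here only as an illustrative default)."""
    return effective_dpi(pixels, inches) >= minimum_dpi


if __name__ == "__main__":
    # Hypothetical capture: 4000 pixels across a page edge 10 inches long.
    print(effective_dpi(4000, 10.0))
    print(meets_minimum(4000, 10.0))
```

A table of "minimum quality settings" per device could then be built by running this over the typical image sizes of the devices people actually have available.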

Scann, you are more than welcome to make modifications to the learning pattern. I wrote this not as an archivist (which I'm not) but as someone who has organized Wikipedia editing events. Any actual archival expertise is certainly welcome. harej (talk) 00:09, 4 October 2016 (UTC)

Proposal around "documenting digitization practices in Wikimedia communities"

Hi Harej, I wanted to reach out again to ask for your feedback on this proposal (and, if you consider it appropriate, for your endorsement). Any comments will be more than welcome! --Scann (talk) 19:29, 21 November 2018 (UTC)
