Grants:Programs/Wikimedia Community Fund/Acquisition of missing pages and books of Nationalised books, Wikisource workshops and a GLAM activity in TamilNadu/Report


Sources

Goals

This project set out to find the missing pages and not-yet-uploaded books among the Nationalised books of Tamil Nadu in libraries, to carry out a few GLAM activities in Chennai metropolitan city, and to run workshops in six cities of Tamil Nadu. The Tamil Nadu government spent more than ₹12 crores (150,399.31 US Dollars) to release the books of 165 authors into the public domain; these are called the Nationalised books. In 2016, as a Wikimedian in Residence, I uploaded 2217 of those Nationalised books to Commons. After five years of contribution at Tamil Wikisource, we found that in 400 books some pages are missing, some pages need to be rescanned, and a few books need to be rescanned in full. We concluded that we should search for them in the city libraries, and so this project addresses these multiple issues with eminent Tamil books for Tamil Wikisource. The project period is six months. Within that period, I am happy to have completed the project targets and to have reduced the gender gap compared with before.

Outcome

Please report on your original project targets.


Target outcome: As of February 2022, our community members had found over the past five years that 400 books need to be repaired.
Achieved outcome: In 383 books the exact issues were identified, and 183 books were repaired after the verification of 97,330 pages.
Explanation: Half of the target was achieved. The remaining books are in process. The process is time-consuming because finding the exact edition of a book is very difficult. In the end I realised that a single contributor cannot do it, and I trained a few volunteers. Through their search for the exact page or pages in various Tamil Nadu city libraries, we found many of the books that were needed. Every alternate weekend I trained a few female contributors, along the lines of a Train the Trainer program. (See the local project page for the female contributors.) After the discussion, we raised two Wikimedia Phabricator tickets, T315171 and T316880.

Target outcome: 100 Nationalised books are to be uploaded.
Achieved outcome: 251 books were uploaded to Commons after the verification of 61,724 pages.
Explanation: The target was achieved as expected, and all the PDF pages were verified against the corresponding paper books at Anna Centenary Library, Connemara Public Library, RMRL and a few other city libraries of Tamil Nadu (Salem and Dindigul). Many times I found a later edition of a book under repair whose page structure differed. So I borrowed the newer editions from a lending library and from friends, and scanned them in my room. So far I have scanned 24 books. Thanks to the partnerships with Tamil Virtual Academy and utsc_tamil,utoronto (library), I uploaded 227 books after verification at the libraries. (See the local project page.) Hence 227 + 24 = 251 books were uploaded. Balance work: the processing and scanning of a dictionary from 1909 AD is ongoing. It will consist of 1200 pages, and its pages need much care. 800 pages are completed and 400 pages are in process.

Target outcome: 30 screencasts for newcomers and intermediate contributors to Tamil Wikisource, covering easy Linux usage and PDF manipulation.
Achieved outcome: 27 screencasts were created in the Tamil language. See c:Category:Rapid Fund SAARC 2022 Tamil Wikisource screencasts.
Explanation: Many basic screencasts had already been created. (You can find them here.)

Target outcome: Six Wikisource workshops in Tamil Nadu cities.
Achieved outcome: Only five workshops were held, in women's colleges in four Tamil Nadu cities: two in Chennai and one each in Ulundurpettai, Gobichettipalayam and Coimbatore.
Explanation: Two women's colleges have agreed to hold the workshop in the upcoming months. After two years of Corona lockdown, all the college administrations are focusing on their academic syllabus. Even so, OCR spelling errors were cleaned up in 435 book pages. Conducting a workshop at a women's college is difficult; it takes patience and several approaches, and we have to answer many questions about the Wikisource workshop. (See the media files and workshop details at the local project page.)

Target outcome: Creation of six audio books.
Achieved outcome: 31 audio books were created (215 minutes, about 3.5 hours).
Explanation: Iswarya lenin, one of the female contributors, created 30 audio files and the relevant images for the files at Commons. This effort started an interesting new project in Tamil Wikisource. See the audio book gallery at Tamil Wikisource, where the link to the relevant proofread text appears below each audio book (30 books, plus 1 book from her outreach).

Target outcome: Digitisation of six palm leaf manuscripts with their text.
Achieved outcome: Eleven palm leaf manuscripts of old Tamil literary value were digitised and transcribed into text. You can find the results at Tamil Wikisource.
Explanation: This is the first project for palm leaf manuscripts. I, info-farmer, and Joshua-timothy-J went for 10 days of basic training to learn to read palm leaf manuscripts.

Target outcome: Partnerships with two private libraries.
Achieved outcome: The legal heirs of two nationalised authors have agreed to donate their collections, nearly 20 to 30 books, to Tamil Wikisource (s:ta:ஆசிரியர்:பாவலரேறு பெருஞ்சித்திரனார் and s:ta:ஆசிரியர்:கவிஞர் வெள்ளியங்காட்டான்).
Explanation: Even though they have agreed, we have to go to their libraries and do the book scanning with a good scanner at our own cost. RMRL is one of the best private research libraries, but it has no book lending service; it is a reference library. You can, however, request paid photocopies of many books, charged at three rupees per copy. No PDF book service is available.

Target outcome: Tie-up with the Tamil Nadu state's 'Kanith Thamizl Peravai' of 20 selected colleges.
Achieved outcome: Our state government stopped funding the Tamil computing clubs.
Explanation: After two years of the Corona pandemic, college administrations are focusing on their academic activities. I hope that 5 to 10 workshops can be done this year.

Target outcome: A few GLAM activities in Chennai metropolitan city.
Achieved outcome: Done not only in Chennai but also in a few other cities. More than 6000 files were uploaded to Commons.
Explanation: Details of the GLAM activities are at the local project page.
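
The numeric targets and results reported in the table above can be tallied with a short script. This is only a sketch: the row labels are paraphrased from the table, the helper name percent_of_target is introduced here for illustration, and the 400-book figure is taken as the repair target.

```python
# Targets and achieved figures copied from the outcome table above.
# Layout of each entry: (label, target, achieved). Labels are paraphrased.
outcomes = [
    ("Books repaired", 400, 183),
    ("Nationalised books uploaded", 100, 251),
    ("Screencasts", 30, 27),
    ("Wikisource workshops", 6, 5),
    ("Audio books", 6, 31),
    ("Palm leaf manuscripts digitised", 6, 11),
]


def percent_of_target(target: int, achieved: int) -> int:
    """Return the achieved figure as a whole-number percentage of the target."""
    return round(100 * achieved / target)


# Uploaded books: uploads through the partnerships plus the author's own scans.
partner_uploads = 227
own_scans = 24
total_uploaded = partner_uploads + own_scans

# 1909 dictionary: completed pages plus pages still in process.
dictionary_pages = 800 + 400

for label, target, achieved in outcomes:
    share = percent_of_target(target, achieved)
    print(f"{label}: {achieved} of {target} ({share}%)")

print(f"Books uploaded in total: {total_uploaded}")
print(f"Dictionary pages in total: {dictionary_pages}")
```

The script only restates the reported figures; it does not add any new data.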


Individual workshop outcomes

This was not planned in my project, but it is the strategy of this project. Usually the workshops give general instructions for many types of Wikimedia contributions. A few days after a workshop, a few users convey their specific interest in a particular project by phone call or through the Telegram app. I then prepare video tutorials for them to learn from. From those few hours of outreach effort, I noticed the following outcomes, which went beyond my expectations and the project workflows.

Work ground unexpected outcomes
1. Individual training theme: The need for open source licences and the differences between them, especially regarding book covers and inside images; how to create the 'GO' documents and upload them to Commons.
User name/s: Trainee: TVA ARUN
The outcome: We uploaded 15 Tamil Nadu 'GO' documents ('GO' = Government Order) relating to 125 of the 165 Nationalised book authors as of 2021. These documents will help free Wikimedia advocacy by avoiding copyright-related legal issues in the future.

2. Individual training theme: The differences between the Wikimedia licences and other open licences.
User name/s: Trainee: Gnuanwar
The outcome: He is interested in negotiating with publishing companies. He negotiated with a private publishing company for two herbal books: c:File:Letter from the private company declaration for their two Tamil pdf books under Creative Commons license in 2022.pdf and also two Tamil Nadu nationalised book GOs.

3. Individual training theme: College workshop negotiation and outreach for Wikisource.
User name/s: Trainee: Mohedeen
The outcome: 20 books of a nationalised author.

4. Individual training theme: A few open source movement groups exist in Tamil Nadu. Some volunteers from those groups formed a new group, the Tamil Linux community, and started a website to promote FOSS, especially wiki-based technologies, in the Tamil language.
User name/s: Trainees: Tamil Linux community members
The outcome: The Tamil Linux Community created the following scripts in python3 for Tamil Wikisource:
1) tiff2pdf
2) customised pdf uploader

5. Individual training theme: Pronunciation of Tamil words as audio files for Wiktionary, uploaded to Commons.
User name/s: Trainees:
1) SENTHAMIZHSELVI A (SUL 2760<)
2) நுட்பா (SUL 2600<)
3) Selvi palaniappan (SUL 150<)
The outcome: More than 5100 Tamil pronunciation audio files were created.

6. Individual training theme: Wikisource proofreading workshop for women.
User name/s: Trainees:
1) Deepa arul - Encyclopedia's 540 pages (quarry)
2) Rathai palanivelan (poetry books / 6500 edits (SUL))
The outcome: Spelling errors were removed in 5138 pages. You can see the book list at the local project page.

7. Individual training theme: Four school children were trained to crop images at Tamil Wikisource. You can see their contributions as videos.
User name/s: Trainees:
Rabiyathul Jesniya (SUL)
Ateefa shafrin (SUL)
Thamizhini Sathiyaraj (SUL)
Pavanar Sathiyaraj (SUL)
The outcome: Helped to start a new discussion on school wiki contributions.

Learnings

  • Libraries: The field study shows that most libraries follow an 'open access system' and run with few book keepers. So readers usually misplace books on unsorted shelves, and finding all the books in one library is not possible. After the Corona lockdown, most of the libraries are under maintenance. Many books are outside the libraries because of the book lending service.
  • Computer hardware: For PDF manipulation, at least 8 GB of RAM is needed. Otherwise the OS will often hang.
  • Software: Many powerful open source tools are available, but the Internet Archive's software is more effective at producing PDFs, and it also creates good, small files. We have to analyse IA's software. Scan Tailor is very good for cleaning up scanned documents, but it converts all the JPG files to TIFF files. If its output were JPG, like its input, more command-line tools would be available than for TIFF. GIMP is good for cleaning up inside a scanned image.
  • Partnership: Individual efforts do not compare well with the results of partnerships. The UTSC Library partnership for Tamil Wikisource gives remarkably good results.
  • GLAM: For GLAM activities, a mobile phone is not very good, because daylight affects the images. With a professional camera, the detail in the images is much better.
  • Strategy: Of course a project has certain goals. While you are working towards a goal, adapt if an opportunity comes. For example, I went to a college for a Wikisource workshop, but an electricity failure occurred. I then changed the workshop to cover a mobile app, namely Spell4Wiki. The result of that app workshop is more than 5000 pronunciation audio files of Tamil words, which were uploaded to Commons. Results may come late.
  • Private libraries: I found a few excellent private libraries. If you go there with a scanner, they will allow you to scan. Otherwise they will refuse to give you even a single book. In the future, perhaps two decades from now, the paper books will be ruined. So make an effort to preserve the books for the next generation.

Next

In this project, I covered the books of only 100 nationalised authors, but the total number of nationalised authors was 165 as of 2021. Even though the Tamil Nadu government is scanning books, the quality is not good for many of them. If the scan quality is good, the OCR result will be good. So we have to analyse the government-scanned books, which number nearly 25,000. The main challenges are their scanning quality and their licence. The Tamil digital library is not focusing on the nationalised books, and I found that the nationalised books still need to be scanned. If we conduct college workshops, the result will be more helpful for the growth of Tamil Wikisource and data science.

Finances

Work ground expenses
Serial number | Item | Amount ₹ | Explanation
1 | Project-related intercity travel | 1,800 | Every 15 days, I return to my family.
2 | Daily transport during project | 3,200 |
3 | City room | 10,000 | I need to travel to other cities.
4 | Fiber cable internet, phone connectivity | 1,100 | It varies with the working location.
5 | Food, refreshments for work places | 12,000 |
6 | On-ground work remuneration | 20,000 | Working 45 hrs/week
7 | Total expenses for a month | 48,100 |
8 | Total expenses for six months (6 × 48,100) | 2,88,600 |
9 | Mobile phone for museum work and as a scanner at libraries | 20,000 | Work equipment
10 | Laptop service and new external hard disk | 12,000 | Work equipment: my laptop needs service; an external hard disk costs ₹5,500
11 | Grand total for this project | 3,20,600 | USD 4260
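
The totals in the expenses table can be re-derived from the individual rows. The following is a minimal sketch, assuming that rows 1 to 6 are the monthly items and rows 9 and 10 are one-time purchases; the helper format_inr and the variable names are hypothetical and introduced here only for illustration.

```python
# Monthly items in rupees, copied from rows 1-6 of the expenses table.
monthly_items = {
    "Project-related intercity travel": 1_800,
    "Daily transport during project": 3_200,
    "City room": 10_000,
    "Fiber cable internet, phone connectivity": 1_100,
    "Food, refreshments for work places": 12_000,
    "On-ground work remuneration": 20_000,
}

# One-time equipment items in rupees, copied from rows 9-10.
one_time_items = {
    "Mobile phone": 20_000,
    "Laptop service and new external hard disk": 12_000,
}

PROJECT_MONTHS = 6
REPORTED_USD = 4260  # figure given in row 11 of the table


def format_inr(amount: int) -> str:
    """Format an integer with Indian digit grouping, e.g. 288600 -> '2,88,600'."""
    digits = str(amount)
    if len(digits) <= 3:
        return digits
    head, tail = digits[:-3], digits[-3:]
    groups = []
    # After the last three digits, group the remaining digits in pairs.
    while len(head) > 2:
        groups.insert(0, head[-2:])
        head = head[:-2]
    groups.insert(0, head)
    return ",".join(groups + [tail])


monthly_total = sum(monthly_items.values())
six_month_total = PROJECT_MONTHS * monthly_total
grand_total = six_month_total + sum(one_time_items.values())

# Exchange rate implied by the reported rupee and dollar totals.
implied_rate = grand_total / REPORTED_USD

print("Monthly total:", format_inr(monthly_total))
print("Six-month total:", format_inr(six_month_total))
print("Grand total:", format_inr(grand_total))
print("Implied rupees per USD:", round(implied_rate, 2))
```

The recomputed values match rows 7, 8 and 11 of the table; the implied exchange rate is derived only from the two reported totals and is not a figure stated in the report.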

Remaining funds

Do you have any remaining grant funds? No

Anything else

Among the Indic Wikisource communities, the Tamil community is the first to have completed a lakh of proofread pages. (Refer (click yellow)). The Tamil Nadu government is going to spend ten crores on public libraries. If Wikimedia officials approach the government, it will give good results for the Wikimedia movement. Tamil language development officials have already agreed orally to give 1000 PDF books, and I explained the value of the Wikisource project to them. (Refer to the discussions (20 minutes, in the Tamil language).)