CIS-A2K/National Digitisation Program

National Digitisation Program

CIS-A2K is committed to supporting the digitisation of content with reference value, historical documents, manuscripts, old and deteriorating material, rare works of art, images, maps, etc., and to making it accessible on the internet under a free license. Partnerships with libraries are being developed to gain access to public domain and copyright-free works. A relicensing movement has also been initiated so that copyright holders can release material under a free license after completing the due process. As of today, about 300 works have been released by more than 30 authors/copyright holders and 8 organisations. In the program year 2021-22, a roadmap for launching an India-level digitisation program was developed. A summary is presented below.

Digitisation set-ups

CIS-A2K has systematically developed its digitisation activity over the last four years with institutional partners: Vigyan Ashram, Lek Ladki Abhiyan and Pune Nagar Vachan Mandir. Infrastructure support is given in the form of a scanner and a laptop. Technical training in scanning and post-processing is arranged, and regular supervision and quality monitoring are carried out. Scanning charges are paid to the respective organisation. The flow of material is maintained through collaborations with libraries for public domain works, as well as through individual and organisational relicensing of content. CIS-A2K also provides equipment support to user groups or institutions that take the initiative in digitisation activity.

Process

For any language, specific steps need to be followed before initiating the digitisation activity:

  1. Structured metadata of authors and their works is needed to decide the copyright status: the death year of the author, and the publication details of works published during the author's lifetime and posthumously. The WikiProject Books on Wikidata is an ideal methodology for achieving this across languages.
  2. The present status of books available on Wikimedia Commons and Wikisource for the respective language is to be charted out to know the gap and the potential. For example, in Marathi, a total of 825 books are available on Wikimedia Commons and 733 books are available on Marathi Wikisource. The database prepared lists more than 200 copyright-free Marathi authors, which suggests that around 4000 or more books are in the public domain. Of these, only about 250 books are on Commons.
  3. Prospective GLAM institutions should be identified through mapping exercises. Cities with a significant number of these institutions could be given priority in selected states. Other parameters may also be taken into account, such as their work in digitisation, collaborative nature, views towards free knowledge, sustainability and accessibility.
  4. Potential organisations and partners ready to set up scanning centres should be identified.
  5. A language-wise database of digitised books available on different websites such as Internet Archive, NDLI and Hathi Trust should be compiled.
  6. An inventory of digitisation centres established by government and non-government organisations in the state is to be developed. Lists of scanned books would be sought from them to avoid duplication of effort. Collaboration with these institutions will result in an efficient and effective process.
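
Steps 1, 5 and 6 can be illustrated with a short Python sketch that checks copyright status from author metadata and then filters out titles already digitised elsewhere. This is a minimal sketch, not the program's actual tooling. The 60-year term (counted from the calendar year after the author's death, or after publication for posthumous works) is an assumption based on Indian copyright law and is not stated on this page; the names `Work`, `public_domain_year` and `digitisation_candidates` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: a term of 60 years, counted from the start of the calendar
# year following the author's death (or following first publication, for
# posthumous works). Adjust for other jurisdictions.
TERM_YEARS = 60


@dataclass
class Work:
    title: str
    author_death_year: Optional[int]  # None when the death year is unknown
    publication_year: int


def public_domain_year(work: Work) -> Optional[int]:
    """Return the first calendar year in which the work is copyright free."""
    if work.author_death_year is None:
        return None  # status cannot be decided without the death year
    # For posthumous works the term runs from publication, not from death.
    start = max(work.author_death_year, work.publication_year)
    return start + TERM_YEARS + 1


def is_public_domain(work: Work, current_year: int) -> bool:
    year = public_domain_year(work)
    return year is not None and current_year >= year


def digitisation_candidates(works, already_online, current_year):
    """Keep public domain works whose titles are not already digitised."""
    online = {title.strip().casefold() for title in already_online}
    return [
        w for w in works
        if is_public_domain(w, current_year)
        and w.title.strip().casefold() not in online
    ]
```

In practice, matching on titles alone is fragile; a shared identifier such as a Wikidata item would be a more reliable key when comparing lists from different websites.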

Status of digitised books on the internet

| Lang | Index pages in Wikisource | Books on Commons | List of copyright-free authors available? | PD books available on various websites | Digitisation potential |
|---|---|---|---|---|---|
| As | 313 | 517 | Yes, here (with a few soon-to-be PD names) | | |
| Bn | 4472 | 6696 | | | |
| Gu | 154 | 246 | link, Gujarati Sahityakosh | | |
| Hi | 437 | 194 | | | |
| Kn | 345 | 85 | | | |
| Ml | 348 | 326 | | | |
| Mr | 1386 | 1648 | Yes, a list of about 400 authors is prepared | Rajya Marathi Vikas Sanstha book collection; Maharashtra State Board for Literature and Culture; Digital Dictionaries of South Asia; Vinoba literature; Prabodhankar literature; Savarkar literature; Sane Guruji literature; Pune Nagar Vachan Mandir collection; Gokhale Institute of Politics and Economics collection; Vidya Prasarak Mandal, Thane (http://dspace.vpmthane.org:8080/jspui/handle/123456789/3341), 800 books | |
| Or | 85 | 172 | | | |
| Pu | 246 | 6 | | | |
| Sa | 20 | | | | |
| Ta | 2252 | 41 | | | |
| Te | 643 | 546 | | | |
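
The two numeric columns above can be compared per language to get a rough view of the gap between Commons and Wikisource. The sketch below is only an illustration: the counts are copied from the table, Sa is left out because only one figure is listed for it, and reading a positive difference as "books on Commons that may not yet have a Wikisource index page" is an assumption, since the page does not define how the two counts relate.

```python
# Counts copied from the table: (Wikisource index pages, books on Commons).
# Sa is omitted because the table lists only one figure for it.
counts = {
    "As": (313, 517),
    "Bn": (4472, 6696),
    "Gu": (154, 246),
    "Hi": (437, 194),
    "Kn": (345, 85),
    "Ml": (348, 326),
    "Mr": (1386, 1648),
    "Or": (85, 172),
    "Pu": (246, 6),
    "Ta": (2252, 41),
    "Te": (643, 546),
}


def commons_surplus(table):
    """Return (language, Commons books minus index pages), largest first."""
    rows = [(lang, commons - index) for lang, (index, commons) in table.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


for lang, surplus in commons_surplus(counts):
    print(f"{lang}: {surplus:+d}")
```

A negative value means the language has more index pages on Wikisource than books counted on Commons, which may reflect differences in how the two figures were collected rather than a real shortfall.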