CIS-A2K/National Digitisation Program

National Digitisation Program


CIS-A2K is committed to supporting the digitisation of content with reference value, historical documents, manuscripts, old and deteriorating material, rare works of art, images, maps etc., and to making it accessible on the internet under an open license. Partnerships with libraries are being developed to enrich the public domain with out-of-copyright works. A relicensing movement has also been initiated to obtain material under an open license after completing due process with copyright holders. As of today, about 300 works have been released by more than 30 authors/copyright holders and 8 organisations. In the program year 2021-22, the roadmap for launching the India-level digitisation program was developed. A summary is presented below.

Digitisation set-ups


CIS-A2K has systematically developed its digitisation activity over the last four years with institutional partners: Vigyan Ashram, Lek Ladki Abhiyan and Pune Nagar Vachan Mandir. Infrastructural support is given in the form of a scanner and a laptop. Technical training in scanning and post-processing is arranged, and regular supervision and quality monitoring are carried out. Scanning charges are paid to the respective organisation. The flow of material is maintained through collaborations with libraries for public domain works, as well as through relicensing of content by individuals and organisations. CIS-A2K also provides equipment support to user groups or institutions that take the initiative for digitisation activity.



For any language, specific steps need to be followed before initiating the digitisation activity:

  1. Structured metadata of authors and their works, to determine copyright status: the year of the author's death and the publication details of works published during the author's lifetime and posthumously. The WikiProject Books on Wikidata is an ideal methodology for achieving this across languages.
  2. The present status of books available on Wikimedia Commons and Wikisource for the respective language is to be charted to identify the gap and the potential. For example, in Marathi, a total of 825 books are available on Wikimedia Commons and 733 books are available on Marathi Wikisource. The database prepared lists more than 200 out-of-copyright Marathi authors, so an estimated 4000-plus books are in the public domain. Of these, only about 250 books are on Commons.
  3. Prospective GLAM institutions should be identified through mapping exercises. Cities with a significant number of such institutions could be prioritised in selected states. Other parameters, such as their work in digitisation, collaborative nature, views towards free knowledge, sustainability and accessibility, may also be taken into account.
  4. Identification of potential organisations and partners ready to set up scanning centres.
  5. A language-wise database of digitised books available on websites such as Internet Archive, NDLI, HathiTrust etc.
  6. An inventory of digitisation centres established by government and non-government organisations in the state is to be developed. Lists of scanned books would be sought from them to avoid duplication of effort. Collaboration with these institutions will result in an efficient and effective process.
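Steps 1 and 2 above can be illustrated with a minimal Python sketch. It is not a tool used by CIS-A2K. The `Work` record, the function names and the 60-year term are assumptions made for illustration: the term follows the life-plus-60-years rule of Indian copyright law, counted from the start of the calendar year after the author's death (or after publication, for posthumous works), and should be verified for each work and jurisdiction. The gap figures reuse the Marathi estimates quoted in step 2.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: life + 60 years, counted from 1 January of the following year.
TERM_YEARS = 60


@dataclass
class Work:
    """Hypothetical metadata record for one book (illustrative only)."""
    title: str
    author_death_year: Optional[int]  # None if the author is living or unknown
    publication_year: int


def public_domain_from(work: Work, term_years: int = TERM_YEARS) -> Optional[int]:
    """Return the first calendar year in which the work is out of copyright."""
    if work.author_death_year is None:
        return None  # status cannot be decided from this metadata
    posthumous = work.publication_year > work.author_death_year
    # Posthumous works are counted from publication, others from the death year.
    start = work.publication_year if posthumous else work.author_death_year
    return start + term_years + 1


def is_public_domain(work: Work, year: int, term_years: int = TERM_YEARS) -> bool:
    """True if the work is out of copyright in the given calendar year."""
    first_year = public_domain_from(work, term_years)
    return first_year is not None and year >= first_year


def digitisation_gap(estimated_pd_books: int, already_on_commons: int) -> int:
    """Number of public domain books still to be digitised or uploaded."""
    return max(estimated_pd_books - already_on_commons, 0)


# Marathi estimates from step 2: about 4000 PD books, about 250 on Commons.
marathi_gap = digitisation_gap(4000, 250)
print(marathi_gap)  # 3750
```

In practice the death and publication years would come from the structured metadata described in step 1, so the same check can be repeated for every language.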

Status of digitised books on internet

Lang | Index pages in Wikisource | Books on Commons | List of copyright-free authors available? | PD books available on various websites | Digitisation potential
As | 313 | 517 | Yes (with a few soon-to-be PD names) | |
Bn | 4472 | 7262 | | |
Gu | 154 | 246 | link, Gujarati Sahityakosh | |
Hi | 437 | 252 | | |
Kn | 345 | 85 | | |
Ml | 348 | 326 | | |
Mr | 1386 | 1744 | Yes, a list of about 400 authors is prepared | Rajya Marathi Vikas Sanstha book collection; Maharashtra State Board for Literature and Culture; Digital Dictionaries of South Asia; Vinoba literature; Prabodhankar literature; Savarkar literature; Sane Guruji literature; Pune Nagar Vachan Mandir collection; Gokhale Institute of Politics and Economics collection; Vidya Prasarak Mandal, Thane (800 books on these sites) |
Or | 85 | 105 | | |
Pu | 246 | 23 | | |
Sa | 20 | | | |
Ta | 2252 | 41 | | |
Te | 643 | 580 | | |