Grants:IdeaLab/Djvu text layer editor
status: idea
idea creator:
project contact:
alex.brollogmail.com
participants:
summary:
Use some VisualEditor (VE) features to edit the DjVu text layer
created on: 12:10, 13 March 2014
Project idea
What is the problem you're trying to solve?
Wikisource makes heavy use of the OCR text layer, but in practice uses only a small part of it (the naked text). The DjVu text layer contains much more information (words, lines, paragraphs, regions, columns, and the page coordinates of the text); unfortunately, this information is more easily exported in a Lisp-like format or as XML than as hOCR.
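To illustrate the difference between the naked text and the mapped text, here is a minimal Python sketch. The `SAMPLE` fragment is hand-written and only assumed to resemble the hidden-text XML that DjVuLibre exports (nested `LINE` and `WORD` elements, each `WORD` carrying a `coords` attribute); the function names are hypothetical, and real files may differ in detail.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment, assumed to resemble the exported hidden-text XML.
SAMPLE = """<HIDDENTEXT>
<PAGECOLUMN><REGION><PARAGRAPH>
<LINE>
<WORD coords="100,250,180,200">Wikisource</WORD>
<WORD coords="190,250,240,200">uses</WORD>
</LINE>
<LINE>
<WORD coords="100,310,150,260">OCR</WORD>
</LINE>
</PARAGRAPH></REGION></PAGECOLUMN>
</HIDDENTEXT>"""


def words_with_coords(xml_text):
    """Return (line_index, word_text, coords) for every WORD element."""
    root = ET.fromstring(xml_text)
    rows = []
    for line_index, line in enumerate(root.iter("LINE")):
        for word in line.iter("WORD"):
            # coords is kept as a plain tuple of integers; the meaning of
            # each value is not interpreted here.
            coords = tuple(int(c) for c in word.get("coords").split(","))
            rows.append((line_index, word.text or "", coords))
    return rows


def naked_text(xml_text):
    """Return only the text, one output line per LINE element."""
    root = ET.fromstring(xml_text)
    return "\n".join(
        " ".join(word.text or "" for word in line.iter("WORD"))
        for line in root.iter("LINE")
    )


if __name__ == "__main__":
    print(naked_text(SAMPLE))
    for row in words_with_coords(SAMPLE):
        print(row)
```

The naked text is what Wikisource currently uses; the tuples returned by `words_with_coords` show the extra position information that is lost along the way.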
What is your solution?
- To test VE or other, simpler WYSIWYG HTML/XML editors for editing the text only, saving the information wrapped in XML tags;
- to test conversion, extraction and upload of the text layer of DjVu files using a simple web interface.
Ideas for a test tool
A test could be done with existing tools:
- DjVuLibre (running on Tool Labs), and particularly:
  - djvutoxml, which extracts the internal mapped text of DjVu pages as an XML file;
  - djvuxmlparser, which loads the modified mapped text back into the DjVu file;
- tinyEditor, to edit the XML text with a comfortable WYSIWYG interface (XML tags are hidden; only the editable text is shown, in any HTML textarea);
- a little bit of CGI on Tool Labs to manage such a web editing interface.
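The round trip described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the command lines for `djvutoxml` and `djvuxmlparser` are assumed to be the simplest form (input file, then output file, and a single XML file respectively) and should be checked against the DjVuLibre manual pages; `apply_corrections` and the other function names are hypothetical, and the text is assumed to live in `WORD` elements with a `coords` attribute.

```python
import subprocess
import xml.etree.ElementTree as ET


def extract_text_layer(djvu_path, xml_path):
    # Assumed invocation: djvutoxml <input.djvu> <output.xml>
    subprocess.run(["djvutoxml", djvu_path, xml_path], check=True)


def upload_text_layer(xml_path):
    # Assumed invocation: djvuxmlparser <input.xml>
    # (the target DjVu file is assumed to be named inside the XML itself).
    subprocess.run(["djvuxmlparser", xml_path], check=True)


def apply_corrections(xml_text, corrections):
    """Replace the text of selected WORD elements, leaving coords untouched.

    corrections maps the zero-based position of a word in document order
    to its corrected text.
    """
    root = ET.fromstring(xml_text)
    for index, word in enumerate(root.iter("WORD")):
        if index in corrections:
            word.text = corrections[index]
    # Note: serializing this way drops any DOCTYPE declaration present
    # in the input.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        '<LINE><WORD coords="1,2,3,4">tbe</WORD>'
        '<WORD coords="5,6,7,8">page</WORD></LINE>'
    )
    print(apply_corrections(sample, {0: "the"}))
```

In a web tool, the editor would play the role of `apply_corrections`: the user changes only the visible text, while the tags and coordinates pass through unchanged.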
Project goals
- To split proofreading into two steps:
  - DjVu text editing (saving the result into the DjVu text layer);
  - text formatting.
Get involved
Welcome, brainstormers! Your feedback on this idea is welcome. Please click the "discussion" link at the top of the page to start the conversation and share your thoughts.