Community Wishlist Survey 2017/Wikisource

Wikisource
9 proposals, 115 contributors



XTools Edit Counter for Wikisource

  • Problem: There are no Wikisource-specific statistics on proofreading and validation per user.
  • Who would benefit: Wikisource Community
  • Proposed solution: A tool is needed, such as an XTools Edit Counter for Wikisource, that reports these statistics.
  • More comments:

Discussion

Voting

ProofreadPage extension in alternate namespaces

  • Problem: ProofreadPage elements, such as page numbers, "Source" link in navigation, etc. do not display in namespaces other than mainspace
  • Who would benefit: Wikisources with works in non-mainspace, such as user translations on English Wikisource
  • Proposed solution: Modify the ProofreadPage extension to allow its use in namespaces other than mainspace
  • More comments:

Discussion

Voting

Extend pag and num accessibility

  • Problem: {{{pag}}} and {{{num}}} are reserved parameters of the ProofreadPage extension, logically linked to the pagelist tag. It would be useful to extend them so that they can be used anywhere.
  • Who would benefit: wikicode contributors
  • Proposed solution: Allow {{{pag}}} and {{{num}}} to accept two additional optional arguments (index name, book page or file page), so that the book page for a given file page, and the file page for a given book page, can be retrieved dynamically from the pagelist data in any context.
  • More comments:
  • Phabricator tickets:

Discussion

Thank you for the proposal. I am not sure I understand what you want. Maybe an API (perhaps in Lua) that provides functions such as getPageTitleForFile(fileName, filePageNumber), getPageTitleForIndexAndPage(indexName, logicalPageNumber), getIndexTitleForPage(pageName), and getFilePageNumberForPage(pageName)? Tpt (talk) 10:59, 8 November 2017 (UTC)
Lua access to all data coming from the Index page (all fields, including the pagelist-related table) would be great. It.source uses a special Modulo:Dati/[baseIndexName] to save and use these data (see it:Template:Pg, which uses them), but it is a local, do-it-yourself solution. --Alex brollo (talk) 14:49, 8 November 2017 (UTC)
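
The lookups discussed above can be illustrated with a minimal Python sketch. It is not the ProofreadPage API: the class, its method names, the sample labels, and the "Page:<file>/<n>" title pattern are assumptions made for illustration only (the Page namespace name is localised on many Wikisources).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class IndexData:
    """Hypothetical in-memory view of one Index page's pagelist."""

    file_name: str
    # Maps the file page number (position in the scan, starting at 1)
    # to the book page label printed on that page.
    labels: Dict[int, str] = field(default_factory=dict)

    def book_page_for_file_page(self, file_page: int) -> Optional[str]:
        """Return the book page label for a file page, or None if unknown."""
        return self.labels.get(file_page)

    def file_page_for_book_page(self, book_page: str) -> Optional[int]:
        """Return the first file page carrying this book page label."""
        for file_page in sorted(self.labels):
            if self.labels[file_page] == book_page:
                return file_page
        return None

    def page_title_for_file_page(self, file_page: int) -> str:
        """Build a Page-namespace title (assumed naming pattern)."""
        return f"Page:{self.file_name}/{file_page}"


# Illustrative data only: two unnumbered pages, two roman pages, then arabic.
example = IndexData(
    file_name="Example.djvu",
    labels={1: "Cover", 2: "-", 3: "i", 4: "ii", 5: "1", 6: "2"},
)

print(example.book_page_for_file_page(5))
print(example.file_page_for_book_page("ii"))
print(example.page_title_for_file_page(5))
```

The reverse lookup returns the first match because a label such as "-" may legitimately appear on several file pages.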

Voting

Improve workflow for uploading books to Wikisource

  • Problem:
Uploading books to Wikisource is difficult.
In the current workflow, you need to upload the file to Commons, then go to Wikisource and create the Index page (for which you need to know the exact URL). The files need to be DJVU, a format that has separate layers for the scan and the text. This is important for tools like Match & Split, which does not work if the file is a PDF.
More importantly, the current workflow (especially for library uploads) includes Internet Archive and the well-known IA-Upload tool. This tool is now fundamental for many libraries and uploaders, but it has several issues.
Since Internet Archive stopped creating DJVU files from its scans, the international community has struggled to find a way to automatically create a DJVU for upload to Commons and then Wikisource.
This has created a situation where libraries love Internet Archive and want to use it, but then get stuck because they do not know how to create a DJVU for Wikisource, and the IA-Upload tool is buggy and often fails.
Summary
    • The IA-Upload tool is buggy and often fails when creating DJVU files.
    • Match & Split doesn't work with PDF files.
    • Users do not expect to upload to Commons when transferring files from Internet Archive to Wikisource.
    • Upload to Internet Archive is an important feature, especially for GLAMs (e.g. libraries).
  • Who would benefit:
    • all Wikisource communities, especially new users
    • new GLAMs (libraries and archives) who at the moment have a hard time coping with the Wiki ecosystem.
  • Proposed solution:
Improve the IA-Upload tool: https://tools.wmflabs.org/ia-upload/commons/init
The tool should be able to create good-quality DJVU files from Internet Archive files, and should not fail as often as it does now.
It should also hide the upload-to-Commons phase from the end user. The user should be able to upload a file to Internet Archive, and then use the ID of the file to directly create the Index page on Wikisource. We could have an "Advanced mode" that shows all the steps for experienced users, and a "Standard" one that makes things simpler.
  • More comments:

Discussion

Voting

Page status color code not always showing

  • Problem: The color codes indicating page status on the index page do not always show on French Wikisource. We have to purge the book page many times.
The problem started in mid-to-late 2016; before that, we very rarely had to purge to see the colors.
  • Who would benefit: Beginners, for whom this is counterintuitive: the documentation mentions the page color codes, but they do not show, which is very confusing to new contributors. Advanced contributors would also lose less time when editing a book.
  • Proposed solution: We should not have to purge the index page each time we display a book.
  • More comments:

Discussion

  • I strongly endorse this. It is an annoying bug that I thought was unique to it.wikisource. If this behaviour is common to several projects, it deserves an appropriate solution, and quickly: an index page is meant to show the state of every page so that users can decide whether to transcribe, proofread or validate it. - εΔω 16:51, 23 November 2017 (UTC)

Voting

Improve export of electronic books

  • Problem: Imagine if Wikipedia pages could not be displayed for many days, or were only available once in a while for many weeks. Imagine if Wikipedia displayed pages with missing or scrambled information.
This is what visitors get when they download books from the French Wikisource. Visitors do not read books online in a browser. They want to download them to their reader in epub, mobi or pdf.
The current tool for exporting books in these formats has all of those problems: in spring 2017, it was on and off for over a month; since October 2017, the mobi format has not worked, and then pdf stopped working too. I did not publish a book because the electronic formats have various problems. (I have made a list of these problems, available if required.)
  • Who would benefit: The end users, the visitors to Wikisource, by having access to high quality books. This would improve the credibility of Wikisource.
This export tool is the showcase of Wikisource. Contributors can be patient with system bugs, but visitors won’t be, and won’t come back.
The export tool is as important as the web site is.
  • Proposed solution: We need a professional tool that runs and is supported 24/7, as the various Wikimedia web sites are, by professional Wikimedia Foundation developers.
The tool should support the various ebook formats and keep pace with the evolution of ebook technology.
The existing bugs should be fixed.
  • More comments: There are not enough people in a small wiki to support and maintain such a tool.
Wikisource should not be considered only a web-based platform: the ebooks are just as important, and even more important for visitors.

Discussion

For information, the current problem (only the latest episode in a long and sad series of problems) is phabricator:T178803. It has been ongoing for almost a month now, and we have had a lot of complaints from readers. Cdlt, VIGNERON * discut. 13:01, 21 November 2017 (UTC)

Voting

Specify transcription completion with more granularity

  • Problem: Currently the Wikisource revision system only allows a single global completion status for a transcription, whereas a more flexible solution allowing an extensible set of multiple criteria would be welcome.
  • Who would benefit: Anybody interested in having fine-grained information about transcription completion status.
    • To give a very concrete example, one might want to study the evolution of hyphenation in a subset of the Wikisource corpus. But currently, hyphenation is often dropped in the transcription process, and even when it is taken into account, there is no obvious way to query which transcriptions preserve it and which do not, nor to see the completion status for this criterion in the work completion overview.
      In this precise case, part of the problem might be solved through categories. For example, on the French Wikisource, there is the template Césure, which allows one to transcribe the text with hyphenation. It then renders the text hyphenated when viewed in the Page namespace, and unhyphenated otherwise, such as when it is transcluded in the main namespace. This template could add a category stating that the page uses it. However, also recording the level to which the page is completed with regard to the hyphenation criterion would be cumbersome, and it would not allow a quick overview of progress on this topic in the Livre (Work) namespace.
    • Additionally, this would prevent pages from staying in an "uncompleted" status when the transcription has been done and reviewed and only the layout has not yet been adjusted to match the original page as closely as possible. That is useful information. The transcription is indeed not globally complete, but for plain reading through transclusion in the main namespace, it is wrong to state that the work is not complete.
  • Proposed solution:
    • Allow users to enter the status of a transcription along an extensible set of parameters, such as the rate of character matching, layout matching, and so on, for things like tables and trees, which might have a proper rendering but an improper HTML structure, or the opposite.
    • Allow users to switch criteria in the transcription completion overview of the work.
    • Possibly, a "global completion" criterion should provide a weighted mix of all existing criteria.
  • More comments: This also relates to the remark of @Alex brollo about the true digitization of an edition.
  • Phabricator tickets:
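
The "global completion" idea in the proposed solution can be sketched as a weighted mean over per-criterion rates. This is a minimal Python illustration; the criterion names, the weights, and the function name are assumptions chosen for the example, not part of any existing Wikisource feature.

```python
from typing import Dict


def global_completion(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return the weighted mean of per-criterion completion rates.

    Each score is expected to lie between 0.0 (not started) and 1.0 (done).
    Every criterion in `scores` must have a weight in `weights`.
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("at least one criterion needs a non-zero weight")
    weighted_sum = sum(scores[name] * weights[name] for name in scores)
    return weighted_sum / total_weight


# Hypothetical page: text fully reviewed, hyphenation half done, layout untouched.
page_scores = {"text": 1.0, "hyphenation": 0.5, "layout": 0.0}
# Hypothetical weights: the text counts twice as much as the other criteria.
criterion_weights = {"text": 2.0, "hyphenation": 1.0, "layout": 1.0}

overall = global_completion(page_scores, criterion_weights)
print(f"global completion: {overall:.3f}")
```

Keeping the per-criterion scores separate from the weights lets the overview switch to a single criterion (for example "text" only) without recomputing anything stored on the page.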

Discussion

Voting

Create new Han Characters with IDS extension for WikiSource

  • Problem: Han characters (logograms, including Chinese characters, Hanja, and Kanji) are widely used in East Asia (China, Taiwan, Singapore, Mandarin-speaking areas of Malaysia, Hong Kong, Japan, Korea and Vietnam). An enduring, unsolved problem for digital archiving is that of missing characters. This affects not only characters in ancient books: even modern publications contain characters that are not encoded (for example, some authors have created 300-400 unique new characters in certain books). This makes such works difficult to archive on Wikisource. Unicode gradually adds new characters, but each new Unihan extension takes time to go live. In the past, Wikisource, and even Wikipedia, dealt with this problem by using image files to present those characters. But images cannot be indexed or searched, and cannot be exchanged between computer systems as text.
  • Who would benefit: Mostly the contributors and readers of Chinese Wikisource. However, if this approach becomes available, all Wikimedia projects in languages that use Han characters will benefit (such as Japanese, Vietnamese, Korean, and Chinese varieties such as Classical Chinese, Hakka, Wu, or Gan).
    1. Furthermore, Wikipedia (Chinese Wikipedia already uses many missing characters) and Wiktionary would also benefit.
    2. Other writing systems with two-dimensional composite characters, for instance Ancient Egyptian and Maya.
  • Proposed solution: Unicode Ideographic Description Sequences (IDS) define how to compose a Han character from components. We would implement an extension that dynamically renders a Han character from an IDS in Wikisource, like: <ids>⿺辶⿴宀⿱珤⿰隹⿰貝招</ids> It generates an image file of the Han character (currently rendered on a temporary server on wmflabs) with the IDS in its metadata. This would resolve the problem of missing Han characters in all C/J/K/V books. The underlying idea is that Han characters are not on the same level as the letters of European alphabets, but as words. Han characters are an open set. They are composed in two dimensions from more basic components, comparable to affixes in English (English words are composed in one dimension). In academia, component-based Han character composition technology has been developed and adapted to handle ancient Han books. The best-known examples are Academia Sinica's development and the CBETA sutra project. In recent years, open-source IDS renderers have become stable, so we can use the same technology to benefit Wikisource in handling ancient Han books, as those academic institutions do.
  • More comments:
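
To make the structure of an IDS concrete, the following Python sketch parses the example sequence from the proposal into a nested tree. It is not the proposed extension and does no rendering; the function and table names are assumptions for illustration. It covers only the twelve description characters U+2FF0 to U+2FFB, of which ⿲ and ⿳ take three components and the rest take two.

```python
# Number of components each Ideographic Description Character expects.
IDS_ARITY = {chr(code_point): 2 for code_point in range(0x2FF0, 0x2FFC)}
IDS_ARITY["\u2FF2"] = 3  # ⿲ left, middle, right
IDS_ARITY["\u2FF3"] = 3  # ⿳ top, middle, bottom


def parse_ids(sequence):
    """Parse a prefix-notation IDS string into (operator, [children]) tuples.

    Plain components are returned as single-character strings.
    Raises ValueError if the sequence is incomplete or has trailing characters.
    """

    def parse_at(position):
        if position >= len(sequence):
            raise ValueError("incomplete IDS: a component is missing")
        char = sequence[position]
        arity = IDS_ARITY.get(char)
        if arity is None:
            # An ordinary character is a leaf component.
            return char, position + 1
        children = []
        position += 1
        for _ in range(arity):
            child, position = parse_at(position)
            children.append(child)
        return (char, children), position

    tree, end = parse_at(0)
    if end != len(sequence):
        raise ValueError("trailing characters after a complete IDS")
    return tree


def leaves(tree):
    """List the leaf components of a parsed IDS, in reading order."""
    if isinstance(tree, str):
        return [tree]
    _, children = tree
    return [leaf for child in children for leaf in leaves(child)]


example_tree = parse_ids("⿺辶⿴宀⿱珤⿰隹⿰貝招")
print(example_tree)
print(leaves(example_tree))
```

Because the sequence stays plain text, it can be indexed and searched by component, which is the property that image files lack.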

Discussion

  • IMO there's no reason to limit this to Wikisource, as Wiktionary could also benefit a lot from this. NMaia (talk) 00:35, 28 November 2017 (UTC)
  •   Question: I support the general need to display unencoded characters. However, personally I think the quality of the generated characters is regrettably a bit substandard. Simply compressing the components together into a block is not aesthetically pleasing. Using images instead of web fonts in this day and age is also suboptimal (even if they are SVG).
    The creator of this extension has probably poured their heart and soul into creating it, but may I suggest some sort of partnership with GlyphWiki instead? It is a website designed for hosting hanzi. Glyphs can be manually created and stored under IDS names, and the glyphs can be used in fonts. GlyphWiki supports generation of web fonts. Suzukaze-c (talk) 03:01, 3 December 2017 (UTC)

Voting

Offer PDF export of original pagination of entire books

  • Problem: At present, PDF conversion of proofread Wikisource books does not mirror the pagination and page design of the original edition, since it is generated from the ns0 transclusion.
  • Who would benefit: Offline readers.
  • Proposed solution: Build an alternative PDF generated by converting the nsPage namespace, page for page.
  • More comments: Some Wikisource contributors think that nsIndex and nsPage are simply "transcription tools"; I think that they are much more: they are the true digitization of an edition, while the ns0 transclusion is something like a new edition.

Discussion

@Samwilson: Yes, perfect, thank you! --Alex brollo (talk) 07:49, 22 November 2017 (UTC)

Voting