Talk:Community Tech/Ebook Export Improvement

Project Overview: Request for Feedback (May 2020)

Hello, everyone! We invite you to read the content page of the project, which includes an analysis of the ebook export process and its primary issues, and share your feedback below. Thank you!

Have we covered the main reasons why people export ebooks?

  • Yes, this is accurately covered and clearly explained. MartinPoulter (talk) 18:47, 28 May 2020 (UTC)
  • Great summary. In our actions, we should always keep in mind that there are two types of users: contributors, but also visitors to Wikisource. We have to make sure visitors have a good experience exporting books to whatever device they have. --Viticulum (talk) 19:25, 28 May 2020 (UTC)
    • @MartinPoulter and Viticulum: Thank you for the feedback! Also, it's a great point that we'll need to be continually mindful of both contributors and visitors. While these two groups will have some overlapping needs in this project (such as being able to find & download books), the contributors may have greater familiarity with Wikisource. For this reason, it's important that we identify the largest problems with the user experience, so we can hopefully improve UX overall. Thanks again! --IFried (WMF) (talk) 15:03, 4 June 2020 (UTC)

Have we covered the main methods to export ebooks?

  • Well explained. I learned something new myself.
    • On French Wikisource, we mostly use option "#4: Export via links at the top of text", and "#3: Export via links on the main page" when announcing new books.
    • We also have those links on the author page for each book. Example: see an author page.
    • My main concern is that external users can easily find and understand how to export books. With "#2: Export via the left side panel", external users can't export a full book, which can be misleading to them. --Viticulum (talk) 19:28, 28 May 2020 (UTC)
  • This depends on the Wikisource. On Czech WS we have only the PDF and EPUB options, and no others in gadgets. On sk.WS there is only the PDF option. Export should be available on all language versions for all users. JAn Dudík (talk) 11:30, 29 May 2020 (UTC)

@MartinPoulter, Viticulum, and JAn Dudík: Thank you for the feedback! It's also very helpful to be reminded of the fact that different wikis have different common practices, and some do not have as many options available. Ideally, we will want to improve overall user experience, so that: 1) users can easily discover how to download books, and 2) users have various options available to them, if possible, rather than being limited by one option. We'll investigate how we can improve this experience. Thank you again! --IFried (WMF) (talk) 19:57, 9 June 2020 (UTC)

Have we covered the main problems experienced when exporting ebooks?

  • Yes, and I like that reliability was placed first, as book export is so core to the functionality of Wikisource that it needs really high uptime. An observation that occurs near the end is crucial: "The WSExport tool is not easily discoverable, and it doesn't provide an intuitive user experience". Yes, this is the case: colleagues who are intelligent and very familiar with the Web have looked at this site and not grasped that any book on the site can be exported in a variety of formats, and it's easy to see why they miss that. MartinPoulter (talk) 18:47, 28 May 2020 (UTC)

Problem exporting the image on the first page in PDF format

  • For me, the most frustrating problem: when exporting an ebook in "pdf" format, in most cases (see for example the book https://fr.wikisource.org/wiki/Le_Lorgnon_(Scribe)), the image on the cover page does not appear on the first page, and an error message is shown when opening the file (in French, "Données insuffisantes pour une image", i.e. "insufficient data for an image"). Thanks, Laurent --Lorlam (talk) 17:36, 28 May 2020 (UTC)

@Lorlam: Thanks for this comment. The link you provided doesn't seem to have an image on the first page, so I wasn't able to properly test/reproduce this issue. Do you have another example? Thanks! --IFried (WMF) (talk) 16:18, 11 June 2020 (UTC)
@IFried: The link given (https://fr.wikisource.org/wiki/Le_Lorgnon_(Scribe)) should have an image after exporting the book (see https://fr.wikisource.org/wiki/Livre:Scribe_-_Th%C3%A9%C3%A2tre,_13.djvu), but I can give other examples (https://fr.wikisource.org/wiki/Le_Codicille ... https://fr.wikisource.org/wiki/%C3%89chec_et_mat_(Feuillet) ... https://fr.wikisource.org/wiki/Monsieur_de_Chimpanz%C3%A9). Thanks, --Lorlam (talk) 18:02, 11 June 2020 (UTC)
@Lorlam: Thanks for the information! We have tested this, and we noticed the following: When we first downloaded the book, the first cover page was blank. However, when we looked at the download a day later (in one case), the cover page had changed to display the expected content. The page was red-colored, which was not expected, but the content and imagery looked fine. Does this reflect your experience? --IFried (WMF) (talk) 23:00, 24 June 2020 (UTC)
@IFried: Interesting! A red-colored image instead of a grayscale image! This may indicate a bad object definition in the PDF structure, with insufficient data to render the green and blue channels. The object definition for the Scribe cover should be: <</Type/XObject/BitsPerComponent 8/ColorSpace/DeviceGray/DL 22383/Filter[/DCTDecode]/Height 565/Length 22383/Subtype/Image/Width 400>> instead of <</Type/XObject/BitsPerComponent 8/ColorSpace/DeviceRGB/DL 22383/Filter[/DCTDecode]/Height 565/Length 22383/Subtype/Image/Width 400>> as written by Calibre. As mentioned, a workaround is to convert the 8-bit grayscale cover to 24-bit true color. --Denis Gagne52 (talk) 15:37, 27 June 2020 (UTC)
@Denis Gagne52: Ah, interesting; thanks for providing that potential explanation (which I have noted). Much appreciated. --IFried (WMF) (talk) 21:57, 7 July 2020 (UTC)
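The workaround Denis Gagne52 describes above (converting an 8-bit grayscale cover to 24-bit true color) amounts to repeating each gray sample three times, so that the red, green and blue channels all carry data. The following is a minimal sketch on raw, uncompressed pixel bytes. The function name `gray_to_rgb` is hypothetical, this is not WSExport or Calibre code, and a real cover is a JPEG that would first need decoding with an imaging library.

```python
def gray_to_rgb(gray: bytes) -> bytes:
    """Expand 8-bit grayscale samples to 24-bit RGB by repeating each sample."""
    rgb = bytearray()
    for sample in gray:
        # R, G and B all receive the same value, so the pixel stays neutral gray.
        rgb.extend((sample, sample, sample))
    return bytes(rgb)


# A 2x1 image: one black pixel and one mid-gray pixel.
pixels = bytes([0, 128])
print(gray_to_rgb(pixels).hex())  # 000000808080
```

If only one third of the expected samples is present, as would happen when one-channel data is declared as DeviceRGB, a renderer may have nothing to put in the green and blue channels, which would be consistent with the red tint reported above.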
@IFried: Sorry, for me, for books which have problems, the cover page never displays (for information, I use Acrobat Reader to open pdf files) --Lorlam (talk)
@IFried: I received information about this problem, which is described on GitHub: https://github.com/wsexport/tool/commit/2fcf826411ff97b60fa4bf78e91092d046800302 cu --Lorlam (talk) 15:58, 26 June 2020 (UTC)
@Lorlam: Thanks for getting back to us and providing this additional information. We'll check it out. One thing: It seems like the Github link that you provided didn't work (404 error when I tried to access it). Can you try to share it again? Thanks! --IFried (WMF) (talk) 21:54, 7 July 2020 (UTC)
@IFried: Yes! It is https://github.com/wsexport/tool/commit/2fcf826411ff97b60fa4bf78e91092d046800302 cu --Lorlam (talk) 00:10, 8 July 2020 (UTC)
@Lorlam: Thank you; much appreciated! --IFried (WMF) (talk) 22:06, 10 July 2020 (UTC)
This seems a side effect of changes made earlier in 2020, as this was not happening before. --Viticulum (talk) 19:33, 28 May 2020 (UTC)
Yes, this problem appeared in February 2020. --Lorlam (talk) 20:43, 28 May 2020 (UTC)

@MartinPoulter, Lorlam, and Viticulum: Thank you for this information! We completely agree that the user experience is not optimal, and we hope to improve it (both for experienced editors/readers & newcomers). Also, thanks for the information regarding the PDF export issue, which seems to have appeared around February 2020. I'll share this information with the team & see if we can investigate. --IFried (WMF) (talk) 20:24, 9 June 2020 (UTC)

Which formatting and style issues are the most common and frustrating, in your opinion?

  • We often add images using the crop image tool. In the web view this works fine, but when we download the book, the whole page image is included in the book instead of the cropped image.
  • Tables are often not rendered properly in the downloaded book.
  • The sfrac template is not rendered properly in the downloaded book.

--Balajijagadesh (talk) 18:27, 27 May 2020 (UTC)

@Balajijagadesh: Thank you for this information! One question: When we conducted some basic tests, the fractions in ebook exports looked okay. Maybe you can provide some examples of the sfrac template issue, which we can use for analysis? Thanks! --IFried (WMF) (talk) 23:07, 24 June 2020 (UTC)
@IFried (WMF): Hi. Thanks for reaching out. The sfrac template is rendered properly in the pdf and epub formats, but not in the mobi format: the horizontal bar disappears, and this introduces an alignment problem. Let me know if you can reproduce the problem. Regards -- Balajijagadesh (talk) 13:50, 2 July 2020 (UTC)
@Balajijagadesh: Thank you for sharing this information! We have tested this issue on mobi, and we have been able to reproduce the issue. I have written a ticket for this. Appreciate it! --IFried (WMF) (talk) 22:22, 10 July 2020 (UTC)
  • When converting Gujarati ebooks into mobi format using WSExport, the text is printed only on the right half of the page. --Sushant savla (talk) 11:10, 28 May 2020 (UTC)
@Sushant savla: Thank you for this feedback! We think we captured this issue in our example #5 on the project page. Is this correct/is this the same issue that you are describing? Thanks! --IFried (WMF) (talk) 23:09, 24 June 2020 (UTC)
  • I have found that the main problem with "Download as PDF" is fonts. When special fonts are used, especially those that support diacriticals, the output is not always rendered in the same font. Rather, a standard font is sometimes used, one which does not support the diacriticals. There are also sometimes unexpected changes to font size that can ruin the formatting. Dovi (talk) 12:12, 28 May 2020 (UTC)
@Dovi: Thank you for providing this information! We have some follow-up questions (so we can better understand the problem). Our questions: Can you provide an example of where you are seeing this issue? And how are you downloading the PDF? Is it via the side-panel (and, therefore, via ElectronPDF) or via the top panel (and, therefore, via WSExport), or somewhere else? Thanks in advance! --IFried (WMF) (talk) 23:13, 24 June 2020 (UTC)
    • Slightly related: I do not see the "Choose format" option in Hebrew Wikisource. How can it be enabled? Dovi (talk) 12:18, 28 May 2020 (UTC)
@Dovi: Hello! The "Choose format" option should be visible if WSExport is enabled on the wiki. If you want it in the sidebar, you can contact someone with interface admin rights on your wiki to enable it there. --IFried (WMF) (talk) 23:15, 24 June 2020 (UTC)
  • (Perhaps this should be in a new section, feel free to move): The first example from enWS is, in my opinion, not a good example. The markup at enWS was using the <center> tag, and it wasn't using the s:en:Template:page break template, which inserts some CSS-styled div to produce a page break in ereaders (break-after:page; page-break-after:always;). So, I think these issues are not really the fault of the WS-export tool, but rather an issue that should be fixed at enWS. Perhaps WS-export could spot "suspect" markup and make a best-effort attempt to hotfix them during export, but that would mask the underlying issue of poor markup at the source and offload the burden onto the WS-export maintainers. Inductiveload (talk) 10:53, 29 May 2020 (UTC)
@Inductiveload: Thanks for sharing this information; it was very helpful. We can see, like you wrote, that the example is due to incorrect markup (i.e., template:pagebreak should have been used instead of <center> tag). In this case, the issue seems to be community outreach and education rather than a technical issue. However, we still want to document that this is happening, so that we can inform our communities how to mitigate these issues when they export books using WSExport. We'll also look into adding more details on the project page about this. Thanks again! --IFried (WMF) (talk) 23:18, 24 June 2020 (UTC)
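Inductiveload's idea that an export tool could spot "suspect" markup can be sketched with Python's standard `html.parser` module. This is an illustration only, not how WSExport works: the `SUSPECT_TAGS` set and the function name `find_suspect_markup` are assumptions made for the example, and a wiki community would need to decide what actually counts as suspect.

```python
from html.parser import HTMLParser

# Assumption: presentational tags treated as "suspect" for ebook export.
SUSPECT_TAGS = {"center", "font", "big"}


class SuspectMarkupFinder(HTMLParser):
    """Collects (tag, line number) for every suspect start tag."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag in SUSPECT_TAGS:
            line, _column = self.getpos()
            self.found.append((tag, line))


def find_suspect_markup(html: str):
    finder = SuspectMarkupFinder()
    finder.feed(html)
    finder.close()
    return finder.found


sample = '<center>Chapter I</center>\n<div style="break-after:page"></div>'
print(find_suspect_markup(sample))  # [('center', 1)]
```

A report like this only flags the markup for editors to fix at the source, which avoids the concern raised above about masking poor markup during export.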
  • I tried to export several books with the WSExport tool, and the biggest issue was metadata. On cs.wikisource, all content pages have an infobox with information about the author, source, licence, etc., and the same table appeared at the beginning of every chapter in the exported book. There should be an option to hide this information on export and show it only once in the text. JAn Dudík (talk) 11:50, 29 May 2020 (UTC)
@JAn Dudík: Thanks for the feedback! While we see that someone provided a solution to the metadata issue with ebook exports, we also understand that there are other issues, and we hope to improve the ebook export experience overall. Furthermore, we see that there’s an issue with encoding in external hyperlinks, which we've noted. Thanks! --IFried (WMF) (talk) 23:32, 24 June 2020 (UTC)
  • @JAn Dudík: The support for WSExport on cs.wikisource is very poor. If the cs.wikisource community wants good exported e-books, it would unfortunately require a lot of changes there. Hiding the metadata table is one of the simple changes. --EBookian (talk) 20:35, 29 May 2020 (UTC)
  • @EBookian: Is there documentation somewhere on what to do for better support? JAn Dudík (talk) 20:49, 29 May 2020 (UTC)
  • @JAn Dudík: WSExport is quite a simple tool that takes some pages and translates them into an e-book; there is not much to document, although it surely is lacking in some areas. You added a microformat there, which is a good thing. On the other hand, cs.wikisource relies heavily on those metadata tables at the moment, and if you exclude them from export now you will see no division between chapters. You need to unify the style of pages, create e-book CSS, ... I am getting out of the scope of this page; if you wish, we can continue this talk somewhere else. --EBookian (talk) 21:20, 29 May 2020 (UTC)
@JAn Dudík: Thanks for bringing up this question about documentation! We also see that improved documentation of best practices can help people encounter less confusion and errors. We’re currently looking into how to do this, and we’ll update the project page when we have information. --IFried (WMF) (talk) 22:06, 7 July 2020 (UTC)
  • I have now tried to read an exported book in my mobile app, and I found a problem with the encoding of external hyperlinks: the link is probably in latin-2 instead of UTF-8 (instead of cs:s:Autor:Věnceslav Černý, I got a link to Autor:VÄ›nceslav_ÄŚernĂ˝). JAn Dudík (talk) 20:49, 29 May 2020 (UTC)
@JAn Dudík: Thanks for this information. In order to better understand the problem, we have a few questions: 1) When you say you are using the mobile app, what do you mean, exactly (since there is no Wikisource app?). Are you using the mobile view of a desktop browser, for example? 2) Did you use the download PDF button on this page (we are asking because this link uses ElectronPDF rather than WSExport)? Thanks! --IFried (WMF) (talk) 22:10, 7 July 2020 (UTC)
@IFried (WMF): I used wsexport to generate an epub file from a cs.wikisource book. Then I copied it to my mobile phone and opened it using the Cool Reader app (but you can imagine any other e-book reader). The text of the book and the images were correct, but the external links from the infoboxes had bad encoding. JAn Dudík (talk) 09:22, 8 July 2020 (UTC)
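The garbled link reported above looks like UTF-8 bytes read with a single-byte Central European code page. As a sketch, the snippet below encodes the author's name as UTF-8 and decodes it as Windows-1250 (`cp1250`); that code page is an assumption chosen because it appears to reproduce the quoted string, and it is not a confirmed diagnosis of where in the export pipeline the wrong decoding happens.

```python
original = "Věnceslav_Černý"

# Encode as UTF-8, then (wrongly) decode the bytes as Windows-1250.
garbled = original.encode("utf-8").decode("cp1250")
print(garbled)

# Reversing the wrong step recovers the name, provided no bytes were lost.
repaired = garbled.encode("cp1250").decode("utf-8")
print(repaired == original)  # True
```

Each accented letter becomes two characters because UTF-8 stores it as two bytes, which matches the pattern in the reported link.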
  • When converting text from Wikisource into pdf or rtf, the first line of every paragraph is indented. Even the first line of a poem is indented, although it is enclosed in a poem tag, so the output for poems is bad and all the alignment is spoiled. The poems are not indented in the epub or mobi formats. The issue can be seen here -- Balajijagadesh (talk) 07:06, 3 July 2020 (UTC)
@Balajijagadesh: Thanks for this feedback! We have tested this issue on epub, pdf, and mobi. As you wrote, the pdf version had incorrect indentation. The mobi version had the numbers smashed into the text, which also looked strange. The only version that looked okay was epub. We have written a ticket to track the issue, and we’ll see if we can look into this. In addition, we are beginning to investigate the best practices for proofreading content to Wikisource. Once we share these findings, we hope it can help prevent some formatting and styles issues in the future. Thanks! --IFried (WMF) (talk) 22:03, 10 July 2020 (UTC)
@Balajijagadesh: Thank you for bringing this up! We have covered this issue in example #2 on the project page, and we agree that this is a big problem. We really hope that we can fix it, and we have begun investigating how we may be able to do this. Thanks again and we hope to provide updates on this issue soon. --IFried (WMF) (talk) 22:05, 10 July 2020 (UTC)

Which user experience issues are the most common and frustrating, in your opinion?

  • The download time for a book is often so long that people close the page. The wsexport tool also often doesn't work. -- Balajijagadesh (talk) 18:28, 27 May 2020 (UTC)
  • See mw:Bug management/Triage/201410. In general, things only got worse since 2014, so everything that applied back then is still valid. You need to study the relevant components in Phabricator. Nemo 13:42, 29 May 2020 (UTC)
@Balajijagadesh and Nemo bis: Thank you for the feedback on the most frustrating UX issues! This is helpful and we will take a look. --IFried (WMF) (talk) 22:09, 10 July 2020 (UTC)

Which problems, overall, do you find the most critical to fix, and why?

  • Since the latest version, WSExport has been slower than before. External visitors may not be patient if the system is too slow (they may think it is not working). When the time-out is reached, the message is not user-friendly for external visitors. --Viticulum (talk) 19:31, 28 May 2020 (UTC)
@Viticulum: Thanks for the feedback! One question: What is the latest version you are referring to? Also, thanks for the comment about the need to improve user-friendly messaging (we’ll look into it). --IFried (WMF) (talk) 22:11, 10 July 2020 (UTC)
  • We need multi-year reliability. Multi-page export needs to be provided by a MediaWiki extension again, in all the formats people need: PDF and EPUB at a minimum (but when you support EPUB, it's easy to add ZIM and ODT as well). The development and maintenance of the extension need to be outsourced to a third party, with sufficient funding for at least 5 years, so that users and partners (for instance libraries) can be sure that it will keep existing in the future and not vanish overnight if a couple of people at WMF decide so. Without a reliable export, it's impossible to get national libraries and the various access methods to bring users to Wikisource. Nemo 13:45, 29 May 2020 (UTC)
@Nemo bis: Thanks for the feedback! Just to make sure we understand your comment, can you clarify what you mean by “multi-year reliability?” To your other point, we agree that Wikisource should have more standardized and easily accessible tools and gadgets. For this reason, we will be working to improve this issue, especially through the ‘Migrate Wikisource specific edit tools from gadgets to Wikisource extension’ wish. Finally, to your point regarding maintenance: While the Community Tech team will not be maintaining Wikisource, overall, in a long-term capacity, we are hoping to increase the overall health and usability of Wikisource, so that it is easier to maintain in the future. --IFried (WMF) (talk) 22:13, 10 July 2020 (UTC)

Anything else you would like to add?

  • I would like the developers/technical team to pay attention to ebooks in RTL languages, which are written right-to-left (e.g., Hebrew and Arabic). I hope the export tool will also support such languages. From past experience, such support is not automatic, and special care is needed to ensure it. --Naḥum (talk) 12:14, 28 May 2020 (UTC)
@Nahum: Thanks so much for this feedback! We would love to learn more about the issues and challenges unique to RTL users on Wikisource, especially regarding ebook exports. Can you provide more details? We agree that this should be looked into as well, so we look forward to your response. --IFried (WMF) (talk) 22:15, 10 July 2020 (UTC)

Modernisation does not export

Hi! One issue with the export is that the modernisation system that we use, at least on fr.wikisource, does not work in exported formats because it is implemented in JS. This makes for very unpleasant reading of old texts that were transcribed in the original spelling and then modernised with the modernisation system. It is very convenient to use on Wikisource itself, but very disappointing in exports. --M0tty (talk) 12:00, 28 May 2020 (UTC)

See this example [1] for the modernisation of old French: in the middle left there is a choice between "Orthographe originale" and "Orthographe moderne". This is done for each chapter. It is not possible to extract a chapter or the whole book in modernised French, because this functionality is not incorporated in WSExport. I believe this would be a whole project in itself. Tpt could give more insight. --Viticulum (talk) 19:46, 28 May 2020 (UTC)
Ideally, as it seems possible to include some JavaScript in an ePub, it would be great if the ePub file could contain both versions and switch from one to the other using exactly the same JavaScript code as on French Wikisource. However, it's possible that this would require loading not only the "local" replacements given as a parameter of the modernisation template, but also the entire Wikisource modernisation dictionary, or at least the subset of words found in the exported text. --George2etexte (talk) 14:13, 2 June 2020 (UTC)
@M0tty, Viticulum, and George2etexte: Thanks for sharing this information. From my understanding, you are writing about the fact that Wikisource readers online can choose which orthography to select, but this is not available for ebook exports. Is this correct? And, if so, can you provide a bit more explanation and context around it (for example, do you know if there is already a Phabricator ticket that documents this problem)? The fix for this may be a large project that is out of scope for the current project. However, it’s good for us to still know about this issue, and we would like to document it in Phabricator. We look forward to your response. Thanks! --IFried (WMF) (talk) 22:17, 10 July 2020 (UTC)
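George2etexte's suggestion of shipping only the subset of the modernisation dictionary that is needed for the exported text can be sketched as follows. The dictionary entries and the function name `dictionary_subset` are hypothetical and chosen for illustration; the real French Wikisource dictionary and its matching rules may differ, for example in how they treat case or inflected forms.

```python
import re


def dictionary_subset(text: str, dictionary: dict) -> dict:
    """Keep only the dictionary entries whose old spelling occurs in the text."""
    words = set(re.findall(r"\w+", text.lower()))
    return {old: new for old, new in dictionary.items() if old.lower() in words}


# Hypothetical entries: old spelling -> modern spelling.
modernisation = {"estoit": "était", "roy": "roi", "sçavoir": "savoir"}

chapter = "Le roy estoit content."
print(dictionary_subset(chapter, modernisation))
# {'estoit': 'était', 'roy': 'roi'}
```

Embedding only this subset would keep the exported file small, at the cost of recomputing it for every export.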

Math export

  • Currently, on the various wikis, it is possible to activate MathML to get a nice rendering of mathematical formulas instead of vector images. The current export process does not support this MathML format and includes all mathematical formulas as images, as in the old MediaWiki way. As MathML is now widely supported, it would be really useful to be able to export formulas as MathML. --Alan Talk 13:16, 28 May 2020 (UTC)
@Nalou: Thanks so much for this information! As a first question, can you let us know a bit more about how you activate and use MathML (with an example, preferably)? If we understand correctly, you are writing about the inability to use math markup in Wikisource. For this reason, users need to employ tactics that aren’t ideal, such as capturing an image of a formula with the crop tool. Is that correct? Thanks! --IFried (WMF) (talk) 22:19, 10 July 2020 (UTC)

Wrong date order for exports in French

  • For book exports from French Wikisource, the date order is inverted (month/day/year), as is the rule in English. For example, today: "Exporté de Wikisource le 05/28/20". It is strange to have the comment "Exporté de…" in French with a date in the English order; in French the date order is day/month/year, so today is 28/05/20 (and not 05/28/20). Thanks, Laurent --Lorlam (talk) 17:48, 28 May 2020 (UTC)

=> Ok now, the problem has been fixed. --Lorlam (talk) 00:39, 25 June 2020 (UTC)
@Lorlam: Thanks for reporting this issue! As you wrote, the issue appears to be fixed in some cases. However, we still see this issue arising in other cases, such as in Tamil exports, so we’ll look into this. One possible solution may be to display the name of the month rather than the number. Thanks! --IFried (WMF) (talk) 22:24, 10 July 2020 (UTC)
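The possible solution mentioned above, displaying the name of the month rather than its number, removes the day/month ambiguity whatever the order. A minimal sketch follows; the `MONTH_NAMES` table and the function `format_export_date` are hypothetical, and a real tool would more likely take month names and date patterns from its localisation data than hard-code them.

```python
from datetime import date

# Hypothetical lookup table, limited to two languages for illustration.
MONTH_NAMES = {
    "fr": ["janvier", "février", "mars", "avril", "mai", "juin", "juillet",
           "août", "septembre", "octobre", "novembre", "décembre"],
    "en": ["January", "February", "March", "April", "May", "June", "July",
           "August", "September", "October", "November", "December"],
}


def format_export_date(d: date, lang: str) -> str:
    """Format a date with a spelled-out month so the order cannot be misread."""
    month = MONTH_NAMES[lang][d.month - 1]
    if lang == "en":
        return f"{month} {d.day}, {d.year}"
    # Day-first order, as used in French.
    return f"{d.day} {month} {d.year}"


print(format_export_date(date(2020, 5, 28), "fr"))  # 28 mai 2020
print(format_export_date(date(2020, 5, 28), "en"))  # May 28, 2020
```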

Bad export in "pdf" for French civility titles

  • The "pdf" export tool does not correctly export the French civility titles that we write with templates in braces on French Wikisource ("M." for Monsieur / "MM." for Messieurs / "Mlle" for Mademoiselle / "Mme" for Madame / "Mmes" for Mesdames / etc.). The "pdf" export shows these characters underlined with a dotted line, which does not look good and is not very readable (for an example, see the cast list of the play https://fr.wikisource.org/wiki/Un_gros_mot). Thanks, Laurent --Lorlam (talk) 18:16, 28 May 2020 (UTC)

  • To add a clue for this problem, someone on French Wikisource said it is a problem with the {{abréviation}} template (see https://fr.wikisource.org/wiki/Mod%C3%A8le:Abr%C3%A9viation) and all the other templates that use it. All these "civility title" templates are listed here: https://fr.wikisource.org/wiki/Cat%C3%A9gorie:Mod%C3%A8les_de_titre_de_civilit%C3%A9 ... thx --Lorlam (talk) 21:00, 28 May 2020 (UTC)
=> This problem has been fixed by modifying the template on French Wikisource, so it is okay now ;-) --Lorlam (talk) 21:10, 31 May 2020 (UTC)
@Lorlam: Thanks for reporting this! It appears that this issue has been fixed, as you have written. However, if this issue arises again, please do let us know. Thank you! --IFried (WMF) (talk) 22:25, 10 July 2020 (UTC)

e-book navigation

The way Table of Contents is translated into e-book navigation (I mean e-book reader navigation, not ToC that would be printed) is very limited. It would be beneficial if there was a way to allow editors to change the structure of e-book navigation to align better with the book structure (probably by some ToC tags). --EBookian (talk) 20:58, 29 May 2020 (UTC)

@EBookian: Thanks for the information! Can you provide more details on this problem (perhaps a specific example of where you are seeing this problem)? This will help us understand the problem better. Much appreciated! --IFried (WMF) (talk) 22:26, 10 July 2020 (UTC)

Long chapters and footnotes

I have observed on several occasions a problem with footnotes in books with no chapters, or with chapters exceeding 80 or 100 pages. The wsexport epub tool arbitrarily splits the chapter and breaks the links to the footnotes, forcing you to split the chapter artificially to get around the problem. See the example in Histoire de l'affaire Dreyfus T.2. --Cunegonde1 (talk) 03:36, 30 May 2020 (UTC)

@Cunegonde1: Thank you for this information! We have conducted some basic tests on EPUB and PDF to try to reproduce the splitting and link problems. However, we were unable to reproduce the issues. The footnotes appeared to properly display at the end of the chapter with linking functionality. Perhaps you can share a screenshot and more details that demonstrate the issue? This will help us understand the problem better and see if it is something we can fix. Thanks! --IFried (WMF) (talk) 22:28, 10 July 2020 (UTC)
@IFried: You can see the issue in this book: Sade - histoire_de_Juliette. If you create the epub and open it in an editor, you can see that the first footnote call is located in the chapter file c1_L_histoire_de_Juliette_premiere_partie.xhtml (page 62), while the text of the footnote is located in a chapter file called c1_L_histoire_de_Juliette_premiere_partie_2.xhtml. The link between the footnote call and the footnote text, <a xmlns:epub="http://www.idpf.org/2007/ops" href="#cite_note-1" epub:type="noteref">[1]</a>, does not point to the chapter file that contains the footnote text. Excuse my poor English. In short: the link between the footnote call and the note itself does not work, because the call is in one section of the epub and the note is in another, with no link pointing to that section. --Cunegonde1 (talk) 06:53, 11 July 2020 (UTC)
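The broken link described above is a fragment-only reference (`href="#cite_note-1"`) whose target ended up in a different file after the chapter was split. One possible repair, sketched below, is to prefix such links with the name of the file that actually holds the target. The function `fix_note_links` and the `id_to_file` mapping are hypothetical and are not taken from the WSExport code base; a real fix would need to build that mapping while splitting the chapter.

```python
import re


def fix_note_links(chapter_html: str, current_file: str, id_to_file: dict) -> str:
    """Prefix fragment-only links with the file that contains their target."""

    def repl(match):
        target_id = match.group(1)
        target_file = id_to_file.get(target_id, current_file)
        if target_file == current_file:
            return match.group(0)  # target is in this file; leave the link as is
        return f'href="{target_file}#{target_id}"'

    return re.sub(r'href="#([^"]+)"', repl, chapter_html)


call = '<a href="#cite_note-1" epub:type="noteref">[1]</a>'
fixed = fix_note_links(
    call,
    "c1_L_histoire_de_Juliette_premiere_partie.xhtml",
    {"cite_note-1": "c1_L_histoire_de_Juliette_premiere_partie_2.xhtml"},
)
print(fixed)
```

A regular expression is enough for this illustration, but a production fix would more safely use an XML parser, since the chapter files are XHTML.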

Prevent page breaks after headings

I would also like to draw your attention to some inappropriate page breaks, typically between a section heading and the text of the section, when the section does not begin on a new page. For example, in the epub exported from this work by Gauss, whose chapters are themselves divided into short articles designated by numbers, as you can see in this example, the article number and the beginning of the article are frequently separated by a page break. On French Wikisource, some templates, namely {{t2}} to {{t6}}, are specifically designed to set the style and hierarchy of headings (based on the HTML tags h2 to h6). Could some parameters of these templates or of the export tool be modified to prevent a page break between such a heading and the start of the section, whatever the number of carriage returns following it in the code? ElioPrrl (talk) 15:27, 30 May 2020 (UTC)

Did you consider using these tags?
  • <div style="page-break-inside: avoid;"> <!-- Start of block: avoid a page break -->
  • Your text block…
  • </div> <!-- End of block: avoid a page break -->
This could be encapsulated in a template. --Denis Gagne52 (talk) 16:27, 27 June 2020 (UTC)
@ElioPrrl: Thanks for reporting this issue! We have tested it, and we were able to reproduce it. We also see that including the tags described above by Denis Gagne52 could fix this issue. Can you let us know whether the issue is indeed fixed by the tags? Thanks! --IFried (WMF) (talk) 22:31, 10 July 2020 (UTC)

Initials

Initials built with the lettrine template on French Wikisource are not properly displayed in the exported ePub files (see e.g. this play). It's a bit better in the PDF exports (the font size is a bit larger than the text, although it does not seem to adapt to the number of lines given in the « lignes= » parameter of the template). --George2etexte (talk) 14:13, 2 June 2020 (UTC)

@George2etexte: Thank you for letting us know about this! We have conducted some basic tests. In our tests, we found that the lettrine was represented better in EPUB than in PDF, but we understand that there may be different experiences on different devices. We may not have capacity to fix this issue, but we have noted it. Is there a Phabricator ticket for documentation purposes? If not, would you like to create one and tag us? Thanks! --IFried (WMF) (talk) 22:32, 10 July 2020 (UTC)

Cropped image handling

Telugu wikipedia extensively uses cropped scans to represent images or figures in text, as in the example page. The current wsexport handles this well, and we would like this functionality to be preserved in future. --49.206.8.248 04:50, 3 July 2020 (UTC)

@49.206.8.248: Thank you! We are happy to hear that WSExport handles cropped images well on Telugu Wikisource (we assume you meant Wikisource?). However, we also know that there are cropped image issues experienced by other users, so we’ll see if there is something that we can do to improve this issue. Thanks again for commenting. --IFried (WMF) (talk) 22:07, 10 July 2020 (UTC)