Community Wishlist Survey 2019/Wiktionary/Wikidata module for translations
Wikidata module for translations
- Problem: Currently each Wiktionary maintains its own list of translations for each sense (Ex.1, Ex.2, Ex.3). These translations are not connected between language versions, so the effort is repeated in each language edition.
- Who would benefit: Wiktionary editors.
- Proposed solution: 1) Create a tool to import existing translation boxes from Wiktionaries into Wikidata. 2) Create a module that can display all the translations in each Wiktionary that chooses to use it. 3) Allow editors to add more translations to Wikidata from each Wiktionary, so that contributions are not duplicated.
- More comments:
- Phabricator tickets:
- Proposer: Micru (talk) 14:44, 9 November 2018 (UTC)
Discussion
@Micru: Would this be solved by https://www.wikidata.org/wiki/Wikidata:Lexicographical_data and https://www.wikidata.org/wiki/Wikidata:Wiktionary ? --AKlapper (WMF) (talk) 20:58, 9 November 2018 (UTC)
- @AKlapper (WMF): That is part of the solution (the data would be stored in Wikidata as Lexicographical data). However, as mentioned, we need a way to import the translation lists from any Wiktionary into Wikidata, and then a way to display them on any Wiktionary that chooses to use them. --Micru (talk) 21:15, 9 November 2018 (UTC)
In translation lists, plenty of links are red, and Wikidata does not accept red links, so this could be problematic. Also, the nomenclature of definitions differs from one language to another; there is no strict mapping from one language to another. To be clear, a Spanish word will have x definitions in the English Wiktionary but y definitions in the French Wiktionary, because each definition refers to a culture. So, how do you imagine those can be mapped? Noé (talk) 10:10, 10 November 2018 (UTC)
- @Noé: To import a translation list into Wikidata we need a Q-item representing the sense, and as many lexeme items (L-items) as there are words connected to that item. When there is a redlink, it can simply be a label on the Q-item in that language, without the need to create a lexeme. The nomenclature of definitions differs from one language to another; however, the translation lists tend to be very similar, and that is what matters. Normally the only difference between translation lists is that some language versions are more complete than others. --Micru (talk) 11:41, 10 November 2018 (UTC)
- Importing a translation list from the CC BY-SA Wiktionary into CC0 Wikidata implies treating translation lists as not covered by the licence, and therefore free to reuse without keeping the same licence (SA means share alike). There is no consensus on this position, and I personally disapprove. If I understand the Wikidata lexicographical data model correctly, a Q-item is for concepts, not for meanings. I think the list should be connected to an S-item rather than a Q-item. I am not sure I understand the solution you suggested for redlinks. In my experience, translation lists don't tend to be very similar. The mapping of senses to reality differs from one language to another, so translation lists are not similar. You can choose not to deal with complex cases and focus on simple cases as a first step, but I think it is not very effective to oversimplify the complexity of translation. -- Noé (talk) 10:25, 13 November 2018 (UTC)
- Yes! Let's focus on simple cases as a first step!--Micru (talk) 16:21, 17 November 2018 (UTC)
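As an illustration of the import rule Micru describes above (one item per sense, linked lexemes for entries that exist, plain labels for redlinks), here is a minimal Python sketch. It does not call the Wikidata API; `SenseItem`, `import_translations`, and the `L-EXAMPLE-…` identifiers are hypothetical names invented for this example, and the sketch only covers the "simple cases" mentioned in the thread.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SenseItem:
    """Hypothetical stand-in for the item that represents one sense."""
    description: str
    # language code -> identifiers of lexemes that already exist
    lexemes: Dict[str, List[str]] = field(default_factory=dict)
    # language code -> plain-text labels for redlinked (not yet created) words
    labels: Dict[str, List[str]] = field(default_factory=dict)


def import_translations(
    sense: SenseItem,
    translations: List[Tuple[str, str]],
    known_lexemes: Dict[Tuple[str, str], str],
) -> SenseItem:
    """Attach each (language, word) pair to the sense.

    Words with a known lexeme are linked by identifier; all others
    (the redlinks) are kept as plain labels, so no lexeme is created.
    """
    for lang, word in translations:
        lexeme_id = known_lexemes.get((lang, word))
        if lexeme_id is not None:
            sense.lexemes.setdefault(lang, []).append(lexeme_id)
        else:
            sense.labels.setdefault(lang, []).append(word)
    return sense


# Illustrative data only: the identifiers below are made up.
known = {("fr", "joie"): "L-EXAMPLE-1", ("de", "Freude"): "L-EXAMPLE-2"}
sense = import_translations(
    SenseItem("feeling of happiness"),
    [("fr", "joie"), ("de", "Freude"), ("es", "alegría")],
    known,
)
```

In this sketch the Spanish word has no known lexeme, so it ends up in `sense.labels` rather than `sense.lexemes`. Whether the sense should be a Q-item or an S-item, as Noé asks, is left open here.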
I agree with @Noé: that translation lists from Wiktionaries can't be imported into a CC0 project. That said, any tool that might help coordinate our word relationship lists between different language versions would be very welcome. So a simple solution that might meet both @Micru:'s proposal and Noé's feedback is a Wikibase designed to store these lists while keeping an exact record of license and origin (which Wiktionary version and page). That is, one should import not only the list of translations proposed for joy in the English Wiktionary and joie in the French one, but also the list of translations given for joie in the English version and for joy in the French one. No automatic merge of these lists should be performed, but dedicated tools to compare matching lists and possibly manually transfer items from one list to the other would be warmly welcome. Having a way to query these lists directly from the Wiktionaries would also be a nice plus. Psychoslave (talk) 03:38, 18 November 2018 (UTC)
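To make the idea above more concrete, here is a minimal Python sketch of records that keep license and origin, together with a comparison that reports differences between two lists without merging them. `TranslationRecord` and `compare_lists` are hypothetical names, not part of Wikibase or any existing tool, and the sample entries are illustrative only.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TranslationRecord:
    """One translation, with the provenance needed to respect its license."""
    source_wiktionary: str  # language edition the entry came from, e.g. "en"
    source_page: str        # page title in that edition, e.g. "joy"
    license: str            # license of the source, e.g. "CC BY-SA"
    target_language: str    # language of the translation
    translation: str        # the translated word


def compare_lists(
    list_a: List[TranslationRecord],
    list_b: List[TranslationRecord],
) -> Dict[str, List[Tuple[str, str]]]:
    """Report which (language, word) pairs appear in one list or both.

    Nothing is merged or modified: the result is only a report that a
    human editor could use to transfer items manually.
    """
    keys_a = {(r.target_language, r.translation) for r in list_a}
    keys_b = {(r.target_language, r.translation) for r in list_b}
    return {
        "only_in_a": sorted(keys_a - keys_b),
        "only_in_b": sorted(keys_b - keys_a),
        "in_both": sorted(keys_a & keys_b),
    }


# Illustrative data only.
en_joy = [
    TranslationRecord("en", "joy", "CC BY-SA", "es", "alegría"),
    TranslationRecord("en", "joy", "CC BY-SA", "de", "Freude"),
]
fr_joie = [
    TranslationRecord("fr", "joie", "CC BY-SA", "es", "alegría"),
    TranslationRecord("fr", "joie", "CC BY-SA", "it", "gioia"),
]
report = compare_lists(en_joy, fr_joie)
```

Each record stays attached to its own edition and page, so the two source lists remain separate and attributable; the report only shows an editor where they differ.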
There are cases where someone seems to have manually used a reciprocal translation to write a translation, and it doesn't really work well. Languages don't always map like that. "word1, language A" can be the closest translation of "word1 in language B", but "word2 in language B" might be closer to the meaning of "word1, language A". HLHJ (talk) 06:59, 18 November 2018 (UTC)
Voting
- Support Libcub (talk) 11:52, 17 November 2018 (UTC)
- Support Urhixidur (talk) 13:12, 17 November 2018 (UTC)
- Support Micru (talk) 16:20, 17 November 2018 (UTC)
- Support Giovanni Alfredo Garciliano Diaz (talk) 17:11, 17 November 2018 (UTC)
- Support Such tools are much needed... JogiAsad (talk) 17:57, 17 November 2018 (UTC)
- Support Liuxinyu970226 (talk) 01:01, 18 November 2018 (UTC)
- Support, provided it meets the feedback given by Noé and myself. If the intention is to merge this into Wikidata within a namespace that does not respect the license of the Wiktionaries, I would however be strongly opposed. Psychoslave (talk) 03:38, 18 November 2018 (UTC)
- Oppose as the existing structure reflects the nature of the information (see discussion), and there are already cross-language links. HLHJ (talk) 06:59, 18 November 2018 (UTC)
- Support Sebastian Wallroth (talk) 13:18, 18 November 2018 (UTC)
- Support -Xbony2 (talk) 16:39, 19 November 2018 (UTC)
- Oppose It sounds easy only in theory. In reality, it sounds more than impossible (see discussion). And simple words like "tree" are complex cases. Otourly (talk) 10:16, 20 November 2018 (UTC)
- Oppose Not a good idea. Only humans should work on translations; an automatic system will generate more errors than a person. Lyokoï (talk) 15:02, 20 November 2018 (UTC)
- Oppose Same reasons. Lmaltier (talk) 18:28, 20 November 2018 (UTC)
- Support Thank you for the edit Nhatminh01 (talk) 11:20, 21 November 2018 (UTC)
- Support Novak Watchmen (talk) 15:24, 21 November 2018 (UTC)
- Support looks reasonable :) Gryllida 08:15, 23 November 2018 (UTC)
- Support Sahaquiel9102 (talk) 17:39, 23 November 2018 (UTC)
- Support — AfroThundr (u · t · c) 03:07, 26 November 2018 (UTC)
- Support TheIgel69 (talk) 11:08, 27 November 2018 (UTC)
- Support Merhad77 (talk) 14:02, 27 November 2018 (UTC)
- Support Peter Bowman (talk) 11:41, 28 November 2018 (UTC)
- Oppose I agree with Noé, HLHJ and Lyokoï. A word in a certain language doesn't have the exact same meaning in another language. If you dump everything together into one big pile of translations, you won't be able to distinguish between much and many, between ladder and staircase, between actor and actress, between tall and high, between to look and to watch, etc. I'd also like to note that some Wiktionaries have a huge number of incorrect translations (e.g. mg.wikt); blindly importing from those Wiktionaries will pollute the whole database in Wikidata. -- Curious (talk) 21:13, 28 November 2018 (UTC)