Talk:Abstract Wikipedia/Early function examples

Latest comment: 3 years ago by DVrandecic (WMF) in topic Missing function?

New function idea

—The preceding unsigned comment was added by GZWDer (talk) 12:31, 30 July 2020‎

  • readZObject: read a persistent ZObject with a specific ZID and return the quoted (see above) whole object (not only Z2K2/value)
  • createfunction: create an anonymous function from a quoted (possibly composite) function call

--GZWDer (talk) 15:25, 30 July 2020 (UTC)

  • MultiReplace: return a ZObject in which every embedded object of a specific type is replaced by the result of executing a specific function on it (e.g. [1, [2, [3, 4]]] -> result of f([1, f([2, f([3, 4])])])). --GZWDer (talk) 00:00, 22 August 2020 (UTC)
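A minimal Python sketch of the MultiReplace idea, using nested Python lists as a stand-in for ZObjects. The name `multi_replace` and its parameters are hypothetical, and the sketch only descends into lists; a real ZObject traversal would presumably look different.

```python
def multi_replace(obj, target_type, f):
    """Replace every embedded object of target_type by f(object), innermost first.

    Assumption: nested Python lists stand in for ZObjects; only lists are traversed.
    """
    if isinstance(obj, list):
        # Rewrite the children first so that f sees already-replaced values.
        obj = [multi_replace(item, target_type, f) for item in obj]
    if isinstance(obj, target_type):
        return f(obj)
    return obj


# [1, [2, [3, 4]]] with f = sum evaluates as sum([1, sum([2, sum([3, 4])])])
print(multi_replace([1, [2, [3, 4]]], list, sum))    # → 10
print(multi_replace([1, [2, [3, 4]]], list, tuple))  # → (1, (2, (3, 4)))
```

The innermost list is handled first, which matches the f([1, f([2, f([3, 4])])]) order given in the example.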
@DVrandecic (WMF) and Quiddity (WMF): mutation in Breton.
So in Breton (and other Celtic languages) words can "mutate": the first letter(s) change depending on the word and the context. For instance "meter"@en is "metr"@br, but "2 meters"@en is "2 vetr"@br (the "m" is softened into a "v", see en:Breton mutations).
I guess the function would be something like "1st string, 2nd string, gender ➝ string".
Mutations have some exceptions and sometimes strange behaviour, but my point is more to make sure I understand how functions work and to cover the more common cases.
Cheers, VIGNERON * discut. 22:57, 28 October 2020 (UTC)
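As a rough illustration of the "1st string, 2nd string, gender ➝ string" signature, here is a Python sketch covering only the soft mutation. The function names and the trigger set are hypothetical; the trigger set contains just the "2" from the example above. Real Breton has several mutation types, gender-dependent triggers and many exceptions, none of which are modelled here.

```python
# Soft mutation of initial consonants. Longer prefixes come first so that
# "gw" is tried before "g". Simplified: capitalisation and exceptions are ignored.
SOFT_MUTATION = [
    ("gw", "w"), ("k", "g"), ("t", "d"), ("p", "b"),
    ("g", "c'h"), ("d", "z"), ("b", "v"), ("m", "v"),
]

# Hypothetical trigger list: only the "2" from the "2 vetr" example.
SOFT_TRIGGERS = {"2"}


def soft_mutate(word):
    """Apply the soft mutation to the first letter(s) of a lowercase word."""
    for old, new in SOFT_MUTATION:
        if word.startswith(old):
            return new + word[len(old):]
    return word


def mutate(first, second, gender=None):
    """1st string, 2nd string, gender -> string (gender is unused in this sketch)."""
    if first in SOFT_TRIGGERS:
        second = soft_mutate(second)
    return f"{first} {second}"


print(mutate("2", "metr"))  # → 2 vetr
```

Keeping the mutation table separate from the trigger list would let each be maintained independently, which seems relevant to the question of how atomic such functions should be.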
I agree, but mutations are even more complex, and should also include the case of contractions and elisions:
  • e.g. in French "le", "de", "ne", "je", "me", "te", "se" are contracted by changing the final mute "e" into an apostrophe and gluing it to the next word when that word starts with a vowel or a mute "h". Similar patterns also occur in other Romance languages (notably Italian). Such elision can occur before any grammatical class of term: nouns (e.g. "l'été"), verbs in any conjugation ("s'élève"), articles or numerals ("d'un"), adjectives ("l'autre partie"), pronouns ("s'en", "s'y"), prepositions ("d'à"), adverbs ("d'uniquement")...
    In informal speech, "tu" is frequently elided: "tu as"->"t'as", "tu es"->"t'es". Less commonly (but still informally) "il" or "elle" is elided to just "L'": "Il est"-> "L'est"; and "nous", "vous", "ils", "elles" are elided to "Z'" as well. Also "Il y a" or "Il n'y a" are contracted to "Y'a". These complex contractions are found in literature as well, and are very frequent in some common expressions: "Il n'y a qu'à..."->"Y'a qu'à...", "Tu n'as pas..."-> "T'as pas...". Speech may also drop the impersonal leading subject "Il": "Il faut..."-> "Faut...", "Il ne faut pas..."->"Faut pas...".
  • the case of elision before a mute "h" is irregular: it depends on each word and can't be automated without a dictionary lookup, i.e. you have to look at the phonetics of a word starting with "h" to see whether it is mute or aspirated (the "aspirated h" is often noted by a leading asterisk in phonetic notations, because an "aspirated h" is also not pronounced at all in modern French, but is still handled as if it were a glottal stop). This process is called "elision"; the apostrophe in French is always the mark of an elision, and elision is almost always required grammatically (except in some limited cases).
  • more complex mutations occur where the mutated word may change in the middle, e.g. change one of its vowels without changing the leading letters, or can be replaced radically.
  • in French (also in Spanish, Italian...), another contraction (also required grammatically) merges the article with the preceding preposition: "de le"->"du", "de les"->"des", "à le"-> "au", "à les"->"aux", but this contraction does not take precedence over elision: "de le hôte" first honors the elision, giving "de l'hôte", and not the contraction "du hôte". Compare with "de le hérisson", where the "h" is aspirated and forbids the elision "l'hérisson", so we get "du hérisson". In French these cases are limited to the specific (but very frequent) short prepositions "de" and "à". It is very easy to automate with a simple rule, except that, as you can see, the rule has to look at 3 words to check whether the 2nd word requires elision with the 3rd one before deciding whether the first 2 words can legally be contracted!
  • More complex cases involve the addition of purely phonetic particles between two words (such as "-t-") to create a pseudo-"liaison" whose role is to avoid juxtaposing a vowel phoneme ending the 1st word with a vowel phoneme starting the next word: these were initially "pataquès" (incorrect liaisons in Middle French) that were later lexicalized (and even became mandatory in Modern French). That's something to know notably when transforming a sentence to the interrogative form, which swaps the subject and verb but also requires marking this inversion by a hyphen instead of a regular space: "Tu veux..."-> "Veux-tu...?". But see: "On a..."->"A-t-on...?", "Elle a..."->"A-t-elle...?" (with the inserted phonetic particle "-t-"). This case also occurs with some imperative verbs followed by some common vocalic pronouns: "Va-t'en!" (where the inserted "t'" is the reflexive pronoun, with a hyphen because it is inverted after the verb in pronominal form), but "Va-z'y !" (with the inserted particle "z'", with a hyphen because it is inverted after the non-pronominal verb; it is purely phonetic here, and "y" is a pronoun for a locative indirect object, normally written before the conjugated verb or its auxiliary, so the hyphen is also needed when it is inverted by the imperative form)...
verdy_p (talk) 08:20, 29 October 2020 (UTC)
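The three-word rule for "de le hôte" vs. "de le hérisson" can be sketched in Python as follows. All names are hypothetical, the aspirated-h list is a tiny illustrative stand-in for the dictionary lookup mentioned above, and the vowel test is deliberately naive (for instance it ignores words starting with "y").

```python
VOWELS = set("aeiouàâäéèêëîïôöùûü")

# Tiny illustrative lexicon; a real one needs a dictionary lookup.
ASPIRATED_H = {"hérisson", "haricot", "héros"}

CONTRACTIONS = {
    ("de", "le"): "du", ("de", "les"): "des",
    ("à", "le"): "au", ("à", "les"): "aux",
}


def allows_elision(word):
    """True if a preceding "le"/"la" must be elided before this word."""
    word = word.lower()
    if word.startswith("h"):
        return word not in ASPIRATED_H  # mute h behaves like a vowel
    return word[0] in VOWELS


def join_prep_article_noun(prep, article, noun):
    """Combine preposition + article + noun; elision is checked before contraction."""
    if article in ("le", "la") and allows_elision(noun):
        return f"{prep} l'{noun}"
    contracted = CONTRACTIONS.get((prep, article))
    if contracted:
        return f"{contracted} {noun}"
    return f"{prep} {article} {noun}"


print(join_prep_article_noun("de", "le", "hôte"))      # → de l'hôte
print(join_prep_article_noun("de", "le", "hérisson"))  # → du hérisson
```

The third word is inspected before the first two are merged, which is exactly the ordering constraint described in the list above.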
@Verdy p: elision is very different from mutation, but it is still a very good point; it should probably also be added as an example. My question is not so much about the details (we could also talk about the very few and weird cases where "de le" stays "de le" in French ;) ) but more about the general architecture: how atomic or general should functions be, how should they interact with each other (which raises questions about dependency and order, for instance), how are conditions or limits of application defined, etc. Cheers, VIGNERON * discut. 09:58, 29 October 2020 (UTC)
Elisions are a subtype of contractions, which themselves are a subtype of mutations: they are all contextual, and almost always originate from the spoken language and the adaptation of phonetics to what is perceived as being more "important" for correct understanding between people. Mutations can also include the omission of entire words, or their contextual replacement by pronouns (sometimes qualified or derived to disambiguate them, like "this" vs. "that", "this" vs. "those", "celle-ci" vs. "celle-là"). Languages offer various ways to keep the dependencies needed for minimal understanding, but they also tend to abbreviate many things to something more concise but still "clear enough" (according to the context).
The key point is that to process natural language, we need a way to represent the current semantic context of use. There are also stylistic aspects, where repeating a term without abbreviating it becomes possible according to some distance or absence of ambiguity in the whole context, or depending on text segmentation (i.e. sentences or whole paragraphs, or a whole section or whole article which implicitly describes a given topic).
Now there's also a difference between formal speech and informal speech (where many more things may be abbreviated), and between speech and written text (which may also exhibit artificial features without real meaning, such as particles). Phonology also plays a role, notably when a language is spoken in contact with users of other languages. And there are also stylistic conventions on orthography (sometimes not really justified, but coming from historic technical limitations that may no longer be relevant, or that could be relevant but are no longer the case, such as the differentiations of tones or vowel lengths or vowel quality that various languages have lost, notably modern French, which is now very permissive on all these aspects, including the placement of terms, i.e. the syntax, with lots of possible constructions that people will choose only because they "sound" better phonologically or because this allows them more expressiveness of their intent as an additional means of emphasis: contractions or expansions/repetitions are frequently a consequence of the speaker's intent, beside the merely exposed "facts"). verdy_p (talk) 10:20, 29 October 2020 (UTC)

Other examples

from a chat w Ward

  • CamelCase to Snake_Case to Space Case
  • Removing (anything in parens) when matching names
  • Normalize whitespace
  • Expand abbreviations (from this dictionary)
  • Check for a match including an alias (from this dictionary)
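For concreteness, the transforms listed above could look roughly like this in Python. The function names, and the dictionaries passed in as arguments, are made up for illustration and are not taken from any existing function catalogue.

```python
import re


def camel_to_snake(name):
    """CamelCase -> camel_case (underscore before each inner capital)."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def snake_to_space(name):
    """camel_case -> camel case"""
    return name.replace("_", " ")


def strip_parens(text):
    """Remove "(anything in parens)" together with the whitespace before it."""
    return re.sub(r"\s*\([^)]*\)", "", text)


def normalize_whitespace(text):
    """Collapse runs of whitespace to single spaces and trim the ends."""
    return " ".join(text.split())


def expand_abbreviations(text, abbreviations):
    """Replace whole words found in the given abbreviation dictionary."""
    return " ".join(abbreviations.get(word, word) for word in text.split())


def matches_with_alias(a, b, aliases):
    """Compare two names after mapping each through the alias dictionary."""
    return aliases.get(a, a) == aliases.get(b, b)


print(snake_to_space(camel_to_snake("CamelCase")))       # → camel case
print(strip_parens("Acme Corp (UK)"))                    # → Acme Corp
print(expand_abbreviations("Main St", {"St": "Street"}))  # → Main Street
```

Each transform is a small pure function, so chains such as CamelCase to snake_case to space case are plain function composition.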

Compare transforms in Data Drive Modeling section of this fedwiki:

SJ talk  18:19, 29 October 2020 (UTC)

Interesting, it also reminds me of all the cleaning functions of OpenRefine. Cheers, VIGNERON * discut. 18:59, 29 October 2020 (UTC)

Missing function?

I don't see a function that would be extremely useful to have: alphabetical ordering of a list. Is it possible to have it? What issues need to be considered with it?

Just to be clear: my current use case would be a short list of countries (let's say "Belgium", "Portugal" and "Spain"), whose alphabetical order would change between languages (in fact, in French, Spanish and Portuguese that order would be "Belgium", "Spain" and "Portugal"). Let's say that I want to retrieve this list from Wikidata and put it in alphabetical order in my language. Is it possible to do this with AW? --Sannita - not just another it.wiki sysop 15:56, 11 November 2020 (UTC)

@Sannita: Yes, this is definitely the kind of thing that the wiki of functions can and will be used for. Note that there will eventually be many thousands of functions, and this page only has a tiny handful of examples. But sorting alphabetically, using different collations, names in different languages, etc., is something I very much expect to be a function. The only issue might be the length of the list to be sorted, and how efficiently it can be transported, but for short lists such as the one you mentioned this should be no issue at all. --DVrandecic (WMF) (talk) 17:43, 17 November 2020 (UTC)
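A small Python sketch of the Belgium/Portugal/Spain use case. The label table is hard-coded here as a stand-in for labels retrieved from Wikidata, the item keys are placeholders rather than real identifiers, and the sort key is a crude accent-stripping approximation, not a proper per-language collation.

```python
import unicodedata

# Hard-coded stand-in for labels fetched from Wikidata (keys are placeholders).
LABELS = {
    "belgium": {"en": "Belgium", "fr": "Belgique", "es": "Bélgica", "pt": "Bélgica"},
    "portugal": {"en": "Portugal", "fr": "Portugal", "es": "Portugal", "pt": "Portugal"},
    "spain": {"en": "Spain", "fr": "Espagne", "es": "España", "pt": "Espanha"},
}


def collation_key(text):
    """Crude key: strip diacritics and ignore case (not a real collation)."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def sort_by_label(item_ids, lang):
    """Sort items by their label in the given language."""
    return sorted(item_ids, key=lambda item: collation_key(LABELS[item][lang]))


items = ["belgium", "portugal", "spain"]
print(sort_by_label(items, "en"))  # → ['belgium', 'portugal', 'spain']
print(sort_by_label(items, "fr"))  # → ['belgium', 'spain', 'portugal']
```

Accent stripping is only good enough for this short list; for example, Spanish sorts "ñ" as a separate letter after "n", which a language-specific collation would need to handle.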
