Talk:Abstract Wikipedia

Why was this approved so quickly?

Wikispore, Wikijournal, and Wikigenealogy all have potential. Wikifunctions is just another project that will sit near-empty for years like Wikispecies has, and should be part of Wikidata anyway. 2001:569:BD7D:6E00:F9E8:8F6F:25D1:B825 01:38, 19 June 2021 (UTC)

I kind of agree. If lexemes were just added as a separate namespace to Wikidata, why weren't functions? 1234qwer1234qwer4 (talk) 11:41, 19 June 2021 (UTC)
From my point of view, one difference between Wikifunctions and, for example, lexemes is that Wikifunctions will offer computation resources, so that some calculations can be made on the platform directly and it is not necessary to run a function locally. As far as I understand, it can also be used to centralize templates that are currently defined locally in the different language versions of Wikipedia and the other Wikimedia projects. So there is perhaps a technical reason why Wikifunctions is its own sister project.
Wikidata has a lot of content, so I think it can happen that it is not so easy for a user to find something. I sometimes have problems finding lexemes; I need to change the search settings so that the search also covers the Lexeme namespace. So I hope that it will be easier for external users to find the content if Wikifunctions is its own project. Making sure that it will not sit near-empty for years is, from my point of view, a big challenge. As far as I understand, the project was approved because there is the hope that it can help make knowledge accessible in more languages. I don't know whether this will also work for small languages, but I hope it will, and I think it is important to work on it over the next years so that it can become reality. How to make Wikifunctions accessible for many people is an important question, and I hope there will be more discussions about it in the next weeks. The Wikimania this year is a chance to talk about Wikifunctions also with people who speak small languages.--Hogü-456 (talk) 19:54, 20 June 2021 (UTC)
Yes, as Hogü-456 describes, those are among the reasons why the project is better as a separate wiki, and some of the goals of the project. They require very different types of software back-ends and front-ends than Wikidata has, and a significantly different skill set among some of the primary contributors. I might describe it as: Wikidata is focused on storing and serving certain types of structured data, whereas Wikifunctions is focused on running calculations on structured data. There are more overview details in Abstract Wikipedia/Overview that you might find helpful. Quiddity (WMF) (talk) 22:19, 21 June 2021 (UTC)
@Hogü-456: Re: problems searching for Lexemes on Wikidata - If you prefix your searches there with L: then that will search only the Lexeme namespace. E.g. L:apple. :) Quiddity (WMF) (talk) 22:23, 21 June 2021 (UTC)
This was asked and answered in Talk:Abstract_Wikipedia/Archive_2#How_was_it_approved and Talk:Abstract_Wikipedia/Archive_2#A_few_questions_and_concerns. --QDinar (talk) 13:21, 17 September 2021 (UTC)
See also Talk:Abstract_Wikipedia/Archive_2#Google's_involvement and Talk:Abstract_Wikipedia/Archive_2#Confusion; these are suspicions about Google's involvement, and the answers to them. --QDinar (talk) 21:10, 18 September 2021 (UTC)

You are creating a new language, just like any of the existing natural languages. This is wrong.

https://meta.wikimedia.org/wiki/Abstract_Wikipedia :

In Abstract Wikipedia, people can create and maintain Wikipedia articles in a language-independent way. A particular language Wikipedia can translate this language-independent article into its language.

https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Examples :

  Article(
    content: [
      Instantiation(
        instance: San Francisco (Q62),
        class: Object_with_modifier_and_of(
          object: center,
          modifier: And_modifier(
            conjuncts: [cultural, commercial, financial]
          ),
          of: Northern California (Q1066807)
        )
      ),
      Ranking(
        subject: San Francisco (Q62),
        rank: 4,
        object: city (Q515),
        by: population (Q1613416),
        local_constraint: California (Q99),
        after: [Los Angeles (Q65), San Diego (Q16552), San Jose (Q16553)]
      )
    ]
  )

* English : San Francisco is the cultural, commercial, and financial center of Northern California. It is the fourth-most populous city in California, after Los Angeles, San Diego and San Jose.
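As a purely illustrative sketch (not project code): the Ranking constructor above could be turned into the second English sentence by a naive per-language renderer. The function name, the ordinal table, and passing the adjective "populous" directly are all invented here; real Abstract Wikipedia rendering is meant to be built from constructors plus per-language renderer functions in Wikifunctions.

```python
# Hypothetical sketch of an English renderer for the Ranking constructor.
# All names below are invented for illustration.

ORDINALS = {1: "", 2: "second-", 3: "third-", 4: "fourth-"}

def render_ranking(rank, by_adjective, cls, constraint, after):
    """Render a Ranking(...) as a single English sentence with a pronoun."""
    listed = ", ".join(after[:-1]) + " and " + after[-1]
    return (f"It is the {ORDINALS[rank]}most {by_adjective} {cls} in "
            f"{constraint}, after {listed}.")

print(render_ranking(4, "populous", "city", "California",
                     ["Los Angeles", "San Diego", "San Jose"]))
# -> It is the fourth-most populous city in California, after Los Angeles, San Diego and San Jose.
```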


You are creating a new language, just like the other natural languages. And it is worse than the existing natural languages, because it is going to have many functions like "Object_with_modifier_and_of", "Instantiation", and "Ranking".

An advantageous feature compared to natural languages, shown there, is that you link concepts to Wikidata; but that can also be done with natural languages. Another good thing here is the structure shown with parentheses, but that too can be done with natural languages. So there is nothing better in this proposed (artificial) language compared to natural languages.

I think that, probably, any sentence of any natural language is semantically a binary tree like this:

(
	(
		(San Francisco)
		(
			be
			(
				the
				(
					(
						(
							(
								(culture al)
								(
									,
									(commerce ial)
								)
							)
							(
								,
								(
									and
									(finance ial)
								)
							)
						)
						center
					)
					(
						of
						(
							(North ern)
							California
						)
					)
				)
			)
		)
	)
	s
)
.

(
	(
		it
		(
			be
			(
				(
					the
					(
						(four th)
						(	
							(
								(much est)
								(populous city)
							)
							(in California)
						)
					)
				)
				(
					,
					(
						after
						(
							(
								(Los Angeles)
								(
									,
									(San Diego)
								)
							)
							(
								and
								(San Jose)
							)
						)
					)
				)
			)
		)
	)
	s
)
.

Some parts of text can be shown in several ways as binary trees. For example:

((San Francisco) ((be X) s))

(((San Francisco) (be X)) s)

fourth
(	
	(
		most
		(populous city)
	)
	(in California)
)

(
fourth
most
)
(	
	
	(populous city)
	(in California)
)

(
	fourth
	(	
		(
			most
			(populous city)
		)
	)
)
(in California)

fourth
(	
	(
		(
			most
			populous
		)
		city
	)
	(in California)
)


--QDinar (talk) 12:31, 17 August 2021 (UTC) (last edited 06:44, 6 September 2021 (UTC))

Creating a new language is a huge effort, and only a few people are going to know it. You have to discuss the different limits of every word in it to come to some consensus... and all that work is just to create yet another language no better, by its structure, than the existing thousands of natural languages (though its lexicon can be bigger than that of some languages). What you should do instead is just use a format like this for every language and use functions to transform it into the usual form of that language. Also, speech synthesis can be done better using the parentheses. Also, you can transform these formats from language to language. --QDinar (talk) 12:55, 17 August 2021 (UTC)
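To make the proposal concrete (purely a sketch with invented names, not project code): a sentence can be stored as a binary tree of nested pairs, and a small English "function" can transform the tree back into the usual surface form. Morpheme splitting such as "culture al" from the hand-written trees is omitted here for brevity.

```python
# Sketch: a binary-tree sentence and a function that renders it to English.

def render(node):
    """Flatten a binary tree (nested 2-tuples of strings) into text."""
    if isinstance(node, str):
        return node
    left, right = node
    return join(render(left), render(right))

def join(a, b):
    # Naive English spacing rule: no space before punctuation.
    if b and b[0] in ",.":
        return a + b
    return a + " " + b

tree = (
    ("San Francisco",
     ("is",
      ("the",
       ((("cultural", (",", "commercial")),
         (",", ("and", "financial"))),
        ("center", ("of", ("Northern", "California"))))))),
    ".",
)

print(render(tree))
# -> San Francisco is the cultural, commercial, and financial center of Northern California.
```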

I think any paragraph can probably also be structured into a binary tree, like this; and here I make a tree of the MediaWiki discussion signature, to demonstrate the binary tree concept:

(
	(
		(
			creating a new language is a huge effort, and only few people are going to know it.
			you have to discuss different limits of every word in it to come to some consensus...
		)
		(
			and all that work is just to create just another language no better, by its structure, than existing thousands natural languages.
			(lexicon can be bigger than of some languages).
		)
	)
	(
		(
			(
				what you should do instead is just use a format like this for every language and use functions to transform it to usual form of that language.
				also, speech synthesis can be done better using the parentheses.
			)
			also you can transform these formats from language to language.
		)
	)
)
(
	--
	(
		(
			QDinar
			("()" talk)
		)
		(
			(
				12
				(: 55)
			)
			(
				,
				(
					(
						(17 August)
						2021
					)
					(
						"()"
						(
							(U T)
							C
						)
					)
				)
			)
		)
	)
)

(The regular sentences are intentionally not structured into binary trees in this example.) This structure can be useful to better connect sentences via pronouns. Different languages may have different limits and preferences for using one sentence versus several sentences with pronouns. These parentheses may help to (properly) translate those places into other languages. --QDinar (talk) 13:22, 17 August 2021 (UTC)

Since this is somewhat out of the scope of the Abstract Wikipedia project, I have submitted it as a project proposal: Structured text. --QDinar (talk) 19:20, 18 August 2021 (UTC)

@Qdinar: Yes, you're right, in a way we are creating a new language. With the difference that we are creating it together, and that we are creating tools to work with that language. But no, it is not a natural language, it is a formal language. Natural languages are very hard to parse (the structures that you put into your examples, all these parentheses, were done with a lot of intelligence on your part). The other thing is that a lot of words in natural languages are ambiguous, which makes them hard to translate. The "sprengen" in "Ich sprenge den Rasen" ("I water the lawn") is a very different "sprengen" than the one in "Ich sprenge die Party" ("I break up the party"). That's why we think we need to work with a formal language, in order to avoid these issues. I don't think that you could use a natural language as the starting point for this (although I have a grammar of Ithkuil on my table right now, and that might be an interesting candidate. One could argue whether that's natural, though). --DVrandecic (WMF) (talk) 20:59, 3 September 2021 (UTC)

"Natural languages are very hard to parse (the structures that you put into your examples, all these parentheses, were done with a lot of intelligence on your part)." - I do not agree with this. To write this, I just need to know the language and write; that language is English, and I already know it. A binary tree editor could help to build such a tree faster; I have also added a request for a binary tree tool, and additional explanations, to Structured text. In comparison, to write in your language, I would need to learn that language of yours. And this binary tree structure is easier to parse than your complicated language. If you mean parsing from traditional text, then that is possible to do, and there are almost zero texts in your language yet. --QDinar (talk) 08:49, 4 September 2021 (UTC)
"and this binary tree structure is easier to parse than your complicated language" - Probably the easiest way to parse this new language of yours is also to parse it into a binary tree form first, just as with natural languages, and then to parse that binary tree. --QDinar (talk) 09:19, 4 September 2021 (UTC)
Probably you are going to use a tree, though not a binary one, while parsing your code. I think there is one advantageous thing: a list shown as a list, instead of as a binary tree, is easier for a human to read, and seems a little easier for a computer to process. But that feature, using lists for (ordered and unordered) lists, can also be added to this binary tree idea. --QDinar (talk) 16:45, 4 September 2021 (UTC)
"The other thing is that a lot of words in natural languages are ambiguous, which makes them hard to translate." - Probably your artificial language is also going to have some ambiguities, because you can take any meaning and still divide it into several cases. "The "sprengen" in "Ich sprenge den Rasen" is a very different "sprengen" than the one in "Ich sprenge die Party"." - This example has not been a useful one to convince me. In both cases it is about scattering something, is it not? If somebody causes a party to be cancelled before even 10% of its people have heard it was going to be held, is this word used? I suspect it is not used in that case. --QDinar (talk) 09:11, 4 September 2021 (UTC)
And it is possible to refer to submeanings of words in natural languages, like sprenge1, sprenge2 (maybe using the meanings shown in Wiktionary). --QDinar (talk) 12:35, 4 September 2021 (UTC)
"although I have a grammar of Ithkuil on my table right now, and that might be an interesting candidate. One could argue whether that's natural, though" - According to Wikipedia, it is a constructed language with no users (i.e. an artificial language). Such artificial languages have few users. Probably Ithkuil has some problems, such as the limits of meanings not being established. Since it has 0 users, when new people start to use it, they are going to change those limits. --QDinar (talk) 09:11, 4 September 2021 (UTC)

I saw that the aim of your proposition is to write something once and get it in multiple languages. With structured text used for all languages, that is also possible, because one Wikipedia can get a structure from another Wikipedia. And it seems it can also solve the "Uzbek uncle" problem (they say there are different words for the mother's brothers and for the father's): if there are several languages with such "uncle"s, they can reuse structures, one wiki from another wiki. --QDinar (talk) 20:20, 19 September 2021 (UTC)

The examples in Abstract Wikipedia/Examples/Jupiter do not have complex things like "Object_with_modifier_and_of"; for example, "and" is shown as a single element. That complex thing is also discussed in Talk:Abstract_Wikipedia/Examples#Object_with_modifier_and_of, and a version with more separated functions ("constructors") is proposed there. So, if you go in that direction, you are going to almost copy the English language, that is, move in the direction of my proposition. --QDinar (talk) 20:20, 19 September 2021 (UTC)

Also, I have seen somewhere what seems to be an example of how rendering works (or maybe I saw it here: "<person> is (notable for) <role in activity>, rather than instance of (P31) human (Q5)...", by user:GrounderUK; I am not sure whether he is talking about structure, maybe just a template, meaning a string with places for arguments; it is also clearly explained as templates in an early proposal). In the process of rendering a text in a natural language, you go through a phase of the structure of that language, and I think that is unavoidable. So structures like the ones I proposed are going to be used anyway, but you might use a more traditional approach with grammatical parts of speech, cases, gender, etc., while I think I propose a very simple and universal grammar... And you are (or were) going to have it inside Wikifunctions, on the fly or maybe cached, while by my proposition those structures should stay with the languages' wikis. Having those structures in the respective wikis is (must be) very useful. --QDinar (talk) 21:13, 19 September 2021 (UTC), last edited 19 Sep 2021, 21:49 UTC.

Additional arguments against the "Abstract Wikipedia" proposal:

1. Languages have equal rights, but you are, or were, going to put one language in a governing position. Putting all languages in a position of equal rights should be a principle that is actually carried out.

2. Developing a new language does not seem to be within the scope of Wikimedia's aims.

3. It seems, as I saw from the talk archives, you hope you can just divide meanings as much as needed to make them usable for all languages. (By the way, I have just seen that the second meaning of "abstract" in Wiktionary is "Something that concentrates in itself the qualities of a larger item, or multiple items.") But I think just division is not enough; the borders of meanings are slightly shifted in every natural language compared to other languages. So it is not going to be possible to translate easily and perfectly from the principal language to other languages with only that amount of precision in the principal language. And I want to say: lexemes are like stones or bricks of different shapes, and when you express an idea, the idea is like a castle. You can build castles of almost the same shape with different sets of bricks and stones, and every language is like a different set of bricks and stones. You need very small bricks to be able to resemble the castles built with different stones precisely enough to rebuild also their every stone. (For example, the additional data like "subject", "quality", "class", "location constraint" in the coding of "Jupiter is the largest planet in the Solar System." in the "Jupiter" example probably still does not give enough precision for the goal.)

4. Esperanto is the most widely used constructed language; according to Wikipedia, it was created in 1887 and now has nearly 100,000 speakers. That is still few compared to some natural languages. This language of yours is still a different language, not English, by its structure, even as shown in the "Jupiter" example (in the "San Francisco" example it is more different). The number of its speakers will probably also grow slowly, and even more slowly if it is going to be as enormously precise, compared to natural languages, as I explained in the previous paragraph.

--QDinar (talk) 00:53, 20 September 2021 (UTC), last edited 05:30, 20 September 2021 (UTC)

3. Even a language with so many tiny lexemes would not be enough to be easily translatable into natural languages, because when an idea is built, it is like one castle from one set of stones; to show the other language versions, other castles have to be built. --QDinar (talk) 04:50, 20 September 2021 (UTC)
Such an abstract language could be used to describe an idea or reality more exactly/precisely, minimizing the ambiguities introduced by language; but anyway, when that is translated into normal languages, they have to use their own lexemes. --QDinar (talk) 09:42, 21 September 2021 (UTC)
3. Even human translators do not make perfect translations, but the best possible ones. If they tried to deliver the exact meaning of the original text, they would have to produce a long translation, as if explaining every lexeme. --QDinar (talk) 09:42, 21 September 2021 (UTC)

You have raised a lot of different points, and many arguments that I fully agree with. But given the structure of your replies, it is hard to answer them. Thanks for breaking a few of the questions out into answerable chunks below.

In the end, it is basically just templates that are being filled in and that work across languages. Let's stick with the Superlative constructor we discuss below. With those four arguments one can create a lot of sentences, and all of them seem to be quite easily expressed in many different languages. May I propose we focus on this single example, in order to crystallize your criticism? --DVrandecic (WMF) (talk) 23:32, 24 September 2021 (UTC)

Are sentences "Z-objects"?

Are sentences planned to be "Z-objects"? --QDinar (talk) 13:02, 17 September 2021 (UTC)

Yes, a sentence would be represented as a Z-object. E.g. "Jupiter is the largest planet in the Solar System." would be represented as Superlative(subject: Jupiter, quality: large, class: planet, location constraint: Solar System) (and all of that would be ZIDs, so in reality maybe something like Z19349(Q319, Z29393, Q634, Q544)). -- DVrandecic (WMF) (talk) 22:42, 24 September 2021 (UTC)
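To illustrate the reply above: a minimal sketch, in Python, of what an English renderer behind such a Superlative Z-object might do. The function name and the one-line superlative rule are invented; only the argument names mirror the example in the reply.

```python
# Hypothetical sketch of an English renderer for a Superlative constructor.

def superlative_en(subject, quality, cls, location):
    """Render Superlative(subject, quality, class, location) in English."""
    # Crude morphology: "large" -> "largest", "small" -> "smallest".
    suffix = "st" if quality.endswith("e") else "est"
    return f"{subject} is the {quality}{suffix} {cls} in the {location}."

print(superlative_en("Jupiter", "large", "planet", "Solar System"))
# -> Jupiter is the largest planet in the Solar System.
```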

Will programming languages be converted to Wikifunctions' own language first?

Is Wikifunctions planned to have its own language? Is it something like the lambda calculus? Are all other supported languages planned to be converted to that own language first, before being interpreted? --QDinar (talk) 13:02, 17 September 2021 (UTC)

If there is an own language, I would like to see some small example code in it, like for the Fibonacci sequence. --QDinar (talk) 18:02, 19 September 2021 (UTC)

No. Code written in Python will be evaluated by a Python interpreter, code in JavaScript by a JavaScript interpreter, code in C by a C compiler and then executed, etc. What little system we have to compose such function calls together will be on top of code in such programming languages, not a common runtime we convert everything to first. -- DVrandecic (WMF) (talk) 22:46, 24 September 2021 (UTC)
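For example, the Fibonacci function asked about above could simply be ordinary Python, run by the standard interpreter with no intermediate language (a sketch, not an actual Wikifunctions implementation):

```python
# Plain Python, as the reply describes: executed by the standard
# interpreter, not translated into a common internal representation.

def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number (0-indexed, fibonacci(0) == 0)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print([fibonacci(i) for i in range(10)])
# -> [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```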

What API will be used?

Is an HTTP web API planned for Wikifunctions? Are the functions planned to be called through a web API or in some other way? --QDinar (talk) 13:02, 17 September 2021 (UTC)

We already offer a web API to call functions from Wikifunctions; see https://notwikilambda.toolforge.org/w/api.php?action=help&modules=wikilambda_function_call. We might not always use the API for our internal use cases (e.g. Wikipedia calling a function might not go through an HTTP request); it depends on what is efficient. -- DVrandecic (WMF) (talk) 22:49, 24 September 2021 (UTC)

Why not use ready-made programming language implementations?

What do you think about the idea of just using the usual programming language interpreters? I.e., code can be in a wiki page, and it can be run. Some dangerous functions can be removed or turned off in order to protect against hacking/vandalism. --QDinar (talk) 13:02, 17 September 2021 (UTC)

Great idea - and we do that! Python code is run by the standard Python implementation, JavaScript by Node, etc. The way we hope to avoid dangerous functions is by running them in their own containers with limited resources and no access to the outside world. The architecture is described here. -- DVrandecic (WMF) (talk) 22:51, 24 September 2021 (UTC)

What are Z-IDs?

What is the origin of the "Z" letter in Z-IDs? Are there already Z-IDs in Wikidata? As I understood it, Z-IDs just replace multiple natural language strings; is that so? If so, why are function names like "Object_with_modifier_and_of" also not replaced with them? The code in the right block in https://notwikilambda.toolforge.org/wiki/Z10104 is hard to understand. Are the Z-codes in it planned to be replaced with natural language strings? --QDinar (talk) 13:02, 17 September 2021 (UTC)

You are totally right! "Object_with_modifier_and_of" is just the English name of the function; in reality it is identified by a ZID, and it will have a different name in Arabic, and in Russian, and in German, and in Tatar, etc. That is true for all of our functions. The "and" function is called "i" in Croatian, but the ZID in Notwikilambda is Z10026. In the user interface, though, we will try to hide all ZIDs and instead display the names in your language. -- DVrandecic (WMF) (talk) 22:55, 24 September 2021 (UTC)
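A minimal sketch of the lookup described here. Only the ZID Z10026 and its English ("and") and Croatian ("i") names come from the reply above; the table layout, function name, and fallback behaviour are invented for illustration.

```python
# Hypothetical sketch: functions are identified by language-neutral ZIDs,
# and the UI shows a per-language label instead of the ZID.

LABELS = {
    "Z10026": {"en": "and", "hr": "i"},
}

def label(zid, lang, fallback="en"):
    """Return the human-readable name of a ZID in the given language."""
    names = LABELS.get(zid, {})
    return names.get(lang) or names.get(fallback) or zid

print(label("Z10026", "hr"))  # -> i
print(label("Z10026", "de"))  # no German label yet, falls back to English -> and
```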

Response to an old reply (in the archives) about ML

In Talk:Abstract_Wikipedia/Archive_2#Wikidata_VS_Wikipedia_-_General_structure, user:DVrandecic (WMF) said:

1. "we don't have automatic translation for many languages"

2. "The hope is that Abstract Wikipedia will generate content of a consistently high quality that it can be incorporated by the local Wikipedias without the necessity to check each individual content. .... translation doesn't help with updates. If the world changes, and the English Wikipedia article gets updated, there is nothing that keeps the local translation current."

I want to say that:

1. I saw news saying that Yandex has developed machine translation for the Bashkir language using the Tatar language, because these languages are very similar and there is more content in Tatar. (So more languages may appear, in ML, using such techniques.)

2.

I doubt that the approach you are going to use will provide more stable results than ML. Users will constantly edit the functions, renderers, constructors, and the abstract code, and probably something is going to break as well. An idea has just come to my mind: in the case of editing a renderer, if all the cases of using that renderer are linked to it, the editor may check all the use cases before applying the changes... if those use cases are not too many...

It is also possible to make automatic ML updates easier to check: after one or several updates, some user is notified about them and given an easy-to-read page showing the differences made in the original page and the differences that are going to be made in the translation.
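Such a review page could be built from ordinary text diffs. A minimal sketch using only Python's standard library, with invented example sentences (a diff of the regenerated translation could sit next to it):

```python
# Sketch: show an editor what changed in the source text, so the effect
# on a translation can be checked before the update is applied.

import difflib

old = ["Jupiter is the largest planet in the Solar System.\n"]
new = ["Jupiter is the fifth planet from the Sun.\n"]

for line in difflib.unified_diff(old, new, fromfile="before", tofile="after"):
    print(line, end="")
```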

Though there is a stronger argument against ML in this forum, in Talk:Abstract_Wikipedia/Archive_3#Might_deep_learning-based_NLP_be_more_practical? by user:Stevenliuyi: "a ML-based system will make sentences more fluent, it could potentially turn a true statement into a false one".

--QDinar (talk) 23:11, 19 September 2021 (UTC)

I very much look forward to this, and I hope that more high-quality machine translation will become available for everyone. I think there's a window of opportunity where Wikifunctions / Abstract Wikipedia will provide knowledge in high quality in languages where machine translation does not yet.
I like the idea of showing the results when editing. Our first implementation of that idea is to do that with the testers. But as the system develops, we will probably be able to use more of the system to let the contributors understand the impact of their edits. -- DVrandecic (WMF) (talk) 23:16, 24 September 2021 (UTC)

Boilerplate functions

From my point of view, Scratch is a good example of a low-code platform. Creating a program on it is easy. In Scratch there are boilerplates with gaps, and with drag and drop you can take the different parts, which have different looks, and connect them into a function if they belong together. From my view, for at least some functions, that principle could offer a lower barrier for creating functions, once the boilerplate templates have been translated into other languages, so as to reach people with less coding knowledge and make it possible for them to create a function. Have you thought about offering a possibility like that in the user interface?--Hogü-456 (talk) 20:42, 25 September 2021 (UTC)
