Structured Comments, Attributes and Decorators
Structured comments, attributes and decorators could be used by Wikifunctions to provide features such as function metadata, function search, and the automatic generation of documentation. Structured comments are programming-language comments that follow syntactic patterns or embed XML so that their contents can be processed mechanically.
Attributes can be attached to functions and their parameters in languages such as C#; Java offers the equivalent feature under the name annotations. Decorators play a similar role in languages such as Python.
Versioning-related metadata expressed through structured comments, attributes, or decorators can simplify versioning scenarios on evolving crowdsourced resources.
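As a rough illustration of the decorator variant, the sketch below attaches version, tag and summary metadata to a function and uses it for search and for a one-line documentation entry. Everything named here (`function_metadata`, `REGISTRY`, `search_by_tag`, `document`) is hypothetical and is not part of any Wikifunctions API.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory registry; the real storage model is not assumed here.
REGISTRY: Dict[str, dict] = {}


def function_metadata(*, version: str, tags: List[str], summary: str) -> Callable:
    """Attach searchable, versioned metadata to a function and register it."""
    def decorate(func: Callable) -> Callable:
        meta = {"version": version, "tags": list(tags), "summary": summary}
        func.__metadata__ = meta        # metadata travels with the function object
        REGISTRY[func.__name__] = meta  # and is indexed for search and documentation
        return func
    return decorate


@function_metadata(version="1.2.0", tags=["string", "text"], summary="Reverse a string.")
def reverse_string(text: str) -> str:
    return text[::-1]


def search_by_tag(tag: str) -> List[str]:
    """Return the names of registered functions carrying the given tag."""
    return sorted(name for name, meta in REGISTRY.items() if tag in meta["tags"])


def document(name: str) -> str:
    """Generate a one-line documentation entry from the stored metadata."""
    meta = REGISTRY[name]
    return f"{name} (v{meta['version']}): {meta['summary']}"
```

Because the decorator returns the original function unchanged, calling `reverse_string` behaves exactly as it would without metadata; the version string is only recorded, not enforced.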
Namespaces and Modules
Namespaces and modules can be useful when organizing large collections of functions. With namespaces or modules, multiple paradigms or ecosystems of functions could more readily coexist in a crowdsourced resource.
Scripting Environments for Natural Language Generation
With modern scripting engines such as V8, it is relatively easy to create and provide scripting environments.
Just as Web browsers provide scripting environments and APIs for Web scenarios, we can envision providing scripting environments and APIs for natural language generation scenarios.
Discussion topics pertaining to scripting environments for renderers include APIs and object models for: (1) accessing and working with input Wikidata data, (2) accessing and working with the rendering context, (3) accessing and working with intermediate knowledge representations, and (4) generating output natural language content.
As Wikidata is a sourced knowledge base, APIs and object models should include means for annotating any intermediate representations and portions of natural language with sources. In automatically-generated articles, statements’ sources could appear as referenced materials in articles’ “References” sections with numbered citations appearing inline, near relevant content.
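One way to picture source annotation is sketched below: each generated fragment carries its sources, and a renderer turns them into inline numbered markers plus a reference list. The `Fragment` class and `render_with_references` function are hypothetical names, and the source strings are placeholders rather than real citations.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Fragment:
    """A piece of generated text plus the sources supporting it (hypothetical model)."""
    text: str
    sources: List[str] = field(default_factory=list)


def render_with_references(fragments: List[Fragment]) -> Tuple[str, List[str]]:
    """Join fragments, adding inline [n] markers and collecting a numbered reference list."""
    references: List[str] = []
    parts: List[str] = []
    for fragment in fragments:
        markers = ""
        for source in fragment.sources:
            if source not in references:
                references.append(source)  # first use assigns the next number
            markers += f"[{references.index(source) + 1}]"
        parts.append(fragment.text + markers)
    return " ".join(parts), references


# Placeholder sources; a real renderer would take them from Wikidata statements.
body, refs = render_with_references([
    Fragment("Marie Curie was born in Warsaw.", ["Source A"]),
    Fragment("She won two Nobel Prizes.", ["Source B", "Source A"]),
])
```

A source cited more than once keeps the number it received on first use, so the reference list stays free of duplicates.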
Should Wikidata come to include support for automated reasoning, any reasoning, argumentation, derivations and/or proofs supporting statements could similarly appear in articles’ “References” sections with numbered citations appearing inline, near relevant content. Readers could click on hyperlinks to navigate to automatically-generated documents which indicate supporting reasoning, argumentation, derivations and/or proofs for one or more statements.
Output Streams, Logging and Diagnostic Events
When editing or developing Wikifunctions content for use on Abstract Wikipedia, it would be convenient to be able to output to multiple streams, to log, and/or to raise typed events. Such features could be part of the scripting environment provided to functions.
It would also be useful to be able to aggregate, organize and view diagnostic outputs with a configurable granularity or verbosity.
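A minimal sketch of configurable verbosity, using Python's standard `logging` module as a stand-in for whatever the scripting environment would actually provide. The logger name `renderer.en`, the `ListHandler` class and the messages are illustrative assumptions only.

```python
import logging
from typing import List


class ListHandler(logging.Handler):
    """Collect formatted log records in memory so a viewer could display them later."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(self.format(record))


def make_diagnostics(name: str, verbosity: int) -> ListHandler:
    """Configure a named logger (hypothetical naming scheme) at the given verbosity."""
    logger = logging.getLogger(name)
    logger.setLevel(verbosity)
    logger.propagate = False  # keep diagnostics out of the root logger
    handler = ListHandler()
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    logger.handlers.clear()
    logger.addHandler(handler)
    return handler


handler = make_diagnostics("renderer.en", logging.INFO)
log = logging.getLogger("renderer.en")
log.debug("selected template 3 of 5")  # filtered out at INFO verbosity
log.info("rendered sentence for Q42")
log.warning("no label found; fell back to English")
```

Lowering the verbosity to `logging.DEBUG` would also keep the first message; per-renderer logger names give the aggregation and organization a natural hierarchy.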
Editors/developers could have a means of toggling a “developer mode” or “debugging mode” on Abstract Wikipedia so that they could, while viewing articles, either:
- hover over portions of natural language to view relevant traces of computation and diagnostic messages in hoverboxes,
- view visual indicators for traces of computation and diagnostic messages in a margin so that they could then interact with the visual indicators to view expanded data, or
- otherwise select or indicate portions of natural language content to view relevant traces of computation and diagnostic messages.
We can consider adding feedback mechanisms for Abstract Wikipedia readers such as commenting upon, liking, upvoting, or otherwise providing feedback with respect to specific portions of automatically-generated natural language content.
Readers could also “post-edit” automatically-generated content. For automatically-generated articles, there could be wiki versions of the articles for the purpose of crowdsourcing their fine-tuning. These “wiki post-edited” versions could be reached via tab user-interface elements. Data from this variety of crowdsourced feedback on automatically-generated articles, “wiki post-editing”, could be collected and aggregated for use by Wikifunctions editors/developers.
The Automatic Evaluation of Natural Language
Software tools in the categories of automatic essay scoring, grammar checking, readability measurement, and/or natural language evaluation could be of use for automatically measuring articles in a number of ways. Coh-Metrix 3.0, for instance, measures natural language on 108 indices.
Perhaps bots could measure articles as they are updated and report their data to editors/developers using a platform API.
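To give a feel for what a measuring bot might compute, the sketch below derives two very simple surface indices from a text. These are deliberately crude stand-ins and are not Coh-Metrix indices; the function name and the tokenization rules are assumptions made for illustration.

```python
import re
from typing import Dict


def simple_text_indices(text: str) -> Dict[str, float]:
    """Compute a few surface-level indices for an English text."""
    # Naive sentence split on terminal punctuation; abbreviations are not handled.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not sentences or not words:
        return {"sentences": 0, "words": 0,
                "avg_sentence_length": 0.0, "type_token_ratio": 0.0}
    return {
        "sentences": len(sentences),
        "words": len(words),
        "avg_sentence_length": len(words) / len(sentences),  # words per sentence
        "type_token_ratio": len(set(words)) / len(words),    # lexical variety
    }
```

A bot could run such a function on each article revision and report the resulting dictionary, leaving interpretation of the numbers to editors/developers.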
Generating Articles in Response to Users’ Questions
Resembling question-answering systems, articles could be generated for Wikidata queries or after users navigate to articles from Web searches. Beyond highlighting relevant content, articles could be generated while utilizing this context data. — AdamSobieski (Discussion) 22:15, 15 July 2020 (UTC)
Generating Follow-up Questions for Use in Articles
Resembling hypertext-based dialogue systems, one could place follow-up questions which might interest a reader in a section near the bottom of articles, each being a hyperlink to another article. One could hyperlink to articles which could be dynamically generated, if they are not already created and cached. — AdamSobieski (Discussion) 22:15, 15 July 2020 (UTC)
Speech Synthesis and Hypertext
There exists a CSS Speech Module W3C Candidate Recommendation.
With respect to pronunciation, one can use pronunciation lexicons with hypertext documents. Also, as in EPUB 3, one can use SSML-based attributes on generated hypertext outputs to provide pronunciation data.
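As a sketch of the second option, the function below wraps a word in a `span` carrying the `ssml:ph` and `ssml:alphabet` attributes that EPUB 3 content documents use. The helper name is hypothetical, and the enclosing document is assumed to declare the `ssml` prefix for the namespace shown in `SSML_NS`.

```python
from html import escape

# SSML namespace; the enclosing XHTML document must bind the "ssml" prefix to it.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def with_pronunciation(text: str, phonemes: str, alphabet: str = "ipa") -> str:
    """Wrap text in a span annotated with SSML-based pronunciation attributes."""
    return (
        f'<span ssml:alphabet="{escape(alphabet, quote=True)}" '
        f'ssml:ph="{escape(phonemes, quote=True)}">{escape(text)}</span>'
    )


def wrap_document(body: str) -> str:
    """Minimal XHTML wrapper that declares the ssml prefix (body is trusted markup)."""
    return (
        f'<html xmlns="http://www.w3.org/1999/xhtml" xmlns:ssml="{SSML_NS}">'
        f"<body><p>{body}</p></body></html>"
    )
```

A renderer could call `with_pronunciation` only for words whose pronunciation a speech synthesizer is likely to get wrong, such as proper nouns taken from Wikidata labels.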
Use GPT-3 style APIs
Wikifunctions should support function contracts, as already provided by various languages and tools (such as Eiffel, SPARK Ada, JML for Java, or Python with third-party packages), namely:
- Preconditions: A new section after the function arguments with a list of boolean predicates indicating the conditions required by the arguments before calling this function (e.g. list cannot be empty, or first parameter must be greater than the second one)
- Postconditions: A new section after the function arguments with a list of boolean predicates indicating the conditions that are guaranteed by the result of the function (e.g. result is between zero and the first parameter, or the length of the output list is the length of the input list plus one)
- Type invariants: A new section in the data type page with a list of boolean predicates indicating the conditions met by every value of this data type (e.g. value must be strictly greater than zero, or is a prime number)
Probably the simplest approach is for each predicate to be just a function in the same namespace as the rest of the functions. Programming languages usually allow extra operations in contract predicates (like the "for all" or "there is at least one" quantifiers), but these could be optional. Preconditions only need to reference the function arguments, while postconditions also need a way to reference the function result. As I understand it, arguments cannot be modified, so postconditions do not need to reference the original value of a parameter at function start. All predicates in a list must be true (i.e. a logical AND of all the predicates), and preconditions can be as detailed as needed, which would probably also be useful for renderers, e.g. the argument must be a Wikidata item for a human who is already dead.
Besides formally documenting the function for implementers and users, contracts have practical benefits. If a precondition fails, the user gets a unified and clear error message, without each function implementation having to handle the different errors itself. Postconditions are very useful for automatically checking results during tests; it would be nice if the platform generated a report of constraint violations for each implementation whenever a result does not fulfill a postcondition for some set of input parameters, so that the implementation or the postcondition can be corrected. Type invariants would be implicit preconditions of every function with arguments of that type, and implicit postconditions of every function with a result of that type.
Another advantage is the potential to simplify implementations: preconditions allow some defensive code to be removed, and they avoid the need to handle exceptional situations in a way that is compatible across all supported languages. Maybe "robustness tests" should also be provided, i.e. special tests checking that a function's current preconditions reject certain parameters.
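The contract proposal above could look roughly like the following sketch, in which predicates are ordinary functions, all of them must hold, and a single error type gives the unified message. The names `contract`, `ContractViolation` and `rejects` are hypothetical, only positional arguments are supported, and type invariants are left out for brevity.

```python
from functools import wraps
from typing import Any, Callable, Sequence


class ContractViolation(Exception):
    """Uniform error raised when a precondition or postcondition fails."""


def contract(pre: Sequence[Callable[..., bool]] = (),
             post: Sequence[Callable[..., bool]] = ()) -> Callable:
    """Check preconditions on the arguments and postconditions on (result, arguments)."""
    def decorate(func: Callable) -> Callable:
        @wraps(func)
        def wrapper(*args: Any) -> Any:
            for predicate in pre:  # logical AND: every precondition must hold
                if not predicate(*args):
                    raise ContractViolation(
                        f"{func.__name__}: precondition {predicate.__name__} "
                        f"failed for arguments {args!r}")
            result = func(*args)
            for predicate in post:  # postconditions also receive the result
                if not predicate(result, *args):
                    raise ContractViolation(
                        f"{func.__name__}: postcondition {predicate.__name__} "
                        f"failed for result {result!r}")
            return result
        return wrapper
    return decorate


# Predicates are plain functions living alongside the functions they constrain.
def list_not_empty(items: list) -> bool:
    return len(items) > 0


def result_is_member(result: Any, items: list) -> bool:
    return result in items


def result_is_upper_bound(result: Any, items: list) -> bool:
    return all(result >= item for item in items)  # a "for all" quantifier


@contract(pre=[list_not_empty], post=[result_is_member, result_is_upper_bound])
def largest(items: list) -> Any:
    # No defensive check for an empty list: the precondition already covers it.
    best = items[0]
    for item in items[1:]:
        if item > best:
            best = item
    return best


def rejects(func: Callable, *args: Any) -> bool:
    """Robustness-test helper: True if the contract refuses the call."""
    try:
        func(*args)
    except ContractViolation:
        return True
    return False
```

A robustness test then amounts to asserting `rejects(largest, [])`, while the postconditions are checked automatically on every ordinary test call.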