Talk:Abstract Wikipedia/Function evaluator call

@DVrandecic (WMF): This assumes that all functions have an identity that must be a valid string such as Zxxx. This is not always the case (e.g. a function created on-the-fly). There are some possible solutions:

Use the "instantiate a function and wrap it in a Z7" method. See Talk:Abstract_Wikipedia/Function_model#Make_Z7_more_uniform.
Make "key id" a string. (In the example below, the key ID is written in upper case to prevent confusion with the label.)

example

{
 "type": "Function call",
 "function": {
  "type": "Function",
  "arguments": [
   {
    "type": "Argument declaration",
    "argument type": "Positive integer",
    "key id": "LEFT",
    "label": {
     "type": "Multilingual text",
     "texts": [
      {
       "type": "Monolingual text",
       "language": "English",
       "text": "left"
      }
     ]
    }
  },
  {
   "type": "Argument declaration",
   "argument type": "Positive integer",
   "key id": "RIGHT",
   "label": {
    "type": "Multilingual text",
    "texts": [
     {
      "type": "Monolingual text",
      "language": "English",
      "text": "right"
     }
    ]
   }
  }
 ],
 "return type": "Positive integer",
 "testers": [],
 "implementation": [
  {
   "type": "implementation",
   "code": {
    "type": "Code",
    "language": "Javascript",
    "code": "_ = LEFT + RIGHT"
   }
  }
 ],
 "identity": "add"
 },
 "parameters": [
   {
    "type": "parameter",
    "parameter ID": "LEFT",
    "parameter value": {
      "type": "Positive integer",
      "value": "2"
    }
   },
   {
    "type": "parameter",
    "parameter ID": "RIGHT",
    "parameter value": {
      "type": "Positive integer",
      "value": "2"
    }
    }
 ]
}

{
 "Z1K1": "Z7",
 "Z7K1": {
  "Z1K1": "Z8",
  "Z8K1": [
   {
    "Z1K1": "Z17",
    "Z17K1": "Z70",
    "Z17K2": "LEFT",
    "Z17K3": {
     "Z1K1": "Z12",
     "Z12K1": [
      {
       "Z1K1": "Z11",
       "Z11K1": "Z251",
       "Z11K2": "left"
      }
     ]
    }
   },
   {
    "Z1K1": "Z17",
    "Z17K1": "Z70",
    "Z17K2": "RIGHT",
    "Z17K3": {
     "Z1K1": "Z12",
     "Z12K1": [
      {
       "Z1K1": "Z11",
       "Z11K1": "Z251",
       "Z11K2": "right"
      }
     ]
    }
   }
  ],
  "Z8K2": "Z90",
  "Z8K3": [],
  "Z8K4": [
   {
    "Z1K1": "Z14",
    "Z14K3": {
     "Z1K1": "Z16",
     "Z16K1": "Z301",
     "Z16K2": "_ = LEFT + RIGHT"
    }
   }
  ],
  "Z8K5": "Z144"
 },
 "Z7K3": [
   {
    "Z1K1": "Zxx",
    "ZxxK1": "LEFT",
    "ZxxK2": {
      "Z1K1": "Z70",
      "Z70K1": "2"
    }
   },
   {
    "Z1K1": "Zxx",
    "ZxxK1": "RIGHT",
    "ZxxK2": {
      "Z1K1": "Z70",
      "Z70K1": "2"
    }
   }
  ]
}

Variant: ZxxK1/parameter ID may be omitted, in which case the parameter is automatically matched to the corresponding argument. This is similar to how arguments in Python can be passed either by name or by position.
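A minimal sketch of this variant in Python. It assumes that parameters without an ID fill the still-unbound declared arguments in declaration order; the helper `bind_parameters` and that matching rule are hypothetical and not part of the function model.

```python
def bind_parameters(declared_keys, parameters):
    """Map each declared argument key to a parameter value.

    declared_keys: key IDs in declaration order, e.g. ["LEFT", "RIGHT"].
    parameters: list of dicts with "parameter value" and an optional "parameter ID".
    Assumption: parameters without an ID are matched by position to the
    declared keys that were not bound by name.
    """
    bound = {}
    # First pass: parameters that name their key explicitly.
    for parameter in parameters:
        if "parameter ID" not in parameter:
            continue
        key = parameter["parameter ID"]
        if key not in declared_keys:
            raise KeyError(f"unknown parameter ID: {key}")
        if key in bound:
            raise ValueError(f"parameter given twice: {key}")
        bound[key] = parameter["parameter value"]
    # Second pass: unnamed parameters fill the remaining keys in order.
    free_keys = [key for key in declared_keys if key not in bound]
    unnamed = [p["parameter value"] for p in parameters if "parameter ID" not in p]
    if len(unnamed) > len(free_keys):
        raise ValueError("more parameters than declared arguments")
    bound.update(zip(free_keys, unnamed))
    return bound
```

With the declared keys `["LEFT", "RIGHT"]`, two unnamed parameters bind to LEFT and RIGHT in that order, while a parameter that names RIGHT leaves the unnamed one to bind to LEFT.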

Introduce a new type "symbol" (see: w:Symbol (programming)) whose labels are translatable. Usually the label should be omitted (should we allow local override?). The symbols mean nothing other than the symbols themselves. Compare "Atoms" or "Symbol" in other programming languages such as Ruby.

example (assuming we defined two symbols at Z10000 and Z10001)

{
 "type": "Function call",
 "function": {
  "type": "Function",
  "arguments": [
   {
    "type": "Argument declaration",
    "argument type": "Positive integer",
    "key id": "Z10000"
  },
  {
   "type": "Argument declaration",
   "argument type": "Positive integer",
   "key id": "Z10001"
  }
 ],
 "return type": "Positive integer",
 "testers": [],
 "implementation": [
  {
   "type": "implementation",
   "code": {
    "type": "Code",
    "language": "Javascript",
    "code": "_ = Z10000 + Z10001"
   }
  }
 ],
 "identity": "add"
 },
 "parameters": [
   {
    "type": "parameter",
    "parameter ID": "Z10000",
    "parameter value": {
      "type": "Positive integer",
      "value": "2"
    }
   },
   {
    "type": "parameter",
    "parameter ID": "Z10001",
    "parameter value": {
      "type": "Positive integer",
      "value": "2"
    }
    }
 ]
}

{
 "Z1K1": "Z7",
 "Z7K1": {
  "Z1K1": "Z8",
  "Z8K1": [
   {
    "Z1K1": "Z17",
    "Z17K1": "Z70",
    "Z17K2": "Z10000"
   },
   {
    "Z1K1": "Z17",
    "Z17K1": "Z70",
    "Z17K2": "Z10001"
   }
  ],
  "Z8K2": "Z90",
  "Z8K3": [],
  "Z8K4": [
   {
    "Z1K1": "Z14",
    "Z14K3": {
     "Z1K1": "Z16",
     "Z16K1": "Z301",
     "Z16K2": "_ = Z10000 + Z10001"
    }
   }
  ],
  "Z8K5": "Z144"
 },
 "Z7K3": [
   {
    "Z1K1": "Zxx",
    "ZxxK1": "Z10000",
    "ZxxK2": {
      "Z1K1": "Z70",
      "Z70K1": "2"
    }
   },
   {
    "Z1K1": "Zxx",
    "ZxxK1": "Z10001",
    "ZxxK2": {
      "Z1K1": "Z70",
      "Z70K1": "2"
    }
   }
  ]
}

Similarly, ZxxK1/parameter ID may be omitted.

(We should consider whether to support symbols that are created on-the-fly and not persistent.)

Only allow one parameter in the function and let functions themselves extract parameters. For example, an "add" is implemented as y=x[0]+x[1], where x is the only parameter. --GZWDer (talk) 17:55, 1 February 2021 (UTC)
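The single-parameter idea can be sketched in Python as follows. The helper `call_with_single_parameter` is a hypothetical stand-in for whatever the evaluator would do to pack the values; it is not an existing API.

```python
def add(x):
    # x is the only parameter; the function extracts its own operands.
    return x[0] + x[1]


def call_with_single_parameter(function, *values):
    # Hypothetical evaluator helper: pack all supplied values into one list
    # so that every function receives exactly one parameter.
    return function(list(values))
```

Here `add([2, 2])` and `call_with_single_parameter(add, 2, 2)` are equivalent; the cost of this design is that argument names and per-argument types are no longer visible in the function's declaration.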

This is a great point. For now, with the pre-generic model, it is indeed assuming that functions are not created on the fly.

The function evaluator should also only run functions that are written in code. So I am not sure we would be running functions that are generated on the fly. I should document that in the page.

But even if that were not the case, the keys could just be K1 and K2, as in the following example (the changes are subtle):

onlykey

{
 "type": "Function call",
 "function": {
  "type": "Function",
  "arguments": [
   {
    "type": "Argument declaration",
    "argument type": "Positive integer",
    "key id": "K1",
    "label": {
     "type": "Multilingual text",
     "texts": [
      {
       "type": "Monolingual text",
       "language": "English",
       "text": "left"
      }
     ]
    }
   },
   {
    "type": "Argument declaration",
    "argument type": "Positive integer",
    "key id": "K2",
    "label": {
     "type": "Multilingual text",
     "texts": [
      {
       "type": "Monolingual text",
       "language": "English",
       "text": "right"
      }
     ]
    }
   }
  ],
  "return type": "Positive integer",
  "testers": [],
  "implementation": [
   {
    "type": "implementation",
    "code": {
     "type": "Code",
     "language": "Javascript",
     "code": "K0 = K1 + K2"
    }
   }
  ],
  "identity": { ... }
 },
 "left": {
  "type": "Positive integer",
  "value": "2"
 },
 "right": {
  "type": "Positive integer",
  "value": "2"
 }
}

{
 "Z1K1": "Z7",
 "Z7K1": {
  "Z1K1": "Z8",
  "Z8K1": [
   {
    "Z1K1": "Z17",
    "Z17K1": "Z70",
    "Z17K2": "K1",
    "Z17K3": {
     "Z1K1": "Z12",
     "Z12K1": [
      {
       "Z1K1": "Z11",
       "Z11K1": "Z251",
       "Z11K2": "left"
      }
     ]
    }
   },
   {
    "Z1K1": "Z17",
    "Z17K1": "Z70",
    "Z17K2": "K2",
    "Z17K3": {
     "Z1K1": "Z12",
     "Z12K1": [
      {
       "Z1K1": "Z11",
       "Z11K1": "Z251",
       "Z11K2": "right"
      }
     ]
    }
   }
  ],
  "Z8K2": "Z90",
  "Z8K3": [],
  "Z8K4": [
   {
    "Z1K1": "Z14",
    "Z14K3": {
     "Z1K1": "Z16",
     "Z16K1": "Z301",
     "Z16K2": "K0 = K1 + K2"
    }
   }
  ],
  "Z8K5": { ... }
 },
 "K1": {
  "Z1K1": "Z70",
  "Z70K1": "2"
 },
 "K2": {
  "Z1K1": "Z70",
  "Z70K1": "2"
 }
}

It is not perfect due to the problems you link to in the more uniform representation of Z7s, but I think that can work. What do you think? It is basically like your first solution, but using K1 and K2 instead of string constants LEFT and RIGHT. --DVrandecic (WMF) (talk) 22:20, 1 February 2021 (UTC)
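As a rough illustration of how local keys such as K1 and K2 could be resolved for a function given inline, here is a Python sketch. It assumes, purely for the illustration, that the code string is run as Python rather than JavaScript (the statement `K0 = K1 + K2` happens to be valid in both) and that the result is read from `K0`; `evaluate_inline_call` is a hypothetical helper, not the actual evaluator.

```python
def evaluate_inline_call(call):
    """Evaluate a Z7 whose Z7K1 is an inline Z8 with local keys K1, K2, ..."""
    function = call["Z7K1"]
    # Local keys declared by the function, in declaration order.
    keys = [argument["Z17K2"] for argument in function["Z8K1"]]
    # Bind each local key to the integer value supplied on the call itself.
    env = {key: int(call[key]["Z70K1"]) for key in keys}
    code = function["Z8K4"][0]["Z14K3"]["Z16K2"]
    # Assumption: the code is executed as Python and stores its result in K0.
    # A real evaluator would sandbox this; never exec untrusted code directly.
    exec(code, {}, env)
    return env["K0"]


example_call = {
    "Z1K1": "Z7",
    "Z7K1": {
        "Z1K1": "Z8",
        "Z8K1": [
            {"Z1K1": "Z17", "Z17K1": "Z70", "Z17K2": "K1"},
            {"Z1K1": "Z17", "Z17K1": "Z70", "Z17K2": "K2"},
        ],
        "Z8K4": [
            {"Z1K1": "Z14", "Z14K3": {"Z1K1": "Z16", "Z16K2": "K0 = K1 + K2"}}
        ],
    },
    "K1": {"Z1K1": "Z70", "Z70K1": "2"},
    "K2": {"Z1K1": "Z70", "Z70K1": "2"},
}
```

Because the keys are local to the inline function, no persistent Zxxx identity is needed to match the values on the call to the declared arguments.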

The normalized type

@DVrandecic (WMF): Apologies if this makes no sense! :) --GrounderUK (talk) 14:57, 9 June 2021 (UTC)

We say that the Z4/Type of every function evaluator call must be Z7/Function call, and I am not sure this should be true after normalization, as I said on phab:T277913. This is because the evaluation of a function call is not an object of type Z7; it is an object of the type given by the embedded Z8/Function’s Z8K2/return type.

For the example on the main page, I propose we should replace

{
"type": "Function call",
"function": {
 "type": "Function"
 ...}
...}

with

{
"type": "positive integer",
"value": {
 "type": "Function call",
 "function": {
  "type": "Function"
  ...}
 ...}
...}

That is, the Z7/Function call is embedded in an object of the expected type. Perhaps, though, we should do this only for function calls that are embedded in other objects (including function calls).--GrounderUK (talk) 09:54, 3 April 2021 (UTC)

It might be clearer to say that a function call is not an unevaluated value within an object but rather it is an object within which a value is unevaluated. That means we call the evaluator with the object we expect to get back. Looked at this way, it is not the function call that is embedded; it is the unevaluated value (or, strictly, an object representing the unevaluated value). So we change the label of Z7 to “unevaluated” (for example):

{
"type": "positive integer",
"value": {
 "type": "unevaluated",
 "function": {
  "type": "Function"
  ...}
 ...}
...}

--GrounderUK (talk) 10:27, 18 April 2021 (UTC)
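A small Python sketch of the proposed shape, using the labelled keys from the examples above. The helpers `wrap_in_return_type` and `is_unevaluated` are hypothetical names, and the choice of "value" as the wrapping key is an assumption taken from the example.

```python
def wrap_in_return_type(call, value_key="value"):
    """Embed a function call in an object of the type the call returns."""
    return {
        # The outer type is copied from the embedded function's return type.
        "type": call["function"]["return type"],
        value_key: call,
    }


def is_unevaluated(obj, value_key="value"):
    """True if the object's value is still a function call, not a literal."""
    value = obj.get(value_key)
    return isinstance(value, dict) and value.get("type") == "Function call"
```

Under this representation a literal and an unevaluated object carry the same outer type, so only the value needs to be inspected to decide whether evaluation is required.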

I think you are right, but I am not sure what change you are proposing. Yes, the type is not the function call, but the value resulting from that function call. The assumption is that, given functional transparency, these are effectively the same. This is not 100% true, but good enough.

And yes, you are right, it is not just any function call, but it has to be a function call that evaluates to a type, or else this would be invalid. -- DVrandecic (WMF) (talk) 21:35, 3 September 2021 (UTC)

@DVrandecic (WMF): Well, I admit to being confused myself! You don’t think I am talking about generic types, do you? I’m not. I am talking about the structure of a ZObject that requires evaluation. According to the main page, such an object must have {"Z1K1": "Z7"...}, which I find unhelpful. Instead, the type of such an object should be the return type of the function to be evaluated. In this case, then, the ZObject is a positive integer, and the value of the positive integer is whatever the function evaluates to. With this approach, incidentally, we have no need to “Allow Normalized ZObjects to Be Z7s” (phab:T277913 closed, invalid). Whether unnormalized Z7s should be supported is a separate question.--GrounderUK (talk) 21:42, 5 September 2021 (UTC)

@GrounderUK: Ah, thank you! That clarifies things a lot. Yes, I was indeed thinking you were talking about generic types.

Regarding Z7s, I see your point, and it might indeed be helpful to add the expected return type to a Z7 (or even explicitly type it as the expected return type instead of typing it as a Z7). In that case it would still be useful to have some trigger to say "this needs to be evaluated" instead of "this is a literal", I guess?

Right now, you can always look into the Z8K2/return type of the Z7K1/function of the function evaluation, and get that type. So it's quite close anyway. What do you think? --DVrandecic (WMF) (talk) 22:36, 24 September 2021 (UTC)

@DVrandecic (WMF): I prefer the explicit typing because then evaluated, unevaluated and literal values have consistent representations and are fully interchangeable. As it stands, if I wanted to change 2+2 to 2+(1+1), for example, I would have to change the type of the second argument from integer to Z7. In my suggested approach, however, the arguments are both always positive integers (a Z1 with {"Z1K1": "Z70"...}). In this case, it is pretty obvious whether the Z70K1 value is a literal or a Z1 that requires evaluation. If it is a literal, we won’t know whether the Z70 results from a previous evaluation. Similarly, we do not currently know whether a Z1 has been normalized. Overt state markers may be useful here but that is a whole other topic. Thinking about it, though, the examples above represent a hybrid approach, because the value paired with the Z70K1 key is a Z7. Perhaps it ought really to be a Z6 whose Z6K1 is a Z7? In any event, "Z7" continues to indicate that evaluation is required, it is just moved to refer only to the value paired with the Z70K1 key. That is why I also suggested changing the English label for "Z7" to "unevaluated".--GrounderUK (talk) 17:11, 26 September 2021 (UTC)

@GrounderUK: I have to admit that I read this a couple of times in the last few weeks, and repeatedly got a bit lost, sorry.

I try to restate what I think I understand: you would prefer each value, whether it is a literal or a function call, to be always typed as the resulting type.

Whether a value needs to be evaluated or not would then be decided from the inner representation of the value.

If I understand it so far, one question before proceeding: what if we don't know what the resulting type is, e.g. when using head on a list of unspecified type? Would the type then be Z1? -- DVrandecic (WMF) (talk) 20:53, 10 December 2021 (UTC)

@DVrandecic (WMF): I struggle to understand it myself, sorry! During normalization (if not before), a function call is interpreted as being (and re-written as) a Z1/ZObject of the type that would be returned by the evaluator (within the response object). I would therefore assume that the type would be defined by the Z8/Function’s Z8K2/return type. So it would be a Z1/any if and only if “Z1” is the literal value associated with the Z8K2 key. A “head” function may be an interesting exception to that simple rule, however. If “head” has a Z1 return type but can accept a typed list as input, then we know that the result must, in fact, have the type required by the list. So we could say that the as-yet-unevaluated object has the same type as the list and only in the case of a list of unspecified type would that type be “Z1”. That is logical enough (to my mind) but it is just a particular case of type inference, and how useful it might be, I cannot say. So, yeah, keeping it simple for the sake of argument: I think we do “know” what the resulting type is, it’s the Z8K2 value (even if that is a “known unknown” represented by “Z1”).--GrounderUK (talk) 00:12, 11 December 2021 (UTC)
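The rule described here could be sketched in Python roughly as follows. The representation of a typed list as a dictionary with an "element type" key is purely hypothetical, as is the helper `expected_type_of_head`; only Z8K2 and Z1 come from the discussion.

```python
ANY = "Z1"  # the "known unknown" type


def expected_type_of_head(head_function, list_argument):
    """Type to assign to an as-yet-unevaluated call of "head".

    head_function: a Z8-like dict with its return type under "Z8K2".
    list_argument: hypothetical list object that may carry an "element type".
    """
    declared = head_function["Z8K2"]
    if declared != ANY:
        # Simple rule: the declared return type is the expected type.
        return declared
    # Special case for "head": a typed list determines the type of its head.
    # A list of unspecified type leaves the result as Z1.
    return list_argument.get("element type", ANY)
```

For a head function declared with return type Z1, a list typed as Z70 yields Z70, while a list without an element type yields Z1.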

OK, that confirms that I understood what you meant, I think. But I am not sure I understand the advantage, because it would just make explicit the expected result type, which is available by looking up the return type of the function anyway. Is it to save that lookup? -- DVrandecic (WMF) (talk) 00:43, 11 December 2021 (UTC)

@DVrandecic (WMF): No, it’s not that simple. I think it goes back to the fundamental goal of normalization, which is consistency. My recollection is that I was uncomfortable with phab:T277913 because it proposed treating fundamentally equivalent representations as if they were different. My view is that normalization should bring such equivalences to the surface (if that’s not too much trouble). From a purely practical point of view, that avoids having to say that a value has to be of a particular type, or be a function that returns a value of that type, or be a reference to an object representing a value of that type, or be a Z1/any (so long as the Z1 is not incapable of resolving to a value of the particular type). And that brings us back to en:type inference. It seems to me that nesting and batching functions will obscure the true types of values (just as their inclusion in a list may). Whether the true (underlying) type is automatically inferred or simply asserted, my preferred representation allows the object to be clear about what type of value it represents. That in turn allows the evaluator to detect an error in the case where the resultant value is not (or cannot be converted into) a value of the expected type. (Equivalently, it would allow the required conversion to be made explicit during normalization.)--GrounderUK (talk) 19:45, 12 December 2021 (UTC)

Sorry, still just reformulating: so should { type: Number, value: 4 } be normalized to the same value as { type: function call, function: addition, left: { type: number, value: 2 }, right: { type: number, value: 2 } }? -- DVrandecic (WMF) (talk) 19:57, 23 December 2021 (UTC)

No problem. The way I see it, both are ways to represent numbers. So it is { type: Number…} in both cases. In the first case, we have an explicit value: the literal string “4”. In the second case we have a different representation that will evaluate to the literal string “4”. In both cases, the literal string “4” (in context) represents the same value: the number 4. For the second case, we would get

{ type: Number, value: { type: function call, function: addition, left: { type: Number, value: “2”}, right: {type: Number, value: “2”}}}

And in the case where the number 4 is represented by 2+(1+1), we would have

{ type: Number, value: { type: function call, function: addition, left: { type: Number, value: “2”}, right: { type: Number, value: { type: function call, function: addition, left: { type: Number, value: “1”}, right: {type: Number, value: “1”}}}}}

(Just to pick up on something I said earlier, but not to pursue it, we might represent the number 4 as { type: Number, value: { type: String, value: “4”}}. It would certainly seem appropriate for that representation to have the same normalization as { type: Number, value: “4”}.) --GrounderUK (talk) 22:02, 23 December 2021 (UTC)
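To make the interchangeability concrete, here is a Python sketch that evaluates objects of this shape recursively. It supports only the "addition" function from the examples, and `evaluate` is a hypothetical helper rather than the evaluator's real interface.

```python
def evaluate(obj):
    """Return obj with an unevaluated value replaced by a literal string."""
    value = obj["value"]
    if isinstance(value, str):
        # Already a literal: nothing to evaluate.
        return obj
    if value.get("type") != "function call" or value.get("function") != "addition":
        raise ValueError("this sketch only supports calls to addition")
    # Operands are themselves typed objects, literal or unevaluated.
    left = evaluate(value["left"])
    right = evaluate(value["right"])
    total = int(left["value"]) + int(right["value"])
    # The outer type is kept; only the value changes.
    return {"type": obj["type"], "value": str(total)}


one = {"type": "Number", "value": "1"}
two = {"type": "Number", "value": "2"}
one_plus_one = {
    "type": "Number",
    "value": {"type": "function call", "function": "addition", "left": one, "right": one},
}
two_plus_two = {
    "type": "Number",
    "value": {"type": "function call", "function": "addition", "left": two, "right": two},
}
two_plus_one_plus_one = {
    "type": "Number",
    "value": {"type": "function call", "function": "addition", "left": two, "right": one_plus_one},
}
```

Replacing the literal `two` with `one_plus_one` changes only the value of the right operand; its type stays Number, and both expressions evaluate to the same literal object.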

I agree that { type: Number, value: { type: String, value: “4”}} and { type: Number, value: “4”} should have the same normalization (and that's also the case already).

The type of the value on the key value of the type Number is a string, so in your example, the result of a function call to add would need to be a string instead of a Number, which is possible, but would feel wrong. Did I miss something? -- DVrandecic (WMF) (talk) 22:57, 23 December 2021 (UTC)

Ultimately, the value in the key–value pair is a string, but a “function call” does not return a value (or a key–value pair); it returns a value-object. This is why I suggested we might re-label “Z7”. The function call is the value-object containing the unevaluated (string) value and the result of the function call is the value-object containing the evaluated (string) value (a literal). The Z4/Type applies to the value-object (Z1/ZObject), not the value in the key–value pair. So, whether or not its (string) value is evaluated, the value-object is (in this case) a number. I don’t know if we have a way to refer to the “type” of a value in a key–value pair (String | List | Record sense of ZObject), but I see the “type” of an unevaluated value as “Record” (i.e., Z1/ZObject). We could insist that the Z1 has to be a Z6, in the first instance, but I think that just leads to an infinite nest of Z6s.--GrounderUK (talk) 00:46, 24 December 2021 (UTC)

Re "I don’t know if we have a way to refer to the “type” of a value in a key–value pair": the key definition on a Z4 should tell you the type of the value in a key-value pair.

OK, I think I am starting to understand. Would you do the same for a reference, i.e. tell the type explicitly, instead of giving it as a Z9?

This way we would for each object always know the type, even if the object then is given as a literal, or as a reference, or as a function call, right? --DVrandecic (WMF) (talk) 21:41, 9 February 2022 (UTC)

Ah, sorry, @Denny, I didn’t see your reply till now. I am not 100% certain that there are no exceptions, but yes, I think that objects containing a reference should generally have the type that they will acquire or retain after substitution. Curiously, though, Z9s seem a bit redundant here, because a simple reference string value would become a properly typed object. For example, {type: Number, value: “4”} becomes {type: {type: Type, identity: {type: Reference, reference id: Number}}, value: “4”}. But is this any more explicit than {type: {type: Type, identity: Number}, value: “4”}? I suppose I believe that a key that demands a Z2K1 value should not have string as its value type. Then a string value in such a pair must be a Z2K1/id rather than a Z6/string. In other words, we could (just possibly) have no need for reference objects, as such, just objects of the appropriate type for the referenced object (referenced in a key value that requires a Z2K1/id, according to its Z3K1/value type). GrounderUK (talk) 17:56, 4 April 2022 (UTC)
