Abstract Wikipedia/Local vs global keys

This page is currently a draft. More information pertaining to this may be available on the talk page.


Context

There are some open questions regarding local and global keys, in particular with respect to generic types.

The topic also appears when working on scoping as it is relevant for higher-order functions (e.g. T309195).

It turns out there are multiple uses of local keys and subtle differences between them. I therefore thought it would be a good idea to summarize the current state. I hope that helps guide the discussion on which ones we want to support.

Some background on local keys can be found here: Abstract Wikipedia/Local keys

Local keys as arguments to function calls

Global objects are generally required to have a global ZID, e.g. Z10000. Their fields should have global keys that match the ZID of their Z1K1/type. For example, if an object has type Z10000, then its fields should be Z10000K1, Z10000K2, etc.

{
   Z1K1/type: Z10000
   Z10000K1/first field: ...
   Z10000K2/second field: ...
}

When calling a global function, we also provide its arguments with global keys. Example:

{
   Z1K1/type: Z7/function call,
   Z7K1/function: Z10000,
   Z10000K1/first argument: …,
   Z10000K2/second argument: …
}

Providing the arguments using local keys works too. This is equivalent:

{
   Z1K1/type: Z7/function call,
   Z7K1/function: Z10000,
   K1: …,
   K2: …
}
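
As a rough illustration of this equivalence, the following Python sketch rewrites local argument keys into the global keys of the called function. It assumes that ZObjects are plain dicts with bare keys (without the "/label" annotations used on this page). The helper name globalize_call_keys is made up for this example, and this is not necessarily how the orchestrator handles it.

```python
import re

# Local keys have the form K1, K2, ... (no ZID prefix).
LOCAL_KEY = re.compile(r"K[1-9]\d*")


def globalize_call_keys(call):
    """Rewrite local argument keys as global keys of the called function."""
    function_zid = call["Z7K1"]
    if not isinstance(function_zid, str):
        # The function is not a plain reference (e.g. it is an argument
        # reference), so its ZID is unknown here: leave the keys untouched.
        return dict(call)
    return {
        (function_zid + key if LOCAL_KEY.fullmatch(key) else key): value
        for key, value in call.items()
    }


local_call = {"Z1K1": "Z7", "Z7K1": "Z10000", "K1": "a", "K2": "b"}
global_call = {"Z1K1": "Z7", "Z7K1": "Z10000", "Z10000K1": "a", "Z10000K2": "b"}
assert globalize_call_keys(local_call) == global_call
```

Note that the rewrite is only possible when Z7K1 is a plain reference to a global function; otherwise the sketch leaves the call unchanged.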

The need for local keys arises from higher-order functions and generic types. If function Z10000 (say, a sorting function) takes another function as its argument Z10000K1 (say, a comparison function), then there is no way to know in advance which function will be passed, and so what its keys will look like. We could provide either the Z30001 function (that compares numbers), or the Z30002 function (that compares numbers in the reverse order, to produce a reverse sorted list), or the Z30003 function (that compares strings, if we’re sorting a list of strings).

Therefore, we cannot know whether the argument’s name is Z30001K1 or Z30002K1 or Z30003K1, and we have to call the function using local keys. In other words, this would be wrong:

…
{
   Z1K1/type: Z7/function call,
   Z7K1/function: {Z1K1: Z18/argument reference, Z18K1/id: Z10000K1},
   Z30001K1: …
}
…

We might not call Z10000 with Z30001 as an argument, but with Z30002 or Z30003 instead. The following is the correct way to write this:

…
{
   Z1K1/type: Z7/function call,
   Z7K1/function: {Z1K1: Z18/argument reference, Z18K1/id: Z10000K1},
   K1: …
}
…
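
To make the positional reading concrete, here is a small Python sketch of how local keys in a call could be bound to whatever argument keys the passed-in function declares. It assumes ZObjects are plain dicts with bare keys; bind_arguments is a hypothetical helper, the two-argument call is invented for the example, and the actual orchestrator may do this differently.

```python
def bind_arguments(declared_keys, call):
    """Map each declared argument key to the value given by position in the call."""
    bound = {}
    for position, declared_key in enumerate(declared_keys, start=1):
        local_key = f"K{position}"  # K1 is the first argument, K2 the second, ...
        if local_key in call:
            bound[declared_key] = call[local_key]
    return bound


# A call through an argument reference: the callee is not known in advance.
call = {
    "Z1K1": "Z7",
    "Z7K1": {"Z1K1": "Z18", "Z18K1": "Z10000K1"},
    "K1": 1,
    "K2": 2,
}

# Whichever comparison function is passed in, the same call binds its arguments.
assert bind_arguments(["Z30001K1", "Z30001K2"], call) == {"Z30001K1": 1, "Z30001K2": 2}
assert bind_arguments(["Z30003K1", "Z30003K2"], call) == {"Z30003K1": 1, "Z30003K2": 2}
```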

In fact we could always use this schema when calling functions, i.e. always use local keys and never global keys! We could even enforce this to be the only way to call functions. It would not even be much different from how function calls normally work in most programming languages, that is by referring to their position: K1 refers to the first argument of the function, whatever its name, K2 to the second, etc.

There might still be value in knowing the global keys, in particular for labeling in the UI (I am not sure exactly how that works). However that information should already be in the Z7K1 object, if at all (for example that information would not be available in the context of higher-order functions). In fact, that’s one of the reasons for generic function types: the information about the argument keys would always be available.

Local keys as argument IDs inside local functions

A similar question arises when defining (and/or returning) local functions, e.g. for currying, since a local function by definition does not have a global ZID. In that case, we could use local keys in the argument declarations:

{
   Z1K1/type: Z8/function,
   Z8K1/argument declarations: [
     Z17,
     {Z1K1/type: Z17/argument declaration, …, Z17K2/key id: “K1”}
   ],
   Z8K4/implementations: …
   …
}

And then we can use the local key to refer to the argument in the implementation of the function:

{
   …
   Z8K4/implementations: {
       …
       {Z1K1/type: Z18/argument reference, Z18K1: K1}
       …
   }
}

However, it is unfortunately not always possible to do that! In particular, it breaks down when defining nested local functions, because the arguments of the different functions can no longer be told apart. With the current semantics, an argument reference that uses a local key refers to the inner function only!

{
   Z1K1/type: Z8/function,
   Z8K1/argument declarations: [
     Z17,
     {Z1K1/type: Z17/argument declaration, …, Z17K2/key id: “K1”}
   ],
   Z8K4/implementations: {
       …
       {
           Z1K1/type: Z8/function,
           Z8K1/argument declarations: [
             Z17,
             {Z1K1/type: Z17/argument declaration, …, Z17K2/key id: “K1”}
           ],
           Z8K4/implementations: {
               …
               // This K1 refers to the argument of the inner function.
               {Z1K1/type: Z18/argument reference, Z18K1: K1}
               // How to refer to the K1 of the outer function?
               …
           }
       }
       …
   }
}
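
The shadowing can be mimicked with a chain of scopes. The Python sketch below is only a model of the behaviour described above, assuming scopes are looked up innermost first; the resolve helper and the scope contents are made up for illustration.

```python
from collections import ChainMap

# One scope per function; the inner function's scope is searched first.
outer_scope = {"K1": "outer value"}
inner_scope = {"K1": "inner value"}
scope = ChainMap(inner_scope, outer_scope)


def resolve(argument_reference, scope):
    """Look up the value an argument reference (Z18) points to."""
    return scope[argument_reference["Z18K1"]]


# Both functions declare K1, so the outer argument is hidden by the inner one.
assert resolve({"Z1K1": "Z18", "Z18K1": "K1"}, scope) == "inner value"
```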

This problem is also related to the notion of identity, which is a field (Z8K5) that contains the global ZID of a function and that is supposed to indicate the true identity of that function, but which doesn’t work for local functions (at least in its current form).

An alternative is to use bogus ZIDs in the local function. For example, the previous function could be written as:

{
   Z1K1/type: Z8/function,
   Z8K1/argument declarations: [
     Z17,
     {Z1K1/type: Z17/argument declaration, …, Z17K2/key id: “Z10001K1”}
   ],
   Z8K4/implementations: {
       …
       {
           Z1K1/type: Z8/function,
           Z8K1/argument declarations: [
             Z17,
              {Z1K1/type: Z17/argument declaration, …, Z17K2/key id: “Z10002K1”}
           ],
           Z8K4/implementations: {
               …
               // This refers to the argument of the inner function.
               {Z1K1/type: Z18/argument reference, Z18K1: Z10002K1}
               // This refers to the argument of the outer function.
               {Z1K1/type: Z18/argument reference, Z18K1: Z10001K1}
               …
            },
            Z8K5/identity: Z10002
       }
       …
   },
   Z8K5/identity: Z10001
}

This works and has been successfully tested in the current implementation of the orchestrator! We can even use that bogus ZID as an identity for the local function! This solution might be a bit weird, since the ZID is not really a “proper” one; it is not really associated with the object in the DB, and might in fact already be used by another object (but it would still work!). But it is not completely crazy, and even makes sense in that we are giving a name to the local function, which refers to that function in the context of that function, even if it might be locally shadowing that of another global object.
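
Why distinct key IDs resolve the ambiguity can be seen in a toy model of nested scopes. The Python sketch below assumes scopes are searched innermost first and is not taken from the orchestrator; the resolve helper and the values are invented for illustration.

```python
from collections import ChainMap

# Each local function gets its own (bogus) ZID, so the argument keys differ.
outer_scope = {"Z10001K1": "outer value"}
inner_scope = {"Z10002K1": "inner value"}
scope = ChainMap(inner_scope, outer_scope)


def resolve(argument_reference, scope):
    """Look up the value an argument reference (Z18) points to."""
    return scope[argument_reference["Z18K1"]]


# Inside the inner function, both arguments are now reachable.
assert resolve({"Z1K1": "Z18", "Z18K1": "Z10002K1"}, scope) == "inner value"
assert resolve({"Z1K1": "Z18", "Z18K1": "Z10001K1"}, scope) == "outer value"
```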

A question that arises is therefore: do we allow such usage of ZIDs, and/or how do we otherwise address the problem of nested local functions? One possibility is to not support nested local functions, as one level of local functions is enough to express all functions (although it is admittedly more cumbersome, as one has to extract functions to the top level).

On the use of ZKeys for arguments in functions

Note that a lot of this stems from the fact that we are overloading ZKeys for identifying both object fields and function arguments.

Notice that nowhere in the definition of a Z8/function do we use that ID as an actual key! The ID only appears as a string in Z17/argument declarations and in Z18/argument references. The only place where these IDs are used as actual field keys is in function calls. But this could be avoided by providing the arguments as a list instead:

{
   Z1K1/type: Z7/function call,
   Z7K1/function: Z10000,
   Z7K2/arguments: [
     {
         argument name: “Z10000K1”,
         argument value: …
     },
     {
         argument name: “Z10000K2”,
         argument value: …
     }
   ]
}

In fact, doing so we would completely free up the restrictions on the form of the ID and let the functions choose whatever name they want instead of ZKeys.
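
For comparison, the following Python sketch converts a call in the current keyed form into the list-based form proposed above. It assumes ZObjects are plain dicts with bare keys; Z7K2 and the "argument name" / "argument value" fields come from the proposal on this page and do not exist today, and call_to_argument_list is a made-up helper name.

```python
def call_to_argument_list(call):
    """Move every argument of a keyed function call into a Z7K2 list."""
    reserved = {"Z1K1", "Z7K1"}  # keys of the function call itself
    return {
        "Z1K1": call["Z1K1"],
        "Z7K1": call["Z7K1"],
        "Z7K2": [
            {"argument name": key, "argument value": value}
            for key, value in call.items()
            if key not in reserved
        ],
    }


keyed_call = {"Z1K1": "Z7", "Z7K1": "Z10000", "Z10000K1": "a", "Z10000K2": "b"}
assert call_to_argument_list(keyed_call) == {
    "Z1K1": "Z7",
    "Z7K1": "Z10000",
    "Z7K2": [
        {"argument name": "Z10000K1", "argument value": "a"},
        {"argument name": "Z10000K2", "argument value": "b"},
    ],
}
```

In this form the argument names are ordinary strings, so nothing in the structure of the call itself forces them to look like ZKeys.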

An alternative solution suggested by user Michael Ringaard is to structure function calls instead as { Z1K1: Z10000, Z10000K1: … }, where Z10000 is the called function. This unifies the representation with that of other ZObjects and removes the awkward use of keys that function calls currently have. The interpretation is that a function is the type of its function calls, or conversely, types are functions that can be called to construct instances of that type (similar to constructors in object-oriented programming). See also this discussion.

It is not clear if and how overloading the notion of ZKeys for both object keys and function arguments is problematic (could it clash with generic types?).

Local keys as fields of objects of generic types

Similarly to local functions, generic types are achieved by defining local types. For example, the generic list type is currently defined by a function that takes a type as its argument Z881K1 and returns a local type with two fields, K1/head and K2/tail.

{
   Z1K1: Z4/type,
   Z4K2/keys: [
       Z3,
       {
           Z1K1/type: Z3/key,
           Z3K1/value type: {
               Z1K1: Z18/argument reference,
               Z18K1/key id: Z881K1
           },
           Z3K2/key id: "K1",
           ...
       },
       {
           Z1K1/type: Z3/key,
           Z3K1/value type: {
               Z1K1: Z7/function call,
               Z7K1: Z881/typed list,
               Z881K1/type: {
                   Z1K1: Z18/argument reference,
                   Z18K1/key id: Z881K1
               }
           },
           Z3K2/key id: "K2",
           ...
       }
   ],
   ...
}

The list [“a”, “b”, “c”] can then be written as:

{
   Z1K1/type: {
       Z1K1: Z7/function call,
       Z7K1/function: Z881/typed list,
       Z881K1/type: Z6/string
   },
   K1: “a”,
   K2: {
       …
   }
}

Unlike functions however, there is no ambiguity that can arise because the local keys are always referring to fields of the Z1K1 of the current object.

It is not clear that this is always equivalent to using global keys. For example, could one write a function like this?

{
   Z1K1/type: Z8/function
   K1/argument declarations: …,
   K4/implementations: …
}

This would clash with the local key notation needed by function calls, since a function call would need to specify both the Z7K1 of the function call and the K1 argument of the function being called. However, it would become possible if we changed the way functions are called, as described in the previous section (or changed the way Z7 objects are represented).

Local keys as field IDs in generic type definitions

Similarly to functions, one could ask whether it is necessary to use local keys to identify fields in generic types. That is, could we use bogus global ZIDs for the keys (and/or the identity field) instead? Could one write the type as:

{
   Z1K1: Z4/type,
   Z4K1/identity: "Z80000",
   Z4K2/keys: [
       Z3,
       {
           Z1K1/type: Z3/key,
           Z3K1/value type: {
               Z1K1: Z18/argument reference,
               Z18K1/key id: Z881K1
           },
           Z3K2/key id: "Z80000K1",
           ...
       },
       {
           Z1K1/type: Z3/key,
           Z3K1/value type: {
               Z1K1: Z7/function call,
               Z7K1: Z881/typed list,
               Z881K1/type: {
                   Z1K1: Z18/argument reference,
                   Z18K1/key id: Z881K1
               }
           },
           Z3K2/key id: "Z80000K2",
           ...
       }
   ],
   ...
}

And the [“a”, “b”, “c”] list as:

{
   Z1K1/type: {
       Z1K1: Z7/function call,
       Z7K1/function: Z881/typed list,
       Z881K1/type: Z6/string
   },
   Z80000K1/head: “a”,
   Z80000K2/tail: {
       …
   }
}

Should we allow such usage? It is not clear what the advantage would be, given that there is no ambiguity problem as there is for functions, but there doesn’t seem to be any harm in it (any more than for local functions). Note also that this is already in use! The types of error values are generated by a function, and these generated types currently use global keys.