Term rewriting

December 7, 2018 · View on GitHub

Living Document. J. S. Choi, 2018-12.

Core Proposal

Pipelines can be rewritten into a nested do expression. There are many ways to illustrate this equivalency. (It can also be less simply rewritten without do expressions.) The simplest way is to use a single do expression containing a series of autogenerated variable declarations, in which the variables are arbitrary except they are lexically hygienic: that is, the variables can be anything as long as they do not conflict with other already-existing variables.

With this notation, each line in this example would be equivalent to all the other lines.

1 |> # + 2 |> # * 3;

// Static term rewriting
do { const _0 = 1, _1 = _0 + 2; _1 * 3; };

// Runtime evaluation
do { const _1 = 1 + 2; _1 * 3; };
do { 3 * 3; };
9;

Consider also the motivating first example above:

promise
|> await #
|> # || throw new TypeError(
  `Invalid value from ${promise}`)
|> doubleSay(#, ', ')
|> capitalize
|> # + '!'
|> new User.Message(#)
|> await stream.write(#)
|> console.log;

This would be statically equivalent to the following:

do {
  const
    _0 = promise,
    _1 = await _0,
    _2 = _1 || throw new TypeError(
      `Invalid value from ${promise}`),
    _3 = doubleSay(_2),
    _4 = capitalize(_3);
    _5 = _4 + '!',
    _6 = new User.Message(_5),
    _7 = await stream.write(_6);
  console.log(_7);
}

Suppose that we can generate a series of new lexically hygienic variables (_0, _1, _2, _3, …). Each autogenerated variable will replace every topic reference # in each consecutive pipeline step. These _n variables need only be unique within their lexical scopes: they must not shadow any outer lexical environment’s variable, and they must not be shadowed by any deeply inner lexical environment’s variable.

With this notation, then in general, given a pipeline:
𝐸₀ |> 𝐸₁ |> 𝐸₂ |> … |> 𝐸ᵤ₋₂ |> 𝐸ᵤ₋₁,
…then that pipeline is equivalent to:
do {
  const
    #₀ = 𝐸₀ ,
    #₁ = sub(𝐸₁, #₀) ,
    #₂ = sub(𝐸₂, #₁) ,
    … ,
    #ᵤ₋₂ = sub(𝐸ᵤ₋₂, #ᵤ₋₃) ;
  sub(𝐸ᵤ₋₁, #ᵤ₋₂) ;
},\

If 𝑃 is a bare function call – then sub(𝑃, #) is 𝑃 ( # ).
If 𝑃 is a bare awaited function call – then sub(𝑃, #) is await 𝑃 ( # ).
If 𝑃 is a bare constructor call – then sub(𝑃, #) is new 𝑃 ( # ).
If 𝑃 is in topic style – then sub(𝑃, #) is 𝑃 but in which all unshadowed instances of the topic reference # are replaced by #.

Additional Feature BP

Using the same notation from the first subsection, then in general:

If 𝑃 is a bare function call – then sub(𝑃, #) is 𝑃 ( # ).
If 𝑃 is a bare awaited function call – then sub(𝑃, #) is await 𝑃 ( # ).
If 𝑃 is a bare constructor call – then sub(𝑃, #) is new 𝑃 ( # ).
If 𝑃 is in topic style – then sub(𝑃, #) is 𝑃 but in which all unshadowed instances of the topic reference # are replaced by #.
If 𝑃 is in the form { 𝑆₀ ; 𝑆₁ ; … ; 𝑆ᵥ₋₂ ; 𝑆ᵥ₋₁ ; } – then sub(𝑃, #) is do { sub(𝑆₀, #) ; sub(𝑆₁, #) ; … ; sub(𝑆ᵥ₋₂, #) ; sub(𝑆ᵥ₋₁, #) ; }.

Additional Feature NP

Adapted from a previous example:

x = (a, b, ...c, d, e)
|> f(##, x, ...)
|> g;

This would be statically equivalent to the following:

x = do {
  const
    [_0, __0, ...s_0] = [a, b, ...c, d, e]
    _1 = f(__0, x, ...s_0);
  g(_1);
};

Another one:

x = (a, b)
|> (f, # ** c + ##)
|> # - ##
|> g;

This would be statically equivalent to the following:

x = do {
  const
    [_0, __0] = [a, b]
    [_1, __1] = [f(_0), _0 ** c + __0];
    _2 = _1 - __1;
  g(_2);
};

From a previous Lodash example:

x = number
|> `${#}e`
|> ...#.split('e')
|> `${#}e${+## + precision}`
|> func;

This would be statically equivalent to the following:

x = do {
  const
    _0 = number,
    _1 = `${_0}e`,
    [_2, __2] = [..._1.split('e')];
  func(_2, __2);
};

…which of course may be simplified to:

x = do {
  const
    _0 = number,
    _1 = `${_0}e`,
    [_2, __2] = [..._1.split('e')];
  func(_2, __2);
}

From a [previous WHATWG Streams example][WHATWG Streams + CP + BP + PF + NP] with Additional Feature BC:

x = value
|> (#, offset, #.byteLength - offset)
|> new Uint8Array
|> await reader.read
|> (#.buffer, #.byteLength)
|> readInto(#, offset + ##);

This would be statically equivalent to the following:

x = do {
  const
    [_0] = [value],
    [_1, __1, ___1] = [_0, offset(_0), __0.byteLength - offset],
    _2 = new Uint8Array(_1),
    _3 = await reader.read(_2),
    [_4, __4] = [_3.buffer, _3.byteLength];
  readInto(_4, offset + __4);
};

…which of course may be simplified to:

x = do {
  const
    _0 = value,
    _1 = _0,
    __1 = offset,
    ___1 = _0.byteLength - offset,
    _2 = new Uint8Array(_1),
    _3 = await reader.read(_2),
    _4 = _3.buffer,
    __4 = _3.byteLength;
  readInto(_4, offset + __4);
}

Using the same notation from the first subsection, then consider any pipeline:
𝐸₀ |> 𝐸₁ |> 𝐸₂ |> … |> 𝐸ᵤ₋₂ |> 𝐸ᵤ₋₁
…in which, for each i from 0 until n−1, 𝐸ᵢ is either:

A single expression 𝐸ᵢ[0] (which may start with ...).
Or an argument list ( 𝐸ᵢ[0] , 𝐸ᵢ[1] , … , 𝐸ᵢ[width(𝐸ᵢ)−2] , 𝐸ᵢ[width(𝐸ᵢ)−1] ), where each element of the argument list may be an expression, an expression starting with ..., or a blank elision.

The last pipeline step, 𝐸ᵤ₋₁, is an exception: it must be a single expression that does not start with ..., and it cannot be a parenthesized argument list either.

The pipeline is therefore equivalent to:
do {
  const
    [ #₀[0] , … , #₀[max topic index(𝐸₀)] , ... #₀[r] ] =
      [
        𝐸₀[0] ,
        𝐸₀[1] ,
        … ,
        𝐸₀[width(𝐸₀)−2]
        𝐸₀[width(𝐸₀)−1]
    ] ,
    [ #₁[0] , … , #₁[max topic index(𝐸₁)] , ... #₁[r] ] =
      [
          sub(𝐸₁[0], #₀[0], #₀[1], #₀[2], #₀[r]) ,
          sub(𝐸₁[1], #₀[0], #₀[1], #₀[2], #₀[r]) ,
          …
          sub(𝐸₁[width(𝐸₁)−2], #₀[0], #₀[1], #₀[2], #₀[r]) ,
          sub(𝐸₁[width(𝐸₁)−1], #₀[0], #₀[1], #₀[2], #₀[r]) ,
      ] ,
    [ #₂[0] , … , #₂[max topic index(𝐸₂)] , ... #₂[r] ] =
      [
          sub(𝐸₂[0], #₀[0], #₀[1], #₀[2], #₀[r]) ,
          sub(𝐸₂[1], #₀[0], #₀[1], #₀[2], #₀[r]) ,
          …
          sub(𝐸₂[width(𝐸₂)−2], #₀[0], #₀[1], #₀[2], #₀[r]) ,
          sub(𝐸₂[width(𝐸₂)−1], #₀[0], #₀[1], #₀[2], #₀[r]) ,
      ] ,
    … ,
    [ #ᵤ₋₂[0] , #ᵤ₋₂[1] , … , ... (#ᵤ₋₂)ₛ ] =
      [
          sub(𝐸ᵤ₋₂[0], #₀[0], #₀[1], #₀[2], #₀[r]) ,
          sub(𝐸ᵤ₋₂[1], #₀[0], #₀[1], #₀[2], #₀[r]) ,
          … ,
          sub(𝐸ᵤ₋₂[width(𝐸ᵤ₋₂)−2], #₀[0], #₀[1], #₀[2], #₀[r]) ,
          sub(𝐸ᵤ₋₂[width(𝐸ᵤ₋₂)−1], #₀[0], #₀[1], #₀[2], #₀[r]) ,
      ] ;
  sub(𝐸ᵤ₋₁, #ᵤ₋₂[0], #ᵤ₋₂[1], …, #ᵤ₋₂[width(𝐸ᵤ₋₂)−1]) ;
}.

[TODO: Define width(𝐸) and max topic index(𝐸).]

If 𝑃 is a bare function call – then sub(𝑃, #[0], #[1], …, #[m]) is 𝑃 ( [TODO] ).
If 𝑃 is a bare awaited function call – then sub(𝑃, #[0], #[1], …, #[m]) is await 𝑃 ( [TODO] ).
If 𝑃 is a bare constructor call – then sub(𝑃, #[0], #[1], …, #[m]) is new 𝑃 ( [TODO] ).
[TODO] If 𝑃 is in topic style – then sub(𝑃, #[0], #[1], #[2], #ₛ) is 𝑃 but in which all unshadowed instances of the primary topic reference # are replaced by #[0], unshadowed instances of the secondary topic reference ## are replaced by #[1], unshadowed instances of the tertiary topic reference ### are replaced by #[2], and unshadowed instances of the rest topic reference ... are replaced by ... #ₛ.

[Unix pipe]: https://en.wikipedia.org/wiki/Pipeline_(Unix [untangled flow]: ./goals.md#untangled-flow [Visual Basic’s select statement]: https://docs.microsoft.com/en-us/dotnet/visual-basic/language-reference/statements/select-case-statement [WebKit console variables]: https://webkit.org/blog/829/web-inspector-updates/ [WHATWG Fetch + CP]: ./core-real-examples.md#whatwg-fetch-standard [WHATWG Fetch Standard]: https://fetch.spec.whatwg.org/ [WHATWG Streams + CP + BP + PF + NP]: ./additional-feature-np.md#whatwg-streams-standard-core-proposal--additional-features-bppfmt [WHATWG Streams + CP + BP + PF]: ./additional-feature-pf.md#whatwg-streams-standard-core-proposal--additional-feature-bppf [WHATWG Streams Standard]: https://stream.spec.whatwg.org/ [WHATWG-stream piping]: https://streams.spec.whatwg.org/#pipe-chains [Wikipedia: term rewriting]: https://en.wikipedia.org/wiki/Term_rewriting [zero runtime cost]: ./goals.md#zero-runtime-cost