8 June, 2022 Meeting Notes
July 9, 2022
Remote attendees:
| Name | Abbreviation | Organization |
|---|---|---|
| Michael Saboff | MLS | Apple |
| Istvan Sebestyen | IS | Ecma International |
RegEx atomic operators
Presenter: Ron Buckton (RBN)
RBN: Good morning. Today I'll be talking about a proposal for adding atomic operators to ECMAScript regular expressions. Atomic operators are designed to prevent matching patterns that have already been excluded by alternatives. The goal is to avoid backtracking in cases where you want to match a specific set of characters but don't want to include other characters that come after it when those matches overlap. In the example on the slide, the first pattern, which uses a regular group, can match both ABCC and ABC. When it first tries to match ABC, it matches the A in the first part and the BC in the capture group, then fails to match the C in the third term. It backtracks to the beginning of that group, matches the B, and then tries to match the C again. This is a capability of current capture groups, but an atomic group would prevent that behavior: if you match the BC within the group, it would not backtrack and try again when it fails to match the following character.
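The slide is not reproduced in these notes, so the following is only a sketch of the behavior RBN describes. The patterns are assumptions, and because ECMAScript has no atomic group syntax today, the atomic case is emulated with a lookahead (which is already atomic) that captures, followed by a backreference that consumes the capture.

```javascript
// Assumed pattern with a regular group: can match both "abcc" and "abc".
const regularGroup = /^a(bc|b)c$/;

// Emulation of the proposed a(?>bc|b)c. The lookahead commits to the
// first alternative that matches; \1 then consumes exactly that text.
const atomicLike = /^a(?=(bc|b))\1c$/;

const inputs = ["abcc", "abc"];
const regularResults = inputs.map((s) => regularGroup.test(s));
const atomicResults = inputs.map((s) => atomicLike.test(s));

console.log(regularResults); // [ true, true ]
console.log(atomicResults);  // [ true, false ]
```

With the regular group, "abc" matches only after backtracking into the group and taking the second alternative. The emulated atomic group never retries, so "abc" fails.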
RBN: One of the other capabilities is that atomic groups and other atomic operators allow you to improve performance by avoiding catastrophic backtracking. This is often a common factor in regex denial-of-service attacks, or vulnerabilities rather. An example of catastrophic backtracking that you can see here is an input that consists of about 30 characters that repeat, plus one that doesn't match where the terminal character is expected. A regular pattern could attempt to match some mix of repeating and non-repeating characters in a loop, plus a final terminal character: in this case, any number of A characters or a single X character, any number of times. This is catastrophic backtracking because every time it attempts to match, it will first try to match all 29 A characters, in this case, and fail to match the final B character. Then it will backtrack to the beginning of the group. This also occurs with quantifiers, because repeat matchers essentially add an infinite number of possible alternatives. So then it's going to try matching 28 As and possibly an X. And when it matches 28 As, it's going to see the repeat for the quantifier and then try to match the 29th A again. So you can see this will constantly retry the same expression over and over again, and the operation is O(2^n), so it's an exponential operation.
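A rough sketch of this kind of pattern follows. The exact pattern on the slide is not in the notes, so `(a+|x)*b` is an assumption chosen to fit the description, and the input is shortened to 20 characters so the exponential case still finishes quickly. The atomic variant is again emulated with a capturing lookahead plus a backreference.

```javascript
// 20 repeated characters followed by one that is not the expected "b".
const failingInput = "a".repeat(20) + "c";

// Assumed pattern: nested repetition gives the matcher exponentially many
// ways to split the run of "a"s before it can report failure.
const backtracking = /^(a+|x)*b$/;

// Emulated atomic group around the loop: the run is matched once, and the
// matcher never re-enters the loop after "b" fails to match.
const atomicLike = /^(?=((?:a+|x)*))\1b$/;

const slowResult = backtracking.test(failingInput); // false, after many retries
const fastResult = atomicLike.test(failingInput);   // false, after one pass

// Both forms still accept inputs that really do end in "b".
const bothAccept = backtracking.test("aaxab") && atomicLike.test("aaxab");

console.log(slowResult, fastResult, bothAccept);
```

Lengthening the input by one character roughly doubles the work for the first pattern, while the second stays linear.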
RBN: Atomic groups significantly improve performance here. An atomic group around this would first attempt to match the full set of A characters and succeed. Then, when it fails to match the B character, the entire match fails. The two types of atomic operators I'm looking to propose today are atomic groups and possessive quantifiers, and I'll go into a little more detail about those in later slides.
RBN: But first, it's important to talk about what backtracking is. We had this discussion when I presented this at a previous plenary. Backtracking occurs when the next alternative is attempted in a subpattern on the left after the rest of the pattern on the right fails. So here you can see that A matches the first A of the string, but then the pattern fails to match the C when it attempts to match the second character. So we've tried the group and then we tried the following term. When that fails, we backtrack to the beginning of that group and try the next alternative. In this case, we successfully match AB against the first two characters and then successfully match C against the following character. How we handle this in the spec today is via continuation passing. I've included an example of the spec text: a matcher that takes in a continuation. When the match on the left fails in step 3.c, then in step 3.d we attempt the match on the right, passing the continuation along.
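The backtracking step described here can be observed in any current engine. The pattern below is an assumption reconstructed from the description (first alternative A, second alternative AB, followed by C).

```javascript
// The first alternative "a" matches, but "c" then fails against "b".
// The engine backtracks into the group and tries "ab", after which "c" matches.
const backtrackResult = /(a|ab)c/.exec("abc");

console.log(backtrackResult[0]); // "abc"
console.log(backtrackResult[1]); // "ab"
```

The capture group ends up holding "ab", which shows that the second alternative was the one that finally succeeded.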
RBN: How atomic operations change this is that when a subpattern on the left is matched, it does not pass the continuation on. Rather, we pass an identity continuation. So in this example, we match the first part of the alternative, which successfully matches the first character of the string ABC, and then we go on to continue the matching. We do not backtrack to the beginning of the group, regardless of whether what follows fails. It's important to note that there are existing atomic operators within ECMAScript regular expressions: the lookaround operators. All lookarounds match independently of the rest of the pattern. You can see this happens by passing an identity continuation: in step 2.c we create a new continuation that just takes in the current state and returns it. When that completes, we call the continuation in step 2.j directly rather than passing it along.
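The difference between passing the continuation on and passing an identity continuation can be sketched with a toy continuation-passing matcher. This is not the specification text: the helper names (`char`, `seq`, `alt`, `atomic`, `matchFromStart`) are hypothetical, and the state is reduced to a string index.

```javascript
const FAILURE = null;

// A matcher takes (input, index, continuation) and returns an end index or FAILURE.
const char = (c) => (s, i, k) => (s[i] === c ? k(i + 1) : FAILURE);
const seq = (m1, m2) => (s, i, k) => m1(s, i, (j) => m2(s, j, k));

// Disjunction: the left side receives the real continuation, so a failure
// anywhere to the right comes back here and the right side is tried.
const alt = (m1, m2) => (s, i, k) => {
  const r = m1(s, i, k);
  return r !== FAILURE ? r : m2(s, i, k);
};

// Atomic group: the inner matcher only sees an identity continuation, so it
// commits to its first success; the real continuation is called afterwards.
const atomic = (m) => (s, i, k) => {
  const r = m(s, i, (j) => j);
  return r !== FAILURE ? k(r) : FAILURE;
};

const matchFromStart = (m, s) => m(s, 0, (j) => j);

const group = alt(char("a"), seq(char("a"), char("b")));
const normalEnd = matchFromStart(seq(group, char("c")), "abc");
const atomicEnd = matchFromStart(seq(atomic(group), char("c")), "abc");

console.log(normalEnd); // 3
console.log(atomicEnd); // null
```

In the normal case the failure of "c" propagates back into `alt`, which then tries the second alternative. In the atomic case `alt` has already returned by the time "c" is attempted, so there is nothing left to retry.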
RBN: As I mentioned, there are four existing atomic operators within ECMAScript: positive and negative lookahead, and positive and negative lookbehind. This proposal is looking to introduce two new atomic operators: atomic groups and possessive quantifiers.
RBN: Atomic groups. An atomic group is a non-capturing group that matches completely independently of neighboring patterns. So again, if some pattern in the term on the right fails to match, we do not backtrack into that group. Atomic groups avoid backtracking when it would have a negative impact on the expected match. They also help to avoid catastrophic backtracking, as mentioned before. If we decide to add this capability, there is no conflict with existing regular expression syntax. This can be used in or out of Unicode mode, so it can be used even in regular, non-Unicode regular expressions. And there's significant prior art. This slide does not list all of the engines that support this, just a subset of the ones that I've individually tested.
RBN: Here are some more examples of the differences from regular, non-atomic matching. In the first example, we show again how normal groups can backtrack. Here, with a repeated set of As, we would first try to match all three, because this is what's known as a greedy quantifier. It will attempt to match all characters first, then it will fail, then it will backtrack and try to match one fewer character, and that fails. It will step back one more and try to match one fewer character again, and that will fail too. Since it fails to find the B, this doesn't match, which means that we try every permutation of the string. In the second example, we would match two A characters and then fail to match the following AB. Then in the next attempt, as it backtracks, it would match a single A character and then match through the B, so the match succeeds. This can be well-defined and expected behavior. So it is not that this behavior is incorrect or needs to change, but rather that there is an alternative that may need to exist for certain cases. With an atomic group, by contrast, backtracking state is discarded as soon as you exit the group. This means that for the same example as the normal group, the atomic group would match all three A characters, fail to match B, and then the match ends, because it will not go back to matching two A characters like it does with normal groups.
RBN: And also, for the second example, in this case this will never match, because the atomic group will match all of the A characters in the string, and since that succeeds, it will not attempt to backtrack when it fails to match the second term in that pattern. It is important to note that backtracking inside an atomic group is still preserved. So if you are in a disjunction or a repeat matcher inside of that group, and there are characters or pattern segments that follow that set of alternatives within the group, there is still backtracking within that group.
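A sketch of the second example and of the point about backtracking inside the group. The patterns are assumptions based on the description, and the atomic groups are emulated with a capturing lookahead followed by a backreference, since the proposed syntax is not available.

```javascript
// Normal group: a+ first takes "aa", fails on "ab", gives one "a" back, succeeds.
const normalMatches = /^(?:a+)ab$/.test("aab");

// Emulated (?>a+)ab: a+ takes "aa" and keeps it, so "ab" can never match.
const atomicMatches = /^(?=(a+))\1ab$/.test("aab");

// Backtracking inside the group is preserved: within the emulated
// (?>(?:a|ab)c), "a" is tried first, "c" fails, and "ab" is then tried.
const innerBacktracking = /^(?=((?:a|ab)c))\1$/.test("abc");

console.log(normalMatches);     // true
console.log(atomicMatches);     // false
console.log(innerBacktracking); // true
```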
RBN: The specification text for an atomic group is dead simple. It is essentially the same as what we would do for a lookaround. Again, this very closely matches the lookaround specification text, with the difference that it passes the returned state on rather than constructing a new state to support the zero-width group that you get with a lookaround.
RBN: Possessive quantifiers are very similar to atomic groups. As a matter of fact, you can mostly consider them to be a shorthand. They are greedy quantifiers, but they don't give anything back when it comes to backtracking. For example, some character followed by star matches zero or more; if you add the plus character following that, it matches zero or more but does not backtrack. The same applies to all other quantifiers, with the exception of the exact-count quantifier written with curly braces, only because that has no actual difference in behavior. As with atomic groups, there's no conflict with existing syntax. The prior art is similar to the prior examples with, I believe, the exception that the .NET regular expression engine doesn't currently support possessive quantifiers. However, all other engines that I have tested do support this.
RBN: And here again are some additional examples, comparing greedy and lazy quantifiers. Greedy quantifiers start by trying to match every character that could possibly match. Then, if the match on the right fails, the repeat matcher steps back one character at a time until it finds a successful match. Lazy quantifiers do that matching in reverse. They start with the least number of characters in the group and try the match on the right; if that fails, they add the next matching character and then try the match on the right again. Possessive quantifiers, again, are greedy quantifiers which do not backtrack. They will not attempt to step one character back once they have found a successful match.
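The greedy and lazy behaviors can be checked directly in current engines. The possessive case below is only an emulation of the proposed `a*+` using a capturing lookahead and a backreference, under the reading that a possessive quantifier is shorthand for an atomic group.

```javascript
// Greedy takes as much as it can; lazy takes as little as it can.
const greedyText = /a+/.exec("aaa")[0];  // "aaa"
const lazyText = /a+?/.exec("aaa")[0];   // "a"

// Greedy a* first takes all three "a"s, then gives one back so the final "a" matches.
const greedyGivesBack = /^a*a$/.test("aaa");

// Emulated possessive a*+a: the three "a"s are never given back, so the final "a" fails.
const possessiveLike = /^(?=(a*))\1a$/.test("aaa");

console.log(greedyText, lazyText);            // aaa a
console.log(greedyGivesBack, possessiveLike); // true false
```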
RBN: For possessive quantifiers, again, the specification text is fairly simple. In CompileQuantifier we just recognize the additional plus character after the quantifier prefix, similar to the question mark character for lazy quantifiers. And just as with an atomic group, if the quantifier is an atomic quantifier, we create a lazy continuation and pass that to the repeat matcher, so that if the repeat matcher succeeds, we don't walk back within the repeat matcher. We instead call the continuation directly.
RBN: This is a stage 0 proposal. I'm interested in seeking stage 1. At this point, I can go to the queue if there are questions.
WH: I just want to clarify something. You said in the presentation that possessive quantifiers of the form {n} are not allowed because they’d be a no-op. I don't believe that unless I'm misunderstanding something about the semantics of possessive quantifiers. Let’s take the example: /(a*){1}a{3}/.exec("aabaaaaa") vs /(a*){1}+a{3}/.exec("aabaaaaa"). Do they have different behavior or not?
RBN: So backtracking within a possessive quantifier is still valid. Primarily it's not supported because every engine that I've checked that supports possessive quantifiers does not support them for exact-match quantifiers, so I'm trying not to go too far from the trend. And if there is a case where you do want that behavior, you can still use the atomic group. If there is a concern about whether or not it should be supported just because of consistency, I think that could be argued for allowing the syntax.
WH: Okay. So if they’re omitted because they’d be a no-op in other engines, then that brings up a question of what possessive quantifiers actually do in other engines. There are two possible interpretations and they differ in behavior. One, is that a possessive quantifier is just syntactic sugar for turning off backtracking and having a regular quantifier inside there, which is what you presented. The other alternative is that the possessive quantifier does not backtrack on the number of things it matches, but it does backtrack into its contents. Those would be different semantics. I don't know what other engines do. Do you know?
RBN: So let me see if I can find that.
WH: The case I gave on the queue distinguishes those two (/(a*){1}a{3}/.exec("aabaaaaa") vs /(a*){1}+a{3}/.exec("aabaaaaa")). None of the examples on the slides distinguish those two interpretations, but that case does distinguish those two.
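WH's distinguishing case can be partly checked in a current engine. The first pattern runs as written. The second, `{1}+`, is not valid syntax today, so the sketch below emulates only the "shorthand for an atomic group" reading, using a capturing lookahead and a backreference. Under the other reading WH describes, where the quantifier may still backtrack into its contents, the result would presumably be the same as for the plain pattern.

```javascript
const subject = "aabaaaaa";

// Plain quantifier: a* takes all five trailing "a"s, then gives three back for a{3}.
const plain = /(a*){1}a{3}/.exec(subject);
console.log(plain.index, plain[0], plain[1]); // 3 aaaaa aa

// Atomic-group reading of (a*){1}+a{3}: a* keeps everything it took,
// so a{3} never finds three "a"s and there is no match at any position.
const atomicReading = /(?=((a*){1}))\1a{3}/.exec(subject);
console.log(atomicReading); // null
```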
RBN: The intended behavior, as far as I recall, is that possessive quantifiers should essentially act as if you had wrapped the quantified term in an atomic group. Backtracking should still occur within the atomic group, but not at the boundary. I have to parse through the example you have here. If this were a regular atomic group wrapping the repeat of A, the repeat of A would match, and since it matches successfully and there is nothing to the right of it within its atomic group, as it were, it would match successfully, and then the a{3} would be matched separately, after the quantified a* had already been matched. So yes, these two things would have different behavior if the plus were allowed as a qualifier. The question is how often this is the case. You don't usually see a case of "I want to repeat a* once." So, at least I imagine, the reason why other engines don't support it for a fixed-length quantifier is that it's not a very common case and you're probably doing something wrong. It's not that it wouldn't work. The behavior would be the same as it would be for the atomic group case. It's just that it's probably not an intended use.
WH: Yes, if that is what other engine semantics actually do for possessive quantifiers. I want to double-check this because I don't know.
RBN: I've checked this, or at least everything except for this fixed case, because the fixed case for possessive quantifiers isn't supported in most engines. Sorry, the exact-match quantifier itself is supported in most engines, just not its possessive form. But the behavior of the plus character being syntactic sugar for an atomic group around the thing that would repeat does seem to be consistent. I've checked this in .NET, I've checked this in Perl, and I think one or two other engines, and they all seem to have the same behavior. For .NET, I've even looked at their regex source generators, which let you examine the source that the pattern you write would actually execute.
WH: I just wanted to double-check that they do not, for example, not backtrack on the number of matches, but do backtrack from the outside into the individual matches.
RBN: Yes. It essentially just discards the state at the boundary of the thing that is atomic. Okay.
WH: Yeah, in that case my comment would be that we shouldn't disallow {n}+, but that's a minor issue.
MAH: Yeah, I have to admit, my knowledge of regular expressions is just about using them, not much about implementation. But I am wondering if this would allow identifying, by static analysis, a subset of regexes that would then be guaranteed to never catastrophically backtrack.
RBN: I can't speak to that. I do know a co-worker who has a contact who works in a group that has researched static analysis of regular expressions to determine these types of cases, and their work is often used in various tools to recognize specific cases of catastrophic backtracking in regular expressions. I do not know if this could be used for that. I do know that if you compose a regular expression that consists only of atomic operations, it would be possible to recognize that it does not catastrophically backtrack. I also know that it's possible, at least within a regular expression engine, to use static analysis to determine that you have no backtracking, so that you can use heuristics to avoid some operations and significantly improve performance in other cases. For example, the CVE that I brought up the last time I presented this suffered from two issues. It was for trimming newlines: matching every single newline character and then failing to match the end of the string was catastrophic both because the result wasn't atomic and because of our behavior on a failed match, which is that we then advance the index by 1. So if you have the ability, through static analysis, to avoid backtracking, then you can have heuristics which significantly improve that. But I think, as mentioned by WH, you can check that something is definitely not going to backtrack, but what you can't do is know whether something is unlikely to backtrack. You can compose regular expressions where that is possibly confusing and certainly depends on inputs.
MAH: Right. I tend to think this enables writing more regular expressions with the syntax, or with the behavior, that people are used to, such as matching multiple characters, while knowing for sure that this will not backtrack. I'm wondering if that was clear. You said you might be able to identify that some expressions will for sure never backtrack. I'm not asking to allow all regular expression syntax and make sure that those will never backtrack. I'm just wondering if it's possible to identify a subset of regular expressions and be sure those will never backtrack.
RBN: I think there are two replies to that that might also be informative.
MAH: Anyway, we're really interested in looking at what develops and in working out the details. Thanks.
JRL: There are other features in regexes, particularly lookarounds and backreferences, which can still cause catastrophic backtracking. So this is a part that helps make a lot of regexes safe, but there are still other features that can cause it.
RBN: Well, lookarounds themselves are atomic, so they don't backtrack at the lookaround boundary. You can still write a pattern within a lookaround that has backtracking, just as you could with an atomic group. It's that the atomic group itself, at its boundary, doesn't perform backtracking. So nothing to the right would result in re-evaluation of the thing on the left, even with a backreference, because backreferences only reference the string which was matched.
WH: You can statically tell which regular expressions are vulnerable to catastrophic backtracking without this feature at all. So, this feature is really not very relevant to your question of avoiding catastrophic backtracking.
WH: To first order, let’s define what “catastrophic backtracking” means. Let's pick a definition: catastrophic backtracking is exponential in the size of the given string. To first order, if you do not nest quantifiers, you cannot get catastrophic backtracking in a regular expression — you can only get polynomial runtime. Furthermore, there are many instances in which you can guarantee polynomial runtime even if you do nest quantifiers. And all of this can be discovered statically.
WH: So this feature is useful for some sloppy ways of writing regular expressions. But if you understand how backtracking works, you can almost always write regular expressions which do not suffer from catastrophic backtracking without needing this syntax. I'm just saying this feature may be useful for those who like to be sloppy, but it's not necessary to prevent catastrophic backtracking.
MAH: Thanks for clarifying that.
MS: So I'm going to, in some way, second what WH just said. First of all, the languages that our regular expressions in ECMAScript allow are not regular. There are things such as backreferences which make the language that our regular expressions match not regular. Backtracking is a type of matcher, and it's a type of matcher that, I believe, we need to have given the language we allow in our regular expressions. However, a subset of regular expressions would allow an NFA or DFA to match, and these do not do the kind of backtracking, in general, that we see with the backtracking matcher. I also agree with Waldemar that the sloppiness with which someone writes a regular expression greatly impacts the execution speed of that regular expression. Nested quantifiers are chief among those, and they are really bad; it's difficult for a backtracking engine, which just cannot do much with that. It has to exponentially go through the search. So I agree that this would be helpful, but it will not eliminate the catastrophic cases that people design. The classic one I think of is that somebody's looking for something like ABC in a string, and they put .* in front of that and .* after that, which is completely unnecessary. And then sometimes they even anchor it with a ^ at the beginning and a $ at the end. At least our engine does some optimization around that, but yeah, that's a catastrophic kind of expression that is unnecessary.
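As a small illustration of the "classic" case MS mentions, the padded and anchored pattern answers the same yes-or-no question as the bare one. The sample strings are hypothetical and deliberately single-line, since `.` does not match line terminators and the two forms can differ on multi-line input.

```javascript
// Padded and anchored form, as described, versus the bare search.
const padded = /^.*abc.*$/;
const bare = /abc/;

const samples = ["xxabcxx", "abc", "ab", "", "zzabzzabc"];
const paddedAnswers = samples.map((s) => padded.test(s));
const bareAnswers = samples.map((s) => bare.test(s));

console.log(paddedAnswers); // [ true, true, false, false, true ]
console.log(bareAnswers);   // [ true, true, false, false, true ]
```

The bare form lets the engine scan for the literal directly instead of repeatedly expanding and shrinking the leading `.*`.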
MM: We can already statically analyze a subset that is known not to blow up. Would this proposal expand the expressivity of the subset that can be statically shown to be safe? And if I understand correctly, the answer from Waldemar is no.
WH: It was mentioned that you can convert things which are regular expressions in the computer science sense into DFAs, and this actually impedes such conversions by not matching in some cases where the DFAs would match. But it's not a major issue.
DE: If we're talking about bringing this to stage one. I think the general idea of cut points or, you know, doing a cut off from this search given that regular Expressions aren't regular is really useful. There's a lot of details to work out. There's big questions to figure out, you know, how to explain this to programmers. There's a lot of design decisions to make, but the nonregular features of JavaScript regular expressions are really useful. And so to complement that you kind of benefit from a way to limit backtracking. So it's true that it might not - you know, that these could be overused, that it's not necessarily the best design for everyone's regular expression, but I think we've been developing good developer education materials associated with TC39 proposals. So I think we have a path forward for that. So I support investigating this further to include something in this area in the language.
KG: Yeah, Ron already knows this but just for the rest of the room. So for the properties of strings proposal that allows you to match, for example, an emoji. We made the decision - or kind of defaulted to, but whatever - the behavior for properties of strings that they are not atomic. So, for example, if I match \p{RGI_Emoji}, or just Emoji or basic Emoji or whatever, that can match an entire Emoji, but it can also match just half of an emoji and then I have the next code points be like modifiers, so, I don't know, the skin tone modifier or whatever. So you can get something that is to me quite surprising where you ask to match an emoji, but actually ended up getting only half of the Emoji. And this proposal is the only effective way to get the behavior that I actually want, which is to match the entire emoji and not stop halfway through. So this is an example of a case where atomic groups are useful completely unrelated to performance. It just allows you to express the thing that you are actually trying to do more clearly. And for that reason, I am in support of this proposal.
YSV: Yeah, so basically what [the queue] says: we have some concerns about introducing this to the language because it is really a performance layer, and it introduces syntax as part of that performance layer. So we're wondering how useful this will ultimately be in relation to learnability, and whether or not it passes that bar. That said, we're comfortable with stage 1, and we're looking forward to the upcoming work to address some of the points brought up by DE.
RBN: I would like to state that I've had conversations with the folks that built the regular expression engine for .NET. There are two things on this subject, which I pointed out earlier. Atomic groups are not just about performance. They are also about avoiding unintended matches or unintended backtracking, as in the case mentioned of matching emoji with the properties of strings proposal. They can help with performance, and that is definitely a major motivating reason for them. But from my conversation with the folks on the .NET team, there are some valuable things you can get from having atomic groups in a regular expression engine, including certain optimizations and heuristics. For example, if you know that you can match a repeating set of a single character atomically, then you also know that if you have matched 30 of those characters and failed on the following term, then going all the way back to RegExpBuiltinExec, skipping ahead one character, and matching 29 of those characters will fail on the same term. You know that won't work, so you can avoid it. That's something that other regular expression engines, such as Perl's, do, and it significantly improves performance in cases like the CVE for newlines and other ReDoS cases that are not pinned to the beginning of the string. So there are significant performance improvements that can occur when you can statically analyze a regular expression and know that such a case would occur.
WH: I support this, but I would like to see us not ban the {n}+ syntax for regularity and consistency reasons.
WH: Also somebody brought up the usefulness of this for matching emojis. I would be careful here. Think about what happens if you use atomic groups to match emojis in backwards mode.
KG: Sorry, what is in backwards mode?
WH: Reverse matching where you're matching from the end to the beginning. This happens in lookbehinds.
KG: I'm going to have a hard time thinking about that offhand. Can you say more?
WH: I don't know the entire structure of emoji sequences, so this might get complicated depending on the structure of the head and tail of the sequences. Is the tail always also an emoji character?
KG: There are some cases where the tail is an emoji and there are some cases where it is not.
WH: Okay. So if you're matching backwards and you use an atomic quantifier, maybe you can get into weird cases.
RBN: Once we have the properties of strings proposal, or an implementation of it that we can test this with, you can verify this by using a lookbehind that has a positive lookahead containing a capturing group that is back-referenced. There are ways of emulating some of this behavior, although it shifts capture groups around. So you can actually check this behavior if you have an implementation of properties of strings that would let you match emoji, or if you just manually roll the match for an emoji. So it would not differ from the behavior today.
RBN: So, is there anyone else waiting to get on the queue? [no] At this point, I'm asking the committee for advancement to stage 1.
RPR: All right, any objections to stage one? No objections and you've already had explicit support. So, congratulations. You have stage 1. And Matthew is also on the queue with a plus one.
Conclusion/Resolution
- Stage 1
Import Reflection status update & discussion
Presenter: Luca Casonato (LCA), Guy Bedford
LCA: Okay. So this is the import reflection update. The last time we presented was December 2021. Back then I was not a delegate yet, so Brad presented this. I authored the original proposal, and we're going to be presenting this together now. So, a quick recap. This used to be called evaluator attributes. The last time we presented it, we decided to change the name to import reflection to more clearly describe what we're proposing, which is a reflection property that does not change evaluation semantics, but only changes how an asset is represented when it is imported. So these are not evaluation attributes. They do not change evaluation. They change how a given asset is represented. This allows ES modules to have alternative reflections that represent the underlying asset. The basis of this is the reflection attribute that is added to static imports, and also to dynamic imports, of course, though that's not shown here. It allows you to import the same specifier multiple times and get different return values from it. All these return values represent the same underlying asset, but they are usually separate stages of the loading pipeline. For example, the first import here is import foo from foo.wasm. In this case, the module is instantiated and linked into the ES module graph, so the Wasm can import other ES modules and you can directly execute code. So this is a singleton Wasm instance that you import, and it exists once in the ES module graph as an instance for the specifier. Secondly, you can also import it as an uninstantiated, unlinked module. This allows you to import the unlinked module and instantiate it however many times you want. So let's say that instead of a library you have a CLI or something of that sort, where you run to completion. You don't want it to execute just once; you may want to be able to execute it multiple times. This allows you to do that.
So the reflection attribute determines in what form the runtime should expose the underlying asset to the user, and there's usually a default reflection if you don't specify the reflection attribute. The last time we presented this proposal, we had a very broad scope. We had not yet figured out exactly what we wanted to do. This time, we're going to present two primary scenarios. The first is module reflection. That is the thing that I just showed, where you have one reflection that is loaded, compiled, linked, and executed, and another one where you have a loaded and compiled but unlinked module. The second is asset reflection, which gives you an asset reference that represents a resource that may not have been loaded. I'll get more into that later.
GB: Yeah, so as Luca showed in the first example, one of the motivating use cases we mentioned at the last meeting in December is the Wasm use case. At the moment, most practical WebAssembly workflows use the fetch and compileStreaming approach, which in many ways is basically arbitrary binary execution, and so it falls under the wasm-unsafe-eval CSP policy. The benefit of this reflection syntax is that, by directly linking into the module system, we can link into the module system's CSP policy. More generally than CSP, we're linking into the import system for the module reference that you're getting back when you get this WebAssembly module. So we know what is actually being executed on the platform, and that's a huge benefit over just permitting arbitrary WebAssembly binaries. You actually allow the import system to be in control of what's executing, which is kind of a primary responsibility of the import system.
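For context, here is a sketch of the existing, non-reflective workflow using today's WebAssembly JavaScript API rather than the proposed import syntax. It shows a compiled but uninstantiated module being instantiated more than once, which is the shape of the "unlinked module" case LCA describes. The eight bytes are the smallest valid Wasm binary (magic number and version, no sections); a real application would usually obtain its module from `WebAssembly.compileStreaming(fetch(url))` instead.

```javascript
// Smallest valid Wasm binary: "\0asm" magic plus version 1, with no sections.
const wasmBytes = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

// Compiled but not yet linked or instantiated.
const wasmModule = new WebAssembly.Module(wasmBytes);

// The same compiled module can be instantiated as many times as needed,
// each time with its own import object and its own instance state.
const firstInstance = new WebAssembly.Instance(wasmModule, {});
const secondInstance = new WebAssembly.Instance(wasmModule, {});

console.log(wasmModule instanceof WebAssembly.Module); // true
console.log(firstInstance !== secondInstance);         // true
```

Because the bytes here are handed to the engine at run time, nothing in the module graph tells the host or a build tool which binary is involved, which is the gap GB describes.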
GB: And then there are further analysis and build tooling benefits of this process. Now that you have this static reference to the module record, build tools can actually see that this particular WebAssembly module is being built in, and they can relocate that module during builds. That is something build tools cannot do today, because they don't know where it is; it is often very dynamic code that determines where that binary is. And then there are even inlining approaches: because it's an execution semantics, even if it's an execution reflection semantics, you can still do execution analysis across that boundary. So by exposing this execution linkage at a syntactic level, down to the parts that we know, even though it's still a highly dynamic reflection process, we can at least improve the guarantees and potentially improve the security benefits as well.
GB: And so, just to summarize those: the benefits are that we're sharing the security model with static imports. We're now making it clear that this is not an arbitrary evaluation, which allows the host to be in control of what's executing through the system. That's beneficial in browsers for CSP, and it's beneficial in server platforms like Deno as well. The other important point is that this feature is actually needed for WebAssembly today. At the moment, the majority of WebAssembly applications would not be able to use the native ESM integration the way it works right now. A lot of practical WebAssembly workflows involve instrumenting the imports or doing custom wrapping that requires the WebAssembly module object, and so they immediately fall back to that lower level of analysis and security. Whereas this is something that would enable some degree of integration. In the far future, it's something that WebAssembly itself is looking to have as a capability, with things like the component model, but that's more of a far-future example.
GB: So we last presented this in December of last year. Generally it seemed like there was a positive reception to the proposal, although there were a lot of questions about its relationship to other module proposals, for example compartments, module blocks, and deferred modules, and some suggestions about investigating common abstractions or ways that the same reflection could even apply to JS modules and what that would look like on the platform. So we went off and did some investigating into some of these problems, and in the rest of this presentation we can present specific reflections. It's worth noting, before we get too far into the weeds, that the reflection proposal itself is purely a mechanic of reflection: it's exposing the ability to reflect. Then we have these specific reflections, and there are some layering questions around that, but I'll discuss those at the end.
GB: Yeah, so just to look at an illustrative example of what JS reflection might look like and all of this is completely hypothetical just to show what it might look like.
GB: So here is a simple example of reflecting a JS module graph, using module reflection to get back a higher-order module that represents compiled, unlinked source text and that can be instantiated multiple times. In this example we're importing a module, and we're importing the dependency of that module, as module reflections, and we're getting back reflected JS module objects. Then this proposal assumes a global "ModuleInstance", which allows constructing instances out of the compiled module, similar to how WebAssembly.Module and WebAssembly.Instance are separate classes. In this pattern, you can instantiate both of these modules as many times as you like; each instantiation can have a unique linkage, and each instantiation will only ever execute once. So we create those two instances, and we can then link at any point in time; as long as we finish the linking before we execute, we link all the modules together. The linking function takes two arguments: the first is an unresolved dependency specifier for the module, and the second is the other module instance you're linking against. One benefit of having a single module instance class is that it's the same no matter what you are linking. If I changed this to WebAssembly, I could still use new ModuleInstance against the WebAssembly module in this example. There are some details to work out, but this design is solving certain constraints in order to have a certain level of ease of use. Once every module has been linked, you can call evaluate. If you call evaluate when not all modules are linked, it would throw. Evaluate could possibly complete synchronously on the non-top-level-await portion of the graph before returning a promise from the point of the first await in the graph; that could be an interesting refactoring to make. And then once evaluated, you would have access to the namespace on this module instance.
So the module instance could expose a lot of the properties we currently have on Cyclic Module Records: for example, the namespace, the module state, and things like that. So it could reflect the entire evaluation process down to some level that allows this kind of custom module graph construction. And considering the amount of power that is being given here, the API is actually relatively straightforward to use, and I think that's a really nice benefit of this approach to custom module management.
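The link-then-evaluate lifecycle described here can be modeled with a small toy. This is only a sketch under stated assumptions: `ToyModuleInstance` and the `{ imports, run }` shape of a "compiled module" are invented for illustration and are not the API from the slides. The toy is synchronous and ignores cycles and top-level await.

```javascript
// Toy stand-in for the hypothetical ModuleInstance; not a real API.
// A "compiled module" here is just { imports: [specifiers], run(deps) => namespace }.
class ToyModuleInstance {
  constructor(compiled) {
    this.compiled = compiled;
    this.links = new Map(); // dependency specifier -> ToyModuleInstance
    this.evaluated = false;
    this.namespace = undefined;
  }

  // Two arguments, as described: the unresolved specifier and the instance to link against.
  link(specifier, instance) {
    if (this.evaluated) throw new Error("cannot link after evaluation");
    if (!this.compiled.imports.includes(specifier)) {
      throw new Error(`unknown import: ${specifier}`);
    }
    this.links.set(specifier, instance);
  }

  evaluate() {
    if (this.evaluated) return this.namespace; // each instance executes at most once
    for (const specifier of this.compiled.imports) {
      if (!this.links.has(specifier)) throw new Error(`unlinked import: ${specifier}`);
    }
    const deps = {};
    for (const [specifier, instance] of this.links) deps[specifier] = instance.evaluate();
    this.namespace = this.compiled.run(deps);
    this.evaluated = true;
    return this.namespace;
  }
}

let depRuns = 0;
const depCompiled = { imports: [], run: () => { depRuns++; return { value: 41 }; } };
const modCompiled = { imports: ["dep"], run: (deps) => ({ answer: deps.dep.value + 1 }) };

const depInstance = new ToyModuleInstance(depCompiled);
const modInstance = new ToyModuleInstance(modCompiled);

// Evaluating before the graph is fully linked throws.
let threwBeforeLinking = false;
try { modInstance.evaluate(); } catch { threwBeforeLinking = true; }

modInstance.link("dep", depInstance);
const firstNamespace = modInstance.evaluate();
modInstance.evaluate(); // does not re-run dep or mod

// The same compiled module can be instantiated again with a different linkage.
const otherDep = new ToyModuleInstance({ imports: [], run: () => ({ value: 99 }) });
const secondInstance = new ToyModuleInstance(modCompiled);
secondInstance.link("dep", otherDep);
const secondNamespace = secondInstance.evaluate();
```

The second instantiation shows the property GB calls out: one compiled module, several instances, each with its own linkage and its own single execution.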
GB: So, in the proposal, you would need the reflection of a module to be defined as a new JS object. In this example, when you reflect a JS module, you're getting back a source text module record that could, for example, have getters for the imports and exports that return some representation of the imports and exports of the module. It's worth noting that the reflection itself needs to be reflected in the imports, but that can all be handled.
GB: And then there is this module instance, which is the singular class that represents an individual stateful instance, from the moment it's created until the moment it's evaluated; it's a unique linkage and a unique execution. Firstly, you can have a number of getters for the underlying module record. You can have the resources, and the state all the way from unlinked to async evaluation to evaluated. Providing an error getter for error handling is something that needs to be worked out. One interesting point is that this hypothetical proposal as designed wouldn't actually expose URLs at all, because we don't have any concept of normalized specifiers or anything like that in 262. Instead, we can expose the meta object for an instance, which is host-generated at creation of the module instance, even before it's been linked or evaluated. One benefit of that is that it would give the loader access to the meta URL or other host-specific context for the module and, because the meta object is mutable, the ability to modify it before evaluating the module. And then the linking function is as described, and the evaluation function is as described. So again, a lot of power, but despite all that it's actually a relatively simple thing at the end of the day.
And this slide is shown not for the code itself, but as a demonstration of the fact that creating a whole loader is not that hard with this API. This is a complete userland loader with a custom resolver, implemented on a single slide using this API. One of the major benefits of the linking model is that it doesn't require the user to know anything about the execution invariants or anything like that. As long as everything is linked, it doesn't matter: you can link in preorder or postorder or random order or whatever, as long as everything is linked by the time you evaluate. It'll all work out correctly according to the semantic invariants of the execution, and it supports all the features, from top-level await to cycles and cross-linkage. That's a benefit of this kind of instance model, where you get these instances before they've been linked, and so you can link unlinked instances. It's just a nice way of managing the graph state, to treat it as a phased process.
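The slide's code is not reproduced in these notes. As a rough idea of what a userland loader with a custom resolver could look like, here is a self-contained toy. Everything in it is an assumption made for illustration: the `registry` of pre-"compiled" modules, the `resolve` rules, and the plain-object instances are invented, and evaluation handles acyclic graphs only.

```javascript
// Hypothetical registry of "compiled" modules, keyed by resolved name.
const registry = new Map([
  ["/app/main", {
    imports: ["./util", "lib"],
    run: (deps) => ({ result: deps["./util"].double(deps["lib"].base) }),
  }],
  ["/app/util", { imports: [], run: () => ({ double: (n) => n * 2 }) }],
  ["/vendor/lib", { imports: [], run: () => ({ base: 21 }) }],
]);

// Custom resolver (an assumption): relative specifiers map to /app, bare ones to /vendor.
function resolve(specifier) {
  return specifier.startsWith("./")
    ? "/app/" + specifier.slice(2)
    : "/vendor/" + specifier;
}

const instances = new Map(); // resolved name -> instance

// Create (or reuse) an instance, then link its dependencies. Instances exist
// before they are linked, so the linking order does not matter.
function load(resolved) {
  if (instances.has(resolved)) return instances.get(resolved);
  const compiled = registry.get(resolved);
  if (!compiled) throw new Error(`module not found: ${resolved}`);
  const instance = { compiled, links: new Map(), evaluated: false, namespace: undefined };
  instances.set(resolved, instance); // register first so cyclic links terminate
  for (const specifier of compiled.imports) {
    instance.links.set(specifier, load(resolve(specifier)));
  }
  return instance;
}

// Evaluate dependencies first, then the module itself, each at most once.
function evaluate(instance) {
  if (instance.evaluated) return instance.namespace;
  const deps = {};
  for (const [specifier, dep] of instance.links) deps[specifier] = evaluate(dep);
  instance.namespace = instance.compiled.run(deps);
  instance.evaluated = true;
  return instance.namespace;
}

const mainNamespace = evaluate(load("/app/main"));
console.log(mainNamespace.result);
```

Separating `load` (create and link) from `evaluate` mirrors the phased process described above: the resolver decides what each specifier means, and evaluation only starts once the graph is complete.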
GB: So, in relation to other proposals, there are a bunch of cross-cutting concerns, and I'll briefly go into some of them. The instance proposal, again hypothetically, as designed and shown previously, would actually kind of replace WebAssembly.Instance. The benefit of that is that we're turning the instance into an accurate record that you can easily reason about, as opposed to trying to interop with some WebAssembly instance record. When you first reflect a Wasm module, it's fine to still get a WebAssembly.Module object, and the module instance would instantiate from the WebAssembly.Module object. But instead of giving you a WebAssembly.Instance, you could get this new module instance object, so that you have all the state and information associated with every other instance, you can see the evaluation state, and it can fully integrate into the graph in this way, as opposed to being some kind of special Wasm module that behaves differently from everything else. That could be a benefit. It's obviously possible to design it to work with WebAssembly.Instance as well, and it's an interesting discussion area, but this is a design I would like to call out.
GB: So this is just mentioning that there isn't anything on WebAssembly.Instance in the JS API right now that we would lose access to, whereas WebAssembly.Module has a lot of custom properties, like reading custom sections of the WebAssembly module, that are quite WebAssembly-specific and that you would want for the WebAssembly module reflection. So if we weren't getting a WebAssembly.Instance, there isn't any loss.
GB: Then, in terms of supporting WebAssembly.Module as a reflection, just getting a little more into the fine-grained details of how these specs interrelate. So, the last bullet point: when you create a module instance from a WebAssembly module, we could have some kind of internal slot on the host object that relates the host WebAssembly object to our module record, and we could have some kind of machinery for ECMA-262 to see that this host object represents a module record. This special module record can be related to a backing ECMA-262 module record so that we can reason about it. So, yeah, it's mostly just straightforward mechanics, but there are some details worth mentioning.
GB: So yeah, module blocks. Another nice thing about this new module instance structure is that you could in theory take any kind of module, such as a module block. We could have all these different types of modules on the platform, like a block or whatever, and you could pass a module block into new ModuleInstance as well. At the moment, module blocks are singular, in that there's one instance in a given context that you import. Whereas if we had this module instance machinery, you could instantiate a single module block multiple times: you could have multiple instances, or multiple linkages, out of a single module block. So you could do things like use it for mocking in mocking libraries: build it up for your tests and then throw it away once you're done with it. And again, there's no path dependence between these things in the way this is being suggested. These kinds of features are optional and additive to both specifications. I think that's quite a nice thing to think about in terms of the layering of all this and how it integrates: as long as the paths are open, the different pieces can move at their own pace.
GB: So, in relation to compartments, there would definitely be a huge amount of benefit if we could share the module and instance definitions with compartments. That's a big question, and I've briefly discussed some of these details with Kris, but it's something we definitely need to discuss further. What isn't included in this current suggested design is custom global environments, linking boundaries, or anything to do with the loader definitions, but those things could potentially be seen as additive to this kind of minimal picture of what module reflection might look like. So there are a few ways to go about it. We could possibly specify this very basic JS reflection as part of our reflection work in this proposal. Alternatively, we could just treat it as an arbitrary reflection and shift that work to the compartments side. So there are those kinds of layering discussions to be had.
GB: And then deferred modules as well. When you're importing a module reflection, because it hasn't been evaluated, you are effectively lazily loading that module in a sense, but it's not a comprehensive load, because you're not loading and resolving the dependencies of that module. For most preloading or lazy loading scenarios, you probably want to be preloading the entire module graph, or doing work at the level of the entire module graph, for the actual execution instance. So we do think these preloading and deferred evaluation problems are best seen as separate from this kind of reflection work, at least as far as we've been able to dig into the problem space.
GB: So in terms of the actual host hooks that would be exposed: as mentioned, the base reflection proposal is just a reflection mechanic. One of the important things when adding that reflection mechanic is obviously to retain idempotence of the resolver. The same asset can have multiple representations, but we want to make sure that we're still referencing the same asset. So if you do two separate reflections of the same resource, both of those reflections must still refer to the same resource. We need to maintain that idempotence, and the only way to do that is by actually defining what you mean by a resource. This is why it forces us to define some kind of asset or resource structure, still without defining canonicalization or URLs or anything like that. So we separate the resolver into two phases: one resolving some kind of opaque asset resource that represents the actual resolved thing, and then a separate reflection function which says, this is the resolved asset, and then you can get the different reflection types on it. Then we can effectively enforce these new idempotency properties on those two hooks together. So that's the gist of how we're thinking about this. And if there are any questions on that, we can discuss it further.
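The two-phase split can be pictured with a small sketch. The function names `resolveAsset` and `reflect` are hypothetical, and keying assets by URL is purely an assumption about one possible host; the proposal deliberately leaves the asset opaque and does not define canonicalization.

```javascript
// Phase 1 (hypothetical host hook): resolve (referrer, specifier) to an opaque asset.
// Keying by URL is an assumption about this particular toy host.
const assets = new Map(); // key string -> frozen asset object
function resolveAsset(referrer, specifier) {
  const key = new URL(specifier, referrer).href;
  if (!assets.has(key)) assets.set(key, Object.freeze({ key }));
  return assets.get(key);
}

// Phase 2 (hypothetical host hook): reflect an asset as a given reflection type.
// Results are cached per (asset, type), so repeated requests return the same object.
const reflections = new WeakMap(); // asset -> Map(type -> reflection)
function reflect(asset, type) {
  let byType = reflections.get(asset);
  if (!byType) {
    byType = new Map();
    reflections.set(asset, byType);
  }
  if (!byType.has(type)) byType.set(type, { asset, type });
  return byType.get(type);
}

const referrer = "https://example.com/app/main.js";
const assetA = resolveAsset(referrer, "./lib.wasm");
const assetB = resolveAsset(referrer, "../app/lib.wasm"); // different spelling, same resource

const asModule = reflect(assetA, "module");
const asModuleAgain = reflect(assetB, "module");
const asAsset = reflect(assetA, "asset");

console.log(assetA === assetB);
console.log(asModule === asModuleAgain);
console.log(asModule.asset === asAsset.asset);
```

Two reflections of different types are distinct objects, but both point at the same underlying asset, which is the idempotence property being described.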
GB: So what do we get out of this reflection? Firstly, and this is the driving use case, with just a simple reflection mechanic it allows WebAssembly to define that it's going to permit this reflection in the platform. That's something that is needed at the moment for the Wasm integration, that can be unblocked by this work, and that can start moving forward. That's the immediate use case. Second, with the invariants of the hook, we could then state that certain reflections are reserved for ECMA-262, that the module reflection for a JS module is reserved, and then we could add this JS reflection at any point in time. As I say, we would be happy to specify something minimal in this specification or not; either can work for us. It also permits new host-defined reflection types in the future, and we can maybe have some wording about what sort of reflections are permitted. There could be some other interesting reflection types enabled, and the second example that we want to bring up is asset reflection.
LCA: So asset reflection: this is based on the asset references proposal that was presented in 2018, if I recall correctly, which provides a way to get an unforgeable reference to an asset by creating what is essentially a wrapper object around a resolved specifier, made unforgeable by using static syntax. This is what the syntax looked like in that proposal: you replace the import keyword with the asset keyword. You can then pass this asset reference to APIs that already take resource identifiers, such as fetch or import. Currently, this is often done using new URL and import.meta.url. That does not work very well, though, because a) it does not actually go through the proper host resolver. It only works if the host resolver only uses the ultimate (?), which is not the case for things like Node.js, or if you're using import maps, for example, where a specifier could map to some other resolved specifier. And b) it's also dynamic, which is harder to analyze than static syntax. You want static syntax for this because we want to make it easier for bundlers to find these references so they can process them. What I'm going to show is that this asset reference is really just another reflection. Asset references can be represented as part of the import reflections proposal, because asset references are really just a reflection of the asset prior to loading it: they take the resolved specifier, wrap it in an opaque object, and don't actually perform the load. This means that the asset references proposal could happen without requiring additional syntax. It could just be another reflection that is part of the import reflections proposal, or it could come later, and it could even be done as a host extension outside of TC39, for example in HTML.
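The limitation of the `new URL` pattern can be seen in a few lines. In this sketch the `referrer` string stands in for `import.meta.url`, and the import map contents are made up for illustration; the lookup is a simplification of what a host resolver with import maps would do.

```javascript
// Stand-in for import.meta.url in a module at this location.
const referrer = "https://example.com/app/main.js";

// Relative specifiers resolve the way a purely URL-based resolver would.
const relative = new URL("./data.json", referrer).href;

// A bare specifier is simply treated as a relative path by new URL().
const bare = new URL("lib", referrer).href;

// A hypothetical import map that a host resolver might consult instead.
const importMap = { imports: { lib: "https://cdn.example.com/lib@1/index.js" } };
const viaImportMap = importMap.imports["lib"];

console.log(relative);     // https://example.com/app/data.json
console.log(bare);         // https://example.com/app/lib
console.log(viaImportMap); // https://cdn.example.com/lib@1/index.js
```

The second and third results disagree, which is the mismatch LCA points to: `new URL` never consults the host resolver, so it cannot know where `lib` actually lives.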
LCA: Yeah, so that's the presentation for today. We'd like to start the discussion now. We'd like feedback on the overall shape of the proposal, on the JS reflection API that Guy presented, on how this interacts with asset reflection, and on how this interacts with all the other proposals that we mentioned: module blocks, compartments, and similar, so the layering between those proposals. We're also looking for stage two reviewers. So, is there anything on the queue?
KKL: Thank you for the presentation. As you pointed out, there is a lot of overlap with the compartments proposal, which is stage one, and we invite you both to join the champion group for that, since there's so much well-considered material between this presentation and what we've accumulated for the compartments proposal. By way of update for this group, the compartments champion group has decided to limit the scope of that proposal to just solving the problem of JavaScript's missing module loader API, so, evaluating modules in general. Integration with WASM is part of the scope of the concerns that we've been considering over the last couple of years. The only portion of this presentation not covered by the compartments proposal as written is a mechanism for statically analyzing a non-executed dependency. That is to say, expressing a dependency for which you wish to defer execution, which, as pointed out, is super useful for bundling use cases and such, where you want to execute later but declare the dependency so that it is statically analyzable and the bundler can retrieve the transitive deps. Compartments do a few things relevant to this proposal. For example, they already reflect static module records. And, as this proposal proposes, we are proposing a separation of module instances from the reification and replication of module environment records. Compartments have shared loader caches, which do not necessarily refer to a synthetic static module record.
So it is possible that there are complementary semantics in the compartments proposal. For example, if you were to use import reflection to state a dependency that you do not wish to execute, that would be beneficial in combination with using a compartment to pass the cached static module record to another compartment, where it could be executed later, or possibly multiple times in multiple compartments, which is of course also relevant to hot module replacement. The compartments proposal has no less power, but it does encapsulate a few more concerns, and that is something we're open to iterating upon. The compartments proposal hides linkage as a concern, and that doesn't reduce the power of the proposal: anything that could be linked before can be linked with compartments. But that's something we'd like to discuss as well. As mentioned, we can already link WASM with a synthetic or third-party static module record in the compartments proposal as written, but the compartments proposal does not necessarily have to reify that synthetic module record. We could, in a complementary amendment, take a host's WASM static module record, which is not reified, and pass it to another compartment. Moddable XS's compartments actually depend upon this feature: when compiling JavaScript for an embedded system's ROM, they never reify the static module record, and they exclude the sources from the ROM. So you just get compiled JavaScript in the ROM, but they can still use compartments to pass the static module records from compartment to compartment, which is very useful for their needs as well. Compartments also answer the question of asset reflection with synthetic static module records.
And so yeah, again to re-emphasize, for the portions of this proposal that overlap compartments we would really very much like to join our efforts and we're going to attempt to present and request stage 2 for compartments at the next meeting. I believe toward the end of July. And that's what I've got. Thank you.
GB: If I could just respond to that briefly: this is something that we would hope to get stage progression for soon as well. As mentioned, module reflection is primarily the mechanic of reflection. And certainly, I think there's some really interesting collaboration work, and I look forward to working with you on that, Kris. What this brings up in these cross-cutting concerns is, firstly, that the primary forcing function of reflection is that it makes this stuff static in the module system and gives you these static security properties that we need today for Wasm. Secondly, there are certain constraints on what we need from source text module records and these module instances in order to be able to interop with this kind of model. So I think we are using this illustrative example to say: these are the constraints, and these are the static guarantees we need. But yeah, if we can put our heads together and work out how to expose the layer for that kind of reflection, that would be great.
KKL: Absolute agreement. I think there's a great deal of value to come from collaborating on this, and this was a fantastic way of expressing the requirements that you're trying to solve. Notably, one thing in the layering: it may be possible to reduce the scope of import reflection to just declaring a non-executed dependency, without necessarily reifying anything, since the compartments proposal allows you to construct a compartment that mentions a static module record held by the host compartment by name instead. So again, I think our proposals are highly complementary, and that they're both likely to advance more quickly with us working together.
GB: Yeah, it's worth mentioning as well that the reason we brought out the whole JS reflection is that that's what we were asked to investigate at the last meeting, and we were just showing that this stuff is complementary. And yeah, I think there's some exciting stuff to work on.
JWK: I'm interested in the previous slide that mentioned manually linking modules. I'm wondering, what benefit can we get from it? In what case do you want to manually link the module graph?
GB: Yeah, I can take that one. The main benefit of having this static import of a reflection is that the module resolver is entirely in control of which module loads. When you create a module record from a custom source, you can think of that as being like an eval function: if you're generating source text from a custom source or a fetched source, it's something like an eval. By reifying that pattern all the way up to getting back a source record through the import resolver, we are cutting out that eval capability. So the host is still in control of what modules are executing, and you wouldn't need to be under an arbitrary source text evaluation security policy.
JWK: In this slide, you link the dep instance to the mod instance, which means you can actually specify a module instance to give as the import result for mod.js. Is that correct?
GB: So, in this example, I'm imagining that the module at mod.js has an import of 'dep', a bare specifier, as its only dependency, and that dep.js has no imports itself. When mod is first imported as a reflection at the top, its 'dep' import is not resolved; dep is imported separately. It doesn't matter what 'dep' would resolve to, since you separately import mod and import dep, and only then connect those two modules together. As long as you've linked up the whole graph, the evaluation will then perform a spec-compatible evaluation.
JWK: Okay. Thanks. I understand. What if the mod.js has a dynamic import?
GB: Yeah, so you would need to expose that somehow, and this example doesn’t cover that right now.
JWK: I think this is interesting. You don't need to have compartments to link those modules together. You can create a new instance and just evaluate those parts. Yeah, I think we need to have a more unified API across all those module-related proposals. Thanks for clarifying.
DE: I think this is a really important area. I agree with JWK that there's a lot of intersection with other places. In addition to compartments, I'd mention module blocks, which were mentioned in the presentation, and the Wasm JS API. Probably we should have some kind of unified view of imports and the way that modules link together. I think asset references are also really important. When asset references were brought up initially, the motivation was couched as, this is useful for tooling, and there was some skepticism about what that means for the web. But I think asset references would be really useful for the web to enable prefetching of assets that might be conditionally or later linked, or that should not block load. This would also be useful for JavaScript code, I mean, for dynamic import itself. There's a possibility that dynamic import would be for things that you statically declare but that are loaded later, through code splitting. Lots of bundlers today have different kinds of syntax for dynamically loading when doing code splitting while indicating that something should be prefetched, which also seems like something that would make sense to do natively in a browser. Of course, there are other things that need to be solved to make modules actually work in browsers, like bundling, but there are efforts underway for those as well. So I think this is all a really important area, and there's maybe more orthogonality here than might initially appear. In my opinion, as you showed here, module reflection makes sense on JavaScript modules, and I would add that asset references also make sense on JavaScript modules. So I think this is a great area to make more progress on. Going to Stage 2 at some point soon sounds good, hopefully, but I think for Stage 3, working out all the details will take some more months of collaboration across different areas.
One change that I would make to these proposals, beyond this main task of unifying the interface for reflected modules among the various different kinds, would be to use key-value pairs rather than "as", so that it could be extended by the host in different ways, for the exact same reasons we did this for import assertions. There are threads going on in HTML right now about possible different attributes that might be added to modules, and I think the core representation will not be the only one. I understand if there was pushback in the past to evaluator attributes maybe being too general, but we should make it clear that such attributes are always handled by the module loader, by the environment, and that they don't contain things like functions that parse the JSON or something like that. These are always handled by the environment, but nevertheless we have an extensible architecture, because we've already seen multiple different ways that a module might be interpreted as, or exposed as, being relevant. So I think a single string is kind of too limiting for that. We had a discussion with import assertions about how we don't want to be too general. But now that we're seeing something that's really well motivated for making the actual import be something different based on what you pass as the string, we should rethink the constraints to find something that we're happy with. I look forward to working together with you on these sorts of things.
GB: So firstly, thanks DE. It's great to hear that this stuff can roughly align with something that can be collaborated on between different goals. That's very much what we've been trying to show in this presentation: that reflection is something that can work with other proposals, that it is a good proposal to layer with other proposals, that it can achieve use cases that we want to achieve with modules, and that it's not closing doors. It is a strong position of the reflection proposal that reflection is very specifically a reflection of the asset as interpreted by the module type that it's resolved as in the host resolver, with the idempotency property at that asset level. Given an asset and a reflection type, that is the entire idempotency property of the resolver, and that is a very fundamental and important part of this proposal. This is not to preclude any work on custom evaluation options, but reflection in its current form treats the reflection string as a primary primitive in that idempotency property. For that reason, I wouldn't want the reflection string to be at the same level as other types of options.
DE: I think all the options need to be idempotent. I agree with you. I don't see why that precludes adding other options.
GB: It gets a little harder to define the idempotency property when you have multiple bags of options. Already with the reflection proposal there's even the question of whether the string is giving too much power to hosts, or whether we need to lock it down even more. I think allowing other arbitrary options to then interact with the idempotency property starts to get difficult to define. And to make that a requirement without justifying use cases is, I think, a difficult thing to do. This proposal in its current form is focused entirely on these different use cases and specific use case interactions, and we've yet to see an example of something that users are looking to do that they wouldn't be able to achieve, or that this would hinder. That's not to say that other options couldn't be added, but reflection does have a special place in the system.
DE: Yeah, I've linked to other requirements and mentioned them. There's this prefetching, and in the HTML issue tracker there's a discussion about render blocking. I'm really skeptical about limiting it to one thing. And we already have requirements exactly like this in the JavaScript specification, about idempotence in the host hooks. So I really don't understand what's difficult here. We could say we don't trust the hosts to follow the restrictions that we put in the specification, but it's hard for me to see why that would be.
GB: It's difficult to define uniqueness of identity for a bag of host defined values. It's very difficult thing to define. And as I say just it would help to see driving use cases. I've yet to see an example of a use case.
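The identity question raised here can be made concrete with a minimal sketch. It assumes a host that caches loaded assets; the names `cacheKey`, `loadOnce`, and `bagKey` are hypothetical and come from neither the proposal nor any specification. With a single reflection string the cache key is simply the pair (resolved specifier, reflection), whereas an options bag forces the host to first pick a rule for when two bags count as the same.

```javascript
// Hypothetical host-side cache; all names here are illustrative assumptions.
const cache = new Map();

// With one reflection string, identity is the pair (specifier, reflection).
function cacheKey(resolvedSpecifier, reflection) {
  // Stringifying a two-element array keeps the two parts unambiguous.
  return JSON.stringify([resolvedSpecifier, reflection ?? null]);
}

// Idempotent load: the same pair always yields the same cached result.
function loadOnce(resolvedSpecifier, reflection, load) {
  const key = cacheKey(resolvedSpecifier, reflection);
  if (!cache.has(key)) {
    cache.set(key, load(resolvedSpecifier, reflection));
  }
  return cache.get(key);
}

// With an options bag, the host must decide what "the same bag" means.
// One possible rule (an assumption): sort the keys and stringify the values.
function bagKey(resolvedSpecifier, options) {
  const entries = Object.keys(options)
    .sort()
    .map((name) => [name, String(options[name])]);
  return JSON.stringify([resolvedSpecifier, entries]);
}
```

Under this particular rule, `{ a: 1 }` and `{ a: "1" }` produce the same key. Whether they should is exactly the kind of decision each host would have to make consistently.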
DE: So, I think we should continue this discussion offline. I think there's a lot to talk through here about how the proposals interact, the the generalization to a key value bag is something that I feel really strongly about and if you don't find the use cases that I'm giving persuasive than we should, we should talk further about that. But there's just, there's just a lot to talk through here.
GB: Yeah, my point is not that there are no use cases for options bags, but that those use cases don't justify unifying reflection with custom evaluation options. My point is more, that reflection is something that has this special property in the resolver. That is the fact that we have the special reflection idempotence. That would be difficult to extend to a custom option system for which that extension I don't see that the use case justifiy from the current position. Certainly, that's that's discussed through those use cases further, but I just wanted to make that position clear.
DE: Yeah, I mean, in the current JavaScript specification, or set of stage three things, there are already requirements on import assertions, on certain assertion keys. So I think all this is specifiable. Maybe we don't want to make that trade-off.
LCA: If I can jump in here real quick, I think one thing we've seen with assertions specifically is that even though these requirements (not limitations, requirements) exist in the spec saying that certain things must be idempotent, and even if engines like V8 or JSC or similar implement these correctly, it is difficult to get the entire ecosystem to understand that these requirements exist. And they will often want to bypass them because, look, there's an options bag on my import now, and I'm a bundler, and I want to put an option on my import, even though that is technically not something we wanted to support with import assertions. It is something that we have seen over the last year or so, that people have started wanting to do with these assertions, and I would be rather skeptical of giving people more ways to misunderstand what these assertions are for, or what these reflection attributes are for. But yeah, we should probably discuss this offline.
DE: Yeah, it sounds like we're already living within this bad world. I agree with the comment that bakkot made on the chat. But, yeah, we can keep talking.
JRL: I was just going to agree with DE's statements there. I want a key-value store. We're discussing it right now in the Matrix chat, but essentially I want to use the fact that I can put anything into this, so that the evaluator, or the bundler that evaluates the modules, or the linker can act differently, or so that it is included in my source. I don't imagine we are going to approve every use case individually, like asset reference. Maybe I want to do it differently. Having a key-value pair allows me to express everything that I want in the source code.
DE: So, the use case being bundler-specific tooling hints.
GB: We should definitely discuss some of the stuff further. But yeah, I think we've discussed this topic, you know.
SYG: I am in support of the current proposal. In principle, I agree that we should not be limiting syntax because bundlers will abuse it. But the signal you get from bundlers abusing a particular syntax could be that they simply want to use something; most of the time, though, it's because they have a pressing need that has surfaced, and they look for the closest fit. In this case they decided on import assertions, and the main signal I get from that is that there's a real need for it. If reflection solves that for the asset use case, in addition to the wasm use case, then these are very important use cases, as DE said. So I guess what I'm saying is that there seems to be some urgency in this space that is not present in some of the other module proposals so far. We could go down the road of assertions being more abused, but if we have independently motivated reasons to give them a space that they can actually use without violating the spirit of the spec, that seems good too. It's not clear to me currently, with the point that DE brought up, whether the key-value pairs are a prerequisite for this to move forward. Is that what you were saying, DE?
DE: I didn't phrase it like that, but I prefer that way.
SYG: Okay. Yeah, I'm also not entirely clear on why that is so important, I guess.
DE: For the exact same reason we did it for import assertions.
SYG: Okay. So the idea is basically to reflect the options bags that are already part of import assertions in the reflection proposal.
DE: So I think we would use a different keyword, because the "assert" keyword already indicates that. But for dynamic import, the second argument passed to dynamic import is an options bag in which the import assertions have a separate layer of keys. Maybe that was overdesigned and we should have just gone with a single layer in the first place, but I think we should just do this analogously to that in terms of syntax and integration with dynamic import and everything like that. It would just make things simpler and more extensible.
GB: Yeah, as I say, I would still really value first seeing use cases, because I think a lot of this stuff is build metadata, which is not something that should be a goal of TC39, or maybe it gets us into typing discussions. But evaluator or custom evaluation options should be runtime things. As runtime things they need standardization between hosts, and if users are going to put things in there that aren't standard or that aren't going to work between different hosts, then I don't understand how that's going to play out practically, because it's very difficult to see a scenario where it doesn't just lead to fragmented support. And because this stuff is static at the module level, if something doesn't support it, it's just not going to run. So I struggle to see the value proposition in a world where these things create semantic changes that are needed by users, without a clear example of a use case that can be supported at runtime between all the hosts and that provides a clear and coherent use case. Again, that's not to say that I don't think it's a useful thing to explore. But bundling it with this reflection proposal, making it a dependency, and saying that the reflection proposal needs to address this other, completely unrelated problem is what I struggle with, when reflection very specifically has this clarity and ability to get convergence between hosts.
DE: So, let me repeat a use case that I believe in strongly, which is prefetching and preloading. There are a few different things on the web that you'd want to pass for that. You may need to pass certain parameters that get used by fetch; in environments that are not fetching over the network, these can be safely ignored. You may also pass a priority. I think prefetching assets is just a very useful feature that we're missing from ES module loading right now, and this proposal has the potential to fill that in, but only if more parameters can be passed through.
GB: Yeah, okay. LCA, it could be worth bringing up the asset references, but let's continue this offline.
JRL: The "asset-reference" here essentially is creating a brand new area that can define a different evaluation for the module. But we already have this. We just badly named it. When we were deciding on assert, so that we could import CSS or HTML or wasm, we decided to call it assert because we made an arbitrary restriction that we can't change the evaluation of the module based on the assert statement. But now we know we need to change the evaluation of the module to give us something different. We don't just want an instantiated wasm module, we want the option to have an uninstantiated module. Personally, I would rather that we don't create a new as X or whatever else. We just put it into the thing we already created with assert. We remove the arbitrary limit that we can't change evaluation in assert. It was unnecessary to begin with and we did it just so that we could get it shipped quickly, but now it just creates confusion about where you're supposed to put the two different things. As a user, I don't want to know or care that something is an evaluator attribute or an assert attribute. I just want to put it in my import and get an output.
JHD: It was not arbitrary. It was explicitly done to intentionally restrict it to assertions about the module itself. The need for these evaluator attributes was very much known at the time and was explicitly split up as the only way we could ship it at all. It's not about shipping it quickly. I don't know if I was the only one, but I certainly objected to anything that allowed an evaluator attribute to be in the same place as assertions. I think that as a user you should and must know and care whether you are making an assertion about the module, like that it's JSON, versus when you're trying to say “I want to import a different form than the module normally would have”. It was named that way explicitly to forever prevent anything but assertions from going in it, and that's not something we should change.
RPR: Okay, I'm going to point out we're actually over time on this. We've got a little bit of extra time, so we can do another four minutes, but please keep responses short on this.
MAH: Really quick. There is something I don't fully understand so far. We seem to be mostly talking about the use case of loading an asset, whether that's a JavaScript module, a wasm module, or a PNG, but not evaluating it, keeping it in some kind of inert form. If that's the only use case, do we need as something, or can we have a syntax that just caters to that use case? For example, import static or something like that.
LCA: Yes, this is technically possible. I want to disagree with the statement that the only thing we presented was the static import, because asset references are distinct from the distinction between instantiated and uninstantiated modules: asset references are not just uninstantiated, they're also unloaded. So that's one further distinction, and it doesn't cover all use cases. There are some use cases which we haven't presented. I have some slides at the back of the deck here.
MAH: What do you mean by unloaded exactly?
LCA: Unloaded in that this PNG is not actually loaded at the time of the import. This is easier to explain with a concrete example. On the web, if you import this PNG, the host resolver would resolve foo.png, return the resolved specifier string, and wrap it inside an object which is unforgeable and frozen. You can then use this object instead of a URL in things like fetch or import to later import that asset. There's also another use case, which is having an alternative reflection for something like JSON imports, where we want a read-only version of an import, for example one which returns a records/tuples representation rather than a mutable version like it does now. That also doesn't really fit into either of those. That does not mean we are completely opposed to this being something like what was shown in TCQ, where static is a keyword.
YSV: I just want to voice support for this again. I voiced support when it was presented last time. It's interesting to see how this is evolving. I do have a little bit of a concern about exposing linking, but I also understand why you might want to do that. When I was thinking about deferred module evaluation, fully exposing the module API to the user was something that I considered, and I was warned that this may lead to some error-prone decisions on the side of users. And I guess that this also goes in the direction of the conversation about bundlers having assets reflected so that they can make decisions at compile time. I don't really have much to say about that. I'm curious to see where this goes.
SYG: <Queue: “Can we get some concrete AIs for stakeholders? As I said I sense some ecosystem urgency.”> I've heard some concrete concerns from Dan about the key-value thing, and some ongoing concrete concerns about intersection with other proposals in the module space. As I said, I sense some ecosystem urgency here, with the existing abuse of assertions and, I suppose, the continued abuse of assertions. I would like us to get agreement on some concrete action items for the stakeholders in this space, instead of GB and LCA coming back trying to make progress, people raising concerns, and then rinse and repeat. Can we get some kind of agreement or commitment that the stakeholders will try to make progress or bump this on their list of priorities?
DE: As a champion of module blocks, which I think is kind of a stakeholder proposal, I'm going to be coming to the SES calls as well. I think we probably have so much to discuss that we should set up regular meetings to move this whole set of proposals forward, whether that's the SES call or a separate new regular call. Either way I'm happy with that.
SYG: Seconded, for a separate call from the SES call. Okay, then I will either volunteer Dan or volunteer myself. Let's gather on the reflector to set up a separate regular working session for all this module stuff. That would be great.
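The asset reference described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the proposal's API: the class name `AssetReference`, the helper `resolveAsset`, and the example URLs are all hypothetical. The sketch only shows the shape of the idea: the specifier is resolved to a URL string, nothing is fetched, and the result is wrapped in a frozen object whose contents live in a private field.

```javascript
// Hypothetical sketch of an "asset reference"; names are illustrative only.
class AssetReference {
  #url; // private field: not readable or writable from outside the class

  constructor(url) {
    this.#url = url;
    Object.freeze(this); // no public properties can be added or changed
  }

  // Only code with access to this class can unwrap a reference.
  static urlOf(ref) {
    // The `#url in ref` brand check rejects look-alike objects.
    if (!(#url in ref)) throw new TypeError("not an AssetReference");
    return ref.#url;
  }
}

// Stand-in for the host resolver: resolves the specifier, loads nothing.
function resolveAsset(specifier, baseUrl) {
  return new AssetReference(new URL(specifier, baseUrl).href);
}

const pngRef = resolveAsset("./foo.png", "https://example.com/app/main.js");
console.log(AssetReference.urlOf(pngRef)); // prints "https://example.com/app/foo.png"
```

For the object to be unforgeable in practice, a host would keep the constructor to itself and hand out only instances. In this sketch the class is in scope for everyone, so that part is assumed rather than enforced.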
GB: Thanks for all the input.
Conclusion/Resolution
- Proposal is not advancing at this time
- SYG or DE to open an issue on the reflector to set up a call with stakeholders
Incubator calls
- array.fromAsync
- decorator metadata
- bigint math
- bindThis
- pipeline