Refactor of import-related host hooks
August 23, 2022
Proposed changes
ECMA-262 currently exposes two hooks related to module loading: HostResolveImportedModule and HostImportModuleDynamically.
HostResolveImportedModule(referencingScriptOrModule, specifier) synchronously resolves an imported module and returns the corresponding module record. While module resolution and loading are usually asynchronous, this was a good enough abstraction for the ES2015 modules specification: before evaluating a module, hosts could pre-build the module graph and asynchronously load all its dependencies. This asynchronous step was not observable from ECMA-262, whose algorithms were only run once all the dependencies were synchronously available.
When we introduced dynamic imports in ES2020, this abstraction leaked: the asynchronous loading part needed to run during the execution of other ECMAScript code, so we had to introduce the new host hook HostImportModuleDynamically(referencingScriptOrModule, specifier, promiseCapability) to give hosts the opportunity to asynchronously prepare for the synchronous HostResolveImportedModule calls.
When loading and evaluating modules, either through host-defined mechanisms such as <script> tags or through HostImportModuleDynamically via dynamic import, the complete algorithm is divided between the host and ECMA-262:
1. (host, potentially async) Load the module graph:
   1. (host, potentially async) Load the Module Record.
   2. (host) Get all its static dependency specifiers.
   3. (host) For each dependency, do 1.i.
2. (host) Call `.Link()` on the top-level Module Record:
   1. (ECMA-262) Get all its static dependency specifiers.
   2. (ECMA-262) For each dependency:
      1. (ECMA-262) Call `HostResolveImportedModule(specifier, referencingModule)`.
         1. (host) Get the pre-loaded module corresponding to the `(specifier, referencingModule)` pair.
      2. (ECMA-262) Validate that the imported bindings are actually exported.
      3. (ECMA-262) Do 2.i for the resulting module.
3. (host, potentially async) Call `.Evaluate()` on the top-level Module Record.
This refactor proposal aims to revisit the layering decision made by the dynamic import proposal. Rather than introducing a new hook to permit async host steps for import() calls, it replaces HostResolveImportedModule with an equivalent but async-compatible HostLoadImportedModule hook, which loads a single module; ECMA-262 then iterates through its dependencies, asking the host to load them. The updated algorithm is:
1. (host or ECMA-262, potentially async) Load the module graph:
   1. (ECMA-262, potentially async) Call `HostLoadImportedModule(specifier, referencingModule)`.
   2. (ECMA-262) Get all its static dependency specifiers.
   3. (ECMA-262) For each dependency, do 1.i.
2. (host or ECMA-262) Call `.Link()` on the top-level Module Record:
   1. (ECMA-262) Get all its static dependencies.
   2. (ECMA-262) For each dependency:
      1. (ECMA-262) Validate that the imported bindings are actually exported.
      2. (ECMA-262) Do 2.i for the resulting module.
3. (host or ECMA-262, potentially async) Call `.Evaluate()` on the top-level Module Record.
where host or ECMA-262 means "host if the algorithm is run by a host-defined mechanism such as <script> tags, ECMA-262 if it's run by dynamic import".
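The graph-loading step of the updated algorithm can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not spec text: `ModuleRecord`, `HostLoadHook`, `loadModuleGraph`, and the `finish` callback (standing in for FinishLoadImportedModule) are hypothetical names, and error handling is left out. The point it tries to show is that the host only ever loads one module per call, while the traversal of dependencies lives on the ECMA-262 side and works whether the host answers synchronously or later.

```typescript
// Hypothetical, heavily simplified stand-in for a Cyclic Module Record.
interface ModuleRecord {
  specifier: string;
  requestedModules: string[];               // static dependency specifiers
  loadedModules: Map<string, ModuleRecord>; // specifier -> loaded dependency
}

// Stand-in for HostLoadImportedModule. The host loads ONE module and reports it
// through `finish` (standing in for FinishLoadImportedModule), now or later.
type HostLoadHook = (
  referrer: ModuleRecord | null,
  specifier: string,
  finish: (module: ModuleRecord) => void,
) => void;

// Stand-in for the ECMA-262 side of step 1: it owns the graph traversal.
function loadModuleGraph(
  entry: string,
  hostLoad: HostLoadHook,
  onDone: (root: ModuleRecord) => void,
): void {
  const visited = new Set<ModuleRecord>();
  let pendingCount = 0;
  let root: ModuleRecord | undefined;

  const request = (referrer: ModuleRecord | null, specifier: string): void => {
    pendingCount++;
    hostLoad(referrer, specifier, (module) => {
      if (referrer === null) root = module;
      else referrer.loadedModules.set(specifier, module);
      // Only walk the dependencies of a module the first time it is seen,
      // so that cycles terminate.
      if (!visited.has(module)) {
        visited.add(module);
        for (const dependency of module.requestedModules) request(module, dependency);
      }
      pendingCount--;
      if (pendingCount === 0 && root !== undefined) onDone(root);
    });
  };

  request(null, entry);
}

// Example host that answers synchronously from an in-memory table (assumed data).
const dependencyTable: Record<string, string[]> = { a: ["b", "c"], b: ["c"], c: ["a"] };
const registry = new Map<string, ModuleRecord>();
const syncHost: HostLoadHook = (_referrer, specifier, finish) => {
  let module = registry.get(specifier);
  if (module === undefined) {
    module = {
      specifier,
      requestedModules: dependencyTable[specifier] ?? [],
      loadedModules: new Map(),
    };
    registry.set(specifier, module);
  }
  finish(module);
};

const roots: ModuleRecord[] = [];
loadModuleGraph("a", syncHost, (root) => {
  roots.push(root);
});
```

With a synchronous host like this one, `onDone` runs before `loadModuleGraph` returns; a host that calls `finish` from a network callback would use the same traversal unchanged.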
Motivation
This refactor is beneficial on its own: it reduces the amount of behavior delegated to the host, because ECMA-262 takes ownership of the loading steps shared across all the hosts that use asynchronous loading.
However, it's most useful for some current proposals that introduce the concept of a "module whose dependencies have not been loaded yet":
- Module Blocks and Compartments need a new host hook to "load the dependencies of an already loaded module" (`HostImportModuleRecordDynamically`);
- Import Reflection needs a new host hook to "load a module without loading its dependencies" (`HostResolveModuleReflection`).
HostLoadImportedModule is low-level enough that it already satisfies those use cases:
- Module Blocks and Compartments can recursively call `HostLoadImportedModule` on the dependencies of the unlinked module;
- Import Reflection can use `HostLoadImportedModule` to load a single module without recursively loading its dependencies.
This refactor reduces the number of loading-related host hooks from 2 to 1, and prevents it from growing to 4 in the future.
Constraints
This refactor should not force module loading to be asynchronous:
- For hosts that currently load modules synchronously, forcing a promise tick for each recursive dependency might cause a big performance regression.
- Some hosts (for example, Bun) allow synchronously importing modules that don't use top-level await, by relying on the fact that `module.Evaluate()` returns a resolved promise and synchronously inspecting its value.
With this refactor both are still possible: HostLoadImportedModule can synchronously give control back to ECMA-262 (hosts can reuse the logic they had in HostResolveImportedModule), which then synchronously continues the loading process. The new module.LoadRequestedModules() will then return a resolved promise.
Hosts can still implement synchronous import of modules:
1. Load the module.
2. Call `module.LoadRequestedModules()`, which returns a `loadPromise`.
3. If `loadPromise.[[Status]]` is `rejected`, throw; otherwise it's `fulfilled`.
4. Call `module.Link()`.
5. Call `module.Evaluate()`, which returns an `evalPromise`.
6. If `evalPromise.[[Status]]` is `rejected`, throw.
7. If `evalPromise.[[Status]]` is `pending`, throw (it's using top-level await).
8. Return `GetModuleNamespace(module)`.
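The steps above could look roughly like this inside a host. This is a sketch under stated assumptions: userland JavaScript cannot read a native promise's [[Status]] synchronously, so `SpecPromise` is a hypothetical model of what an engine sees internally, and `SyncLoadableModule`, `importSync`, and `fakeModule` are illustrative names rather than real APIs. The `namespace` property stands in for `GetModuleNamespace(module)`.

```typescript
// Hypothetical model of a spec-level promise whose [[Status]] the host can read.
type SpecPromise =
  | { status: "pending" }
  | { status: "fulfilled" }
  | { status: "rejected"; reason: unknown };

// Hypothetical view of a Module Record, as seen from inside the host.
interface SyncLoadableModule {
  LoadRequestedModules(): SpecPromise;
  Link(): void;
  Evaluate(): SpecPromise;
  namespace: Record<string, unknown>; // stand-in for GetModuleNamespace(module)
}

function importSync(module: SyncLoadableModule): Record<string, unknown> {
  const loadPromise = module.LoadRequestedModules();
  // With a synchronously loading host, the promise is already settled here.
  if (loadPromise.status === "rejected") throw loadPromise.reason;

  module.Link();

  const evalPromise = module.Evaluate();
  if (evalPromise.status === "rejected") throw evalPromise.reason;
  if (evalPromise.status === "pending") {
    throw new Error("module uses top-level await and cannot be imported synchronously");
  }
  return module.namespace;
}

// Test double: a module whose evaluation settles in the given state.
function fakeModule(evalResult: SpecPromise): SyncLoadableModule {
  const namespace: Record<string, unknown> = {};
  return {
    LoadRequestedModules: () => ({ status: "fulfilled" }),
    Link: () => {},
    Evaluate: () => {
      if (evalResult.status === "fulfilled") namespace["answer"] = 42;
      return evalResult;
    },
    namespace,
  };
}

const syncNamespace = importSync(fakeModule({ status: "fulfilled" }));
console.log(syncNamespace["answer"]); // 42
```

Passing `fakeModule({ status: "pending" })` instead makes `importSync` throw, which mirrors the top-level await case in step 7.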
Hosts integration
Hosts can use these new ECMA-262 algorithms in two ways.
The most straightforward integration is to keep the loading algorithms used for HostResolveImportedModule/HostImportModuleDynamically.
Assuming that the old hooks are implemented as follows:
- HostResolveImportedModule(`referencingScriptOrModule`, `specifier`):
  1. Let `fullSpecifier` be the result of resolving `specifier` from `referencingScriptOrModule` (for example, via URLs).
  2. Return the already loaded Module Record corresponding to `fullSpecifier`.
- HostImportModuleDynamically(`referencingScriptOrModule`, `specifier`, `promiseCapability`):
  1. Let `fullSpecifier` be the result of resolving `specifier` from `referencingScriptOrModule` (for example, via URLs).
  2. (async) Let `module` be the result of loading and parsing the source code corresponding to `fullSpecifier`.
  3. (async) Recursively load and parse the dependencies of `module`.
  4. Call `module.Link()`.
  5. Let `evaluationPromise` be the result of calling `module.Evaluate()`.
  6. Call FinishDynamicImport(`referencingScriptOrModule`, `specifier`, `promiseCapability`, `evaluationPromise`).
The new host hook would be implemented as follows:
- HostLoadImportedModule(`referrer`, `specifier`):
  1. Let `fullSpecifier` be the result of resolving `specifier` from `referrer` (for example, via URLs).
  2. If a Module Record corresponding to `fullSpecifier` has already been loaded, then:
     1. Let `module` be such Module Record.
     2. Return `module`.
  3. (async) Let `module` be the result of loading and parsing the source code corresponding to `fullSpecifier`.
  4. (async) Recursively load and parse the dependencies of `module`.
  5. Call FinishLoadImportedModule(`referrer`, `specifier`, `module`).
A more advanced refactor would avoid the step that recursively loads and parses the dependencies of the module in the above HostLoadImportedModule implementation, and fully delegate the dependency discovery algorithm to ECMA-262. Hosts should carefully consider the differences between the ECMA-262 algorithm and their own before doing so.
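As an illustration of the more advanced variant, a host hook that loads exactly one module and leaves dependency discovery to ECMA-262 might be shaped like the sketch below. Everything here is an assumption made for the example: `LoadedModule`, `HostEnv`, and `makeHostLoadImportedModule` are hypothetical names, `resolve` and `fetchSource` stand for whatever resolution and fetching the host already has, and reporting every result (including cached ones) through a `finishLoadImportedModule` callback is a choice of this sketch.

```typescript
// Hypothetical, simplified stand-in for a Module Record.
interface LoadedModule {
  url: string;
  source: string;
}

// Assumed host-provided pieces (not spec APIs): a specifier resolver, a source
// fetcher, and a callback standing in for FinishLoadImportedModule.
interface HostEnv {
  resolve(specifier: string, referrerUrl: string): string;
  fetchSource(url: string): Promise<string>;
  finishLoadImportedModule(
    referrerUrl: string,
    specifier: string,
    result: LoadedModule | Error,
  ): void;
}

function makeHostLoadImportedModule(env: HostEnv) {
  const loaded = new Map<string, LoadedModule>();
  const inFlight = new Map<string, Promise<LoadedModule>>();

  return function hostLoadImportedModule(referrerUrl: string, specifier: string): void {
    const fullSpecifier = env.resolve(specifier, referrerUrl);

    // Already loaded: hand the record back synchronously, without a promise tick.
    const cached = loaded.get(fullSpecifier);
    if (cached !== undefined) {
      env.finishLoadImportedModule(referrerUrl, specifier, cached);
      return;
    }

    // Share one fetch per URL so that every referrer gets the same record.
    let pending = inFlight.get(fullSpecifier);
    if (pending === undefined) {
      pending = env.fetchSource(fullSpecifier).then((source) => {
        const module: LoadedModule = { url: fullSpecifier, source };
        loaded.set(fullSpecifier, module);
        return module;
      });
      inFlight.set(fullSpecifier, pending);
    }

    // No recursive dependency loading here: that part is left to ECMA-262.
    pending.then(
      (module) => env.finishLoadImportedModule(referrerUrl, specifier, module),
      (error: unknown) =>
        env.finishLoadImportedModule(
          referrerUrl,
          specifier,
          error instanceof Error ? error : new Error(String(error)),
        ),
    );
  };
}
```

The two maps keep the synchronous fast path for modules that are already loaded while still deduplicating concurrent requests for the same resolved specifier.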
Is this normative or editorial?
This proposal changes the number of spec-defined promise ticks when successfully importing a module with import("foo").
| Has "foo" already been imported? | Is "foo" a Cyclic Module Record? | Old number of ticks | New number of ticks |
|---|---|---|---|
| Yes, from the same module | Yes | (host-defined ≥ 1) + 1 | 2 |
| Yes, from somewhere else | Yes | (host-defined ≥ 1) + 1 | (host-defined ≥ 0) + 2 |
| No | Yes | (host-defined ≥ 1) + 1 | (host-defined ≥ 0) + 2 + (Eval ≥ 0) |
| Yes, from the same module | No | (host-defined ≥ 1) + 1 | 2 + (Eval ≥ 0) |
| Yes, from somewhere else | No | (host-defined ≥ 1) + 1 | (host-defined ≥ 0) + 2 + (Eval ≥ 0) |
| No | No | (host-defined ≥ 1) + 1 | (host-defined ≥ 0) + 2 + (Eval ≥ 0) |
- (host-defined) represents the number of ticks that the host needs to load the module and its dependencies. With the old behavior it's the time needed by the host to call `FinishDynamicImport` and to resolve its `promise` argument; with the new behavior it's the time taken by all the `HostLoadImportedModule` executions to call `FinishLoadImportedModule`. I assume that the "at some future time" mention in the `HostImportModuleDynamically` description implies that it takes at least 1 tick.
- (Eval) represents the number of promise ticks used by the `.Evaluate()` method of module records.
Open questions
Should module.LoadRequestedModules() live on Abstract Module Record or Cyclic Module Record?
ECMA-262 only has the concept of dependencies for Cyclic Module Records, but this method also makes sense for other Module Records that have dependencies not exposed to ECMA-262.
Is it possible to use a single method that does both module.LoadRequestedModules() and module.Link()?
module.Link() uses Tarjan's algorithm to find strongly connected components (SCCs) in the module graph and transition their elements' status from linking to linked at the same time. This algorithm tracks SCCs using a mutable stack where it pushes/pops Module Records while traversing the graph.
This approach doesn't work with module.LoadRequestedModules(), because it visits multiple paths of the graph concurrently: a mutable stack would cause race conditions in the detection of different SCCs. For this reason, module.LoadRequestedModules() transitions the modules' status from new to unlinked after loading the whole graph.
- Is it important that module.Link() transition the status from `linking` to `linked` as soon as possible?
- Is there an efficient alternative to Tarjan's algorithm that works with concurrent traversals?
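For readers less familiar with it, here is a compact textbook version of Tarjan's algorithm over a plain dependency map. It is not the spec's linking algorithm, and `stronglyConnectedComponents` and `moduleGraph` are names made up for this example; it is only meant to show the single mutable `stack` that a depth-first traversal pushes to and pops from, which is the part that would not survive several traversals running concurrently.

```typescript
// Textbook Tarjan's algorithm: returns the strongly connected components of a
// directed graph, with dependencies emitted before the nodes that depend on them.
function stronglyConnectedComponents(graph: Map<string, string[]>): string[][] {
  let nextIndex = 0;
  const index = new Map<string, number>();   // discovery order of each node
  const lowlink = new Map<string, number>(); // lowest index reachable via the stack
  const stack: string[] = [];                // the shared mutable stack
  const onStack = new Set<string>();
  const components: string[][] = [];

  const visit = (node: string): void => {
    index.set(node, nextIndex);
    lowlink.set(node, nextIndex);
    nextIndex++;
    stack.push(node);
    onStack.add(node);

    for (const dependency of graph.get(node) ?? []) {
      if (!index.has(dependency)) {
        visit(dependency);
        lowlink.set(node, Math.min(lowlink.get(node)!, lowlink.get(dependency)!));
      } else if (onStack.has(dependency)) {
        // Back edge into the component currently being built.
        lowlink.set(node, Math.min(lowlink.get(node)!, index.get(dependency)!));
      }
    }

    // `node` is the root of a component: pop the whole component at once.
    if (lowlink.get(node) === index.get(node)) {
      const component: string[] = [];
      let member: string;
      do {
        member = stack.pop()!;
        onStack.delete(member);
        component.push(member);
      } while (member !== node);
      components.push(component);
    }
  };

  for (const node of graph.keys()) {
    if (!index.has(node)) visit(node);
  }
  return components;
}

// a -> b -> c -> b (cycle), and c -> d
const moduleGraph = new Map<string, string[]>([
  ["a", ["b"]],
  ["b", ["c"]],
  ["c", ["b", "d"]],
  ["d", []],
]);
const sccs = stronglyConnectedComponents(moduleGraph);
console.log(JSON.stringify(sccs)); // [["d"],["c","b"],["a"]]
```

All members of a component are popped together, which corresponds to the "transition their elements' status at the same time" behavior described above.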