Virtual Module Sources
October 20, 2022 · View on GitHub
Synopsis
Extend the Module constructor such that it accepts virtual module sources:
objects that implement a protocol that is sufficient for virtualizing the
evaluation of modules in languages not anticipated by ECMA-262 or a host
implementations.
Motivation
Hosts can add support for linking non-EcmaScript languages by creating
new kinds of Module Source Record.
For example, a host could add a [[ModuleSource]] internal slot to
WebAssembly.Module and then link arbitrary WASM into
an EcmaScript module graph.
However, user code cannot emulate a host in this way. To extend or experiment with new languages or new module-like features, user code needs a way to emulate or virtualize a module source record.
Such a protocol would allow user code to integrate CommonJS, JSON, or WASM modules on hosts that do not. It would also open a field for experimentation with other kinds of resource modules.
Description
The first argument to the Module constructor is a source.
If that source is an object that does not have a [[Module Source]] internal
slot, we treat this object as a protocol for loading, binding, linking,
initializing, and executing the new Module instance.
Interface
type VirtualModuleSource = {
// Indicates the import and export bindings the module has
// between its module environment record, module namespace exotic object,
// and its dependencies.
bindings?: Array<Binding>,
// Executes the module if it is imported.
// execute may return a promise, indicating that the module uses
// the equivalent of top-level-await.
execute?: (namespace: ModuleImportsNamespace, {
import?: (importSpecifier: string) => Promise<ModuleExportsNamespace>,
importMeta?: Object,
globalThis: Object,
}) => void,
// Indicates that execute needs to receive a dynamic import function
// bound to a Module instance.
needsImport?: boolean,
// Indicates that initialize needs to receive an importMeta.
needsImportMeta?: boolean,
};
Bindings must be one of the shapes proposed in module source static analysis, such that for each binding gets linked in a virtual module.
The Module constructor from Module and ModuleSource extends to
accept virtual module sources instead of ModuleSource and reflects whatever
source its given as the source property of the returned instance.
Examples
JSON
This protocol allows user code to create new module source constructors. For example, for a host that does not support JSON, this could be accomplished with a small virtual module source.
class JsonModuleSource {
bindings = { export: 'default' };
constructor(text) {
// Throw SyntaxError if the source is invalid, from here.
this.#object = JSON.parse(text);
};
execute(imports) {
// Exports of multiple module instances backed by this
// source should be referentially independent.
imports.default = clone(this.#object);
};
}
const source = new JsonModuleSource({ meaning: 42 });
const module = new Module(source);
const { default: { meaning } } = await import(module);
Asset modules of various kinds would largely follow this pattern.
WASM
On a host that does not provide support for WebAssembly, a virtual module source would suffice.
class WasmModuleSource {
constructor(buffer) {
const module = new WebAssembly.Module(buffer);
this.#imports = WebAssembly.Module.imports(module);
this.#exports = WebAssembly.Module.exports(module);
this.bindings = [
...this.#imports.map(({ module, name }) =>
({ import: name, from: module })),
...this.#exports.map(({ name }) =>
({ export: name })),
];
};
async execute(namespace) {
const importObject = {};
for (const { module, name, kind } of this.#imports) {
importObject[name] = namespace[name];
}
const instance = await WebAssembly.instantiate(module, importObject);
for (const { name } of this.#exports) {
namespace[name] = instance[name];
}
};
}
CommonJS
There is no, singular perfect solution for binding CommonJS, especially taking into account that an asynchronous loader cannot immitate Node.js's synchronous loader. But, any Node.js library that is portable to the web must be sufficiently transparent to transparent analysis that bundlers can subsume them, and so, large amounts of the CommonJS ecosystem are in fact portable.
This is a sketch of one possible solution for binding CommonJS in ESM, based on the solution in Endo.
CommonJS virtual module source based on heuristic static analysis.
class CjsModuleSource {
constructor(source, url) {
this.source = source;
// Lexical analysis of a CommonJs module reveals the bindings
// for named exports and invents bindings for imported namespace
// objects for every lexically evident require call with a string argument,
// such that require just returns a property of the
// ESM internal namespace to the already-linked imported module
// namespace.
const { bindings, requires } = lexicallyAnalyzeCjs(source, url);
this.bindings = bindings;
this.#requires = requires;
};
async execute(namespace, { importMeta, globalThis }) {
const functor = new globalThis.Function(
'require', 'exports', 'module', '__filename', '__dirname',
source, // Inject source URL here (caveat Github Markdown)
);
namespace.default = Object.create(globalThis.Object.prototype);
// Set all exported properties on the defult and call namedExportProp to
// add them on the namespace for import *.
// Root namespace is only accessible for imports.
// Requiring from CommonJS gets the default field of the namespace.
const promoteToNamedExport = (prop, value) => {
// __esModule needs to be present for typescript-compiled modules to
// work, can't be skipped.
if (prop !== 'default') {
namespace[prop] = value;
}
};
const originalExports = new Proxy(namespace.default, {
set(_target, prop, value) {
promoteToNamedExport(prop, value);
namespace.default[prop] = value;
return true;
},
defineProperty(target, prop, descriptor) {
if (has(descriptor, 'value')) {
// This will result in non-enumerable properties being enumerable
// for named import purposes.
promoteToNamedExport(prop, descriptor.value);
}
// All the defineProperty trickery with getters used for lazy
// initialization will work.
// The trap is here only to elevate the values with namedExportProp
// whenever possible.
// Replacing getters with wrapped ones to facilitate
// propagating the lazy value to the namespace is not possible because
// defining a property with modified
// descriptor.get in the trap will cause an error.
// We use Object.defineProperty instead of Reflect.defineProperty for better
// error messages.
Object.defineProperty(target, prop, descriptor);
return true;
},
});
// Machinery to distinguish module.exports assignment.
let finalExports = originalExports;
const module = freeze({
get exports() {
return finalExports;
},
set exports(value) {
finalExports = value;
},
});
const require = specifier => {
return namespace[this.#requires[specifier]];
};
functor(require, moduleExports, module, filename, dirname);
// Promotes keys from redefined module.export to top level namespace for import *
// Note: We could do it less consistently but closer to how node does it if
// we iterated over exports detected by the lexer.
const exportsHaveBeenReassigned = finalExports !== originalExports;
if (exportsHaveBeenReassigned) {
namepsace.default = finalExports;
keys(namepsace.default || {}).forEach(prop => {
if (prop !== 'default')
namepsace[prop] = namepsace.default[prop];
});
}
};
}
Pass-through module sources
This example illustrates how a virtual module source can simply reexport another module with no special logic in an executor.
import * as direct from 'real.js';
const source = { bindings: { exportAllFrom: 'real.js', as: 'real' } };
const module = new Module(source);
const { real: indirect } = await import(module);
direct === indirect; // true
Dependencies
This proposal depends on Module and ModuleSource from the Compartments
proposal to introduce ModuleSource, and ModuleSource
analysis from the Compartments proposal.
Design
For every Module instance that has a virtual module source,
module import machinery constructs a real Module Imports Namespace,
an exotic object that allows user code to get and set the values for each
binding.
Bindings must be one of the shapes proposed in module source static analysis, such that for each binding gets linked in a virtual module:
-
{ import: string, from: string }Links the named
importproperty of the Module Imports Namespace to the same name of thefrommodule's Module Exports Namespace. -
{ import: string, as: string, from: string }Links the named
asproperty of the Module Imports Namespace to theimportname of thefrommodule's Module Exports Namespace. -
{ export: string }Links the named
exportproperty of this module's Module Imports Namespace directly to its Module Exports Namespace. -
{ export: string, as: string}Links the named
exportproperty of this module's Module Imports Namespace to the namedasproperty of this module's Module Exports Namespace. -
{ export: string, as: string, from: string }Links the named
exportproperty of thefrommodule's Module Exports Namespace to the namedasproperty of this module's Module Exports Namespace, bypassing this module's Module Imports Namespace entirely. -
{ importAllFrom: string, as: string }Links the Module Exports Namespace of the module
importAllFromto the nameasin this module's Module Imports Namespace. -
{ exportAllFrom: string }Links all of the names except
defaultexported by theexportAllFrommodule to this module's Module Exports Namespace, bypassing this module's Module Imports Namespace entirely. -
{ exportAllFrom: string, as: string }Links the Module Exports Namespace of the module
exportAllFromto a property namedason this module's Module Exports Namespace, bypassing this module's Module Imports Namespace entirely.
In the absence of a bindings property, module machinery presumes an empty
bindings array.
Modules without bindings may still have side-effects in global scope.
Dynamic import of a Module with a virtual module source induces the memoized
importHook of the Module for each binding that has an from property.
When a Module instance exists for every transitive dependency, dynamic import
advances the linkage to all new namespaces, including all links between
module [exports] namespaces and module import namespaces according
to each source's bindings.
In the absence of an execute property, dynamic import assumes an empty
execute function.
Modules without execution behavior may still have useful export bindings from other modules.
Dynamic import will then execute the working set of modules according
to the existing rules of ordering, and will call execute for each virtual
module source, providing the linked Module Imports Namespace,
a dynamic import function bound to the module instance only if the virtual
module source has a truthy needsImport property, and an import.meta object
only if the virtual module source has a needsImportMeta property.
Design questions
Module source serializability invariant
Caridy Patiño prefers an even lower level API for virtual module sources,
where instead of virtualizing evaluation, we provide a way to construct
a Module instance along with its imports and exports namespaces from their
bindings.
In this model, there would be no virtual module source, just modules.
This would protect the invariant that module sources are serializable.
Emulated JavaScript
Should virtual module sources support emulated JavaScript? That will require, in some cases, separation of initialization from execution as separate phases. Should the virtual module source protocol support separate paths for modules that do not require an initialization phase?
Shape of the internal namespace object
The execute method of a virtual module source needs access to both
the global object and internal view of bindings.
These could be addressed singly by a reification of the module environment
record, or with separate objects as written.
Future amendments to modules may eventually also add lexical names to the
module environment record that are not properties of the global object.
Limitations
Module sources compiled by the ModuleSource constructor capture enough data
that engines can transfer them in many ways.
The internal representation of a module source can be an immutable record that
a JavaScript host can trivially share throughout an agent cluster.
Communicating JavaScript agent clusters can transfer module sources as data.
In the most naive case, a module source is just a record of the original source
and its type, which is not limited to JavaScript modules, since hosts can define
additional module source types.
For example, WebAssembly.Module might be trivially
extended to serve as a module source, by adding a [[ModuleSource]] slot
referring to a concrete Module Source Record.
In this case, it is sufficient for a pair of JavaScript agent clusters
to transfer the original source and type.
In a more elaborate case, the module source does not even retain the original text, but its compiled bytecode. In this case, a pair of compatible JavaScript agent clusters might elect to send and receive byte code.
Virtualized module sources are transmissible only between JavaScript agent clusters if both the sender and receiver agree on a protocol for constructing a new virtual module source from some serial representation.
To wit, it would not be possible for a host to transmit arbitrary virtual
module sources between agents using a general purpose algorithm like
structured clone.
The protocol for communicating virtual module sources between hosts would
necessarily need to be implemented in user code.
Any general-purpose host-defined mechanism for transmitting module graphs would
necessarily fail when encountering a virtual module source.
For example, it may be possible to extend structured clone to transmit
instances of ModuleSource between any JavaScript hosts, or even
WebAssembly.Module when a host-defined extension is
present, but it would not be possible for this behavior to generalize to
virtual module sources.