September 23, 2021 Incubator Call Notes (WebAssembly/JS interop)
September 30, 2021 ยท View on GitHub
Attendees:
- Dan Ehrenberg (DE)
- Shu-yu Guo (SYG)
- Asumu Takikawa (ATA)
- Sergey Rubanov (SRV)
- J.S. Choi (JSC)
- Richard Gibson (RGN)
- Lars Hansen (LHN)
- Ross Tate (??)
- Keith Miller (KM)
- Shane Carr (SFC)
- Mathieu Hofman (MAH)
- Jakob Kummerow (JKW)
- Igor Sheludko (??)
- Svyatoslav Kuzmich (??)
- Zalim Bashorov (??)
- Philip Chimento (PFC)
- Dan Finlay (DJF)
- Igor Laevsky (??)
- Bradley Farias (BFS)
- Guy Bedford (GB)
- Ioanna Dimitriou (IOA)
WebAssembly/JS API
DE: (Explains high-level overview of what WebAssembly is.)
DE: In addition to the JS API there's also the Web API that provides streaming versions of compilation APIs.
DE: (Presents topics laid out in the agenda.) There's a number of additions to the Wasm/JS API we've already made. One I worked on was BigInt <> i64 conversion. The MVP threw an exception when using i64. Now this is cleanly supported: when we have an i64 it casts to and from BigInt, just like writing to a BigInt64Array.
Mutable globals. Originally globals were immutable. Now when you have a Wasm module that exports a global it gets exported as a global object. I think it's supported across browsers now. As new Wasm features get added it requires changes in the JS API. Similar for multi-value. For each of these things it requires some thought for what the JS API ought to be. For multi-values we used iterators, which may be overkill but is the most JS-y thing to do. There are many small decision points where we decide to do the more or less JS-y thing, so it helps to have JS people in the room.
One important thing is reference types. There're two types: externref and func ref, used to refer to arbitrary values from the host. Extern ref can refer to any JS value, and func ref can refer to functions. Initially functions could only be held in tables and referenced as scalars. Extern refs are important when we think about GCed language in Wasm, the GC needs to be able to trace those. There're fundamentally two approaches to GC in Wasm: where the GC is implemented within Wasm and traces the linear memory, but if we want to support cycles between the host GCed language and Wasm the host GC needs to be able to trace, and that externref enables though. Externrefs also important in casts.
Finally there's SIMD. SIMD JS was cancelled since it was deemed too hard to make it work in unrestricted JS, and if it's just gonna be for asm.js we might as just make it in a Wasm feature. SIMD Wasm uses the same kind of instruction set we planned for SIMD JS. But it's good we followed through that we moved it out of TC39 and into Wasm.
DE: Any questions about the standardized proposals?
SFC: I want to share our experience with Wasm. We have an intern for IC4X project where we compile to wasm and use it from within JS. Two main pain points we encountered: one of them is the FinalizationRegistry where we're able to register Rust-space objects into the FR, but the issue is there's no way for us to tell the FR how large the object is or give it any sort of ranking for when it should be collected, etc. Other issue, which we've found a workaround, is structured values especially return values as a pain to use. If Wasm returns a structured value it requires writing wrapper code to allocate space on the Wasm side, return that, and having to pull each value out of the wrapper. I know there're wrapper libraries to help with that but it'd be nice if the core Wasm API was able to take structured value on the Wasm side and put it into a record or an object literal.
Some of the features you've brought up like multi-return values sounds like it might help. But wanted to share the experience.
BFS: Last I heard for the whole string debacle was that only UTF8 would be supported.
DE: Let's get to strings later.
About FinalizationRegistry, this question of being able to configure the weight of the things when registered came up when we were talking about WeakRef. Issue is any API we provide here has to be resilient against a bad actor. One possible way we can keep people honest is to make sure these weights were backed by a TA, or if they were normalized somehow. If there's no clear interpretation I don't want to risk browsers just ignore it.
KM: I'm interested in what the issue you ran into was and how weights would help.
SFC: Issue is when we have a Wasm object backed by a JS object, what can happen is we're only able to flush the Wasm heap memory when the FR says we can do it. FR tells us maybe every few seconds or so that there're new objects we can GC. By that time the Wasm heap can be full and we'd need to extend the Wasm heap even if we really didn't have to do that if there are in fact objects that are already GCable.
KM: So your issue is that GC isn't run frequently enough.
SFC: Another way to solve this is maybe to add a hook to run the GC first before we extend the Wasm heap.
DE: I don't think we're gonna add that API because Java has it and JVMs just ignore it. Not sure what we can provide that doesn't have the same kind of moral hazard.
LHN: I think this is what you pay for by having GC in the first place. You use more memory than you need because GC is balancing collection time with memory.
KM: The usual way is to add a weight for things like TA memory.
SYG: Yeah I don't see a way around the moral hazard DE talked about. Granular APIs like V8 API provides or in the limit, a "GC now" button, both bad ideas.
DE: For wrapping structured data, if you're copying memory I'm not sure what the advantage is to building it into the API.
KM: Also it's language dependent, we don't know what your memory looks like on the Wasm side. For GCed objects I think the intention is to use the typed object proposal Shu was presenting in TC39.
SFC: The surprising thing was you don't need to do conversion for f64 but for things that are wider you'd need to allocate space on the Wasm side.
KM: But each language is different right, you'd need to do this at the Emscripten level. What Shane seems to want is automatic. It's hard to do generally for all languages, so you'd need to do it at the tooling level that has knowledge of what the source language is. Doesn't seem like a Wasm feature itself. Seems like we shouldn't deep dive into that right now.
DE: Continuing with things that were added. One cool thing is exception handling, letting you throw and catch exceptions in Wasm. For Igalia we've been working on EH in SM alongside Chrome implementing it. Some programs we're trying to optimize, doing EH proposal without this proposal requires a ton of back-and-forth with JS and ends up being very expensive.
ESM integration is a funny one. This is something I worked on a while ago and even though there's a specification we haven't gotten browser interest to push it over the line. There are imperative ways right now, but if we could do it declaratively we could treat it as a ESMs. There's an partial impl in JSC that my coworker Asumu is working on.
Typed Function references give actual function types that serve as first class values so they could be called without extra type checks at the callsite.
Suspender API lets Wasm use async APIs but letting Promises suspend and resume Wasm execution so Wasm code can interact with Promise APIs.
SYG: Are the ones in your "Advanced Proposals" not "Early Stage" because their designs are settled?
DE: Kind of yeah, don't know if people agree.
LHN: Some are more controversial than others.
DE: Module linking. Idea is maybe we'll have more than one module in the same Wasm file. Instead of mapping against the same global module map, module linking is designed to enable more virtualization. I think it'd take a lot of work to make these look like ESMs. There are lot of TC39 people interested in ESM having some way to virtualize the subgraph. Module linking lets you do this in Wasm declaratively. With the JS Compartments proposal or the old module linking proposal you do it imperatively. The idea of module linking is you do it declaratively so you don't need to solve it in a host-specific way in a host API. I think it's worth thinking through what this would look like as ESMs and do we want to add similar capability to ESMs. I think it's fine as a starter to have Wasm be more capable.
WasmGC. Design is in flux. Core proposal is to allocate GCed values such as arrays or structs that have typed fields. It also includes the core capabilities of subtyping and casting towards the general goal of enabling something like Java or OCaml to be compiled to Wasm. Other big producers here like Kotlin, Dart, Scala, something Java-like. Go and Python also have important Wasm impls but I think their model for how objects are represented are more far out for WasmGC. Ultimately these arrays and structs have to be represented to JS. This gets a bit tricky. Asumu worked on potential APIs for that. We want JS to act on those same values. We could also have a JS struct API. But JS structs are different, it's not expected that JS structs could be exported to Wasm, but the Wasm->JS direction should work.
There's a big design space here. We could have an imperative API, or a declarative API like a custom section. Ultimately I think it's important to get JS developer input here. If the resulting API is too unfriendly people will end up having to make wrapper classes and cause performance issues at the Wasm/JS boundary.
MAH: How is WasmGC proposal linked in any way to actual GC?
KM: It's not about collection like the app itself being able to GC, it ties into the host's GC.
MAH: How does it tie?
KM: When you make a new object it'll allocate an object in the VM's GC.
DE: It lets Wasm work with GCed values. Previously you could work with linear memory values.
MAH: How is it any different than having a representative on the JS side and having weakrefs and FR informing the Wasm side the object has been collected?
DE: Good question. One cost of that is you'd have to have wrappers. A more fundamental cost is such an approach doesn't work with cycles. It's common to have cycles.
MAH: I don't see how it solves cycles.
DE: There's no destruction, it's just GCed.
LHN: You're making the assumption linear memory is still in use case, but it's not the case. It's all GCed storage.
MAH: What I was missing was there's no linear storage.
DE: Built-in string type. This is kind of a controversial proposal. What if we had a string that didn't have to be copied and could be passed back and forth between JS and Wasm. Could be cool, copying things is bad. Could be pretty hard, would need a new opaque type to handle this.
Strings also came up in context of interface types. These days I don't think browsers are working on interface types. Mostly implemented by Wasm in server architectures. One of the decisions recently was that Strings would be required to be valid and have no unmatched surrogate pairs. The types here are only meant to be used at the boundary between big components, whereas in the GC proposal is meant to be used at the granular case.
I'm a little worried about the future of interface types. I was initially excited about its being the basis for reduced overhead between Wasm/JS. At this point I'm not sure what the story is for interface types' dealing with Wasm/JS.
Ross Tate: I've been talking with some interface types people about developing a lower level version of interface types at the Wasm level. Basically it'd be primitives to enable the fast interop that interface types can build on top of. I've written up some text and that's about it.
DE: Do you think that lower level proposal works with JS?
Ross Tate: Not sure if w/ JS the language but with library interop use cases, like doing string transfers or AB transfers more quickly. Looking at what we can and cannot do with it.
DE: My intuition so far is that if you want to use ABs between Wasm and JS, that kind of works. If you want strings, I think we should have a string type so you can do it without copying. A lot of the interface types discussion has been abstract, I hope we can make it more concrete.
There's also WASI for talking with the underlying OS. AFAICT it's mostly for Wasm on the server so I'm not sure how it's supposed to work on the web or with JS.
This concludes my presentation. How should we all work together to make these useful?
SYG: Where there is a capability gap between JS and Wasm, I think we should periodically take stock to see if the extra Wasm capabilities make sense to be reflected into JS. Otherwise my hunch is people are going to just use the Wasm capabilities anyway except at the boundary it's worse: leaks, performance issues, whatever.
DE: I strongly agree with that and the other direction too. JS capabilities not reflected into Wasm could affect Wasm adoption.
LHN: You have to worry about languages that compile to Wasm that may not need to use those capabilities.
MAH: I'm not super familiar with the Wasm specs. When I tried to look at them before, I found them hard to digest. Finding info on what's up to date and where things are headed without parsing complex documents was difficult. For someone who's new to Wasm coming from the JS side, having documents explaining general architecture and explainer for dummies or JS people in general would be helpful.
DE: The MDN documentation is very good and I recommend that. For proposals, it's hard to even keep the proposal READMEs up to date so I'm not sure we can ask for secondary documents too.
PFC: That's exactly my experience. Formal notation is good but not good to get a sense of what something does.
DE: You know they say our spec is also bad. If you're confused by symbols, look for prose above the symbols that try to reword it in words.
Ross Tate: I think documentation could be improved if it's causing people problems.
IOA: In the spec there's a directory called proposals, and in it is an explainer for every proposal that's added. (https://github.com/WebAssembly/spec/tree/master/proposals)
SFC: To address Daniel's question. I'd be really supportive if someone from the Wasm group could give an update at a quarterly TC39 meeting, just as a way to liaise. Perhaps TC39 could do the same.
SYG: My take is it'd need to be proposal focused, since most of the technical discussion is probably not interested in either direction.
SFC: I liked Dan's presentation today giving an update on these proposals. I really value these updates and would like to have them more frequently.
Ross Tate: Might be worth it to have a quarterly meeting.
SYG: I'd like to close with a question: is the direction Wasm CG want to go to be more host-agnostic or to special case JS? I contend that JS is special and deserves a special place.
Ross Tate: Community is still figuring out how to navigate this space.
DE: In reality nobody really works on the Wasm/JS API in the Wasm CG.
JKW: To point out the obvious we have a Wasm JS API, so JS already has a special place.