CG-06-06.md

June 15, 2023 · View on GitHub

WebAssembly logo

Agenda for the June 6th video call of WebAssembly's Community Group

  • Where: zoom.us
  • When: June 6th, 4pm-5pm UTC (June 6th, 9am-10am Pacific Time)
  • Location: link on calendar invite

Registration

None required if you've attended before. Send an email to the acting WebAssembly CG chair to sign up if it's your first time. The meeting is open to CG members only.

Logistics

The meeting will be on a zoom.us video conference. Installation is required, see the calendar invite.

Agenda items

  1. Opening, welcome and roll call
    1. Opening of the meeting
    2. Introduction of attendees
  2. Find volunteers for note taking (acting chair to volunteer)
  3. Proposals and discussions
    1. Announcement: Next hybrid CG meeting, tentative dates in Oct 2023 [Deepti Gandluri, 5 mins]
    2. Announcement: Github roles and permissions cleanup (see #1215 and #1233) [Derek Schuff, 5 mins]
    3. Compile time imports and string builtin API [Ryan Hunt, 25 mins]
    4. GC & strings: discussion [25 mins]
  4. Closure

Agenda items for future meetings

None

Schedule constraints

None

Meeting Notes

WebAssembly logo

Agenda for the June 6th video call of WebAssembly's Community Group

  • Where: zoom.us
  • When: June 6th, 4pm-5pm UTC (June 6th, 9am-10am Pacific Time)
  • Location: link on calendar invite

Registration

None required if you've attended before. Send an email to the acting WebAssembly CG chair to sign up if it's your first time. The meeting is open to CG members only.

Logistics

The meeting will be on a zoom.us video conference. Installation is required, see the calendar invite.

Agenda items

  1. Opening, welcome and roll call
    1. Opening of the meeting
    2. Introduction of attendees
  2. Find volunteers for note taking (acting chair to volunteer)
  3. Proposals and discussions
    1. Announcement: Next hybrid CG meeting, tentative dates in Oct 2023 [Deepti Gandluri, 5 mins]
    2. Announcement: Github roles and permissions cleanup (see #1215 and #1233) [Derek Schuff, 5 mins]
    3. Compile time imports and string builtin API [Ryan Hunt, 25 mins]
    4. GC & strings: discussion [25 mins]
  4. Closure

Agenda items for future meetings

None

Schedule constraints

None

Meeting Notes

Attendees

  • Andy Wingo
  • Yuri Iozzelli
  • Derek Schuff
  • Justin Michaud
  • Jeff Charles
  • Alon Zakai
  • Conrad Watt
  • Slava Kuzmich
  • Adam Klein
  • Ryan Hunt
  • Keith Winstein
  • Shravan Narayan
  • Paolo Severini
  • Bruce He
  • Brendan Dahl
  • Luke Wagner
  • Sam Clegg
  • Dan Gohman
  • Ioanna Dimitriou
  • Tal Garfinkel
  • Dan Phillips
  • Petr Penzin
  • Mattias Liedtke
  • Daniel Lehmann
  • Jakob Kummerow
  • Heejin Ahn
  • Calvin Prewitt
  • Emanuel Ziegler
  • Thomas Lively
  • Benjamin Titzer
  • Daniel Hillerström
  • Mingqiu Sun
  • Andreas Rossberg
  • Ilya Rezvov
  • Nick Ruff
  • Francis McCabe
  • Zalim Bashorov
  • Nick Fitzgerald
  • Chris Woods
  • Andrew Brown
  • Sean Jensen-Grey
  • Dan Philips
  • Ashley Nelson
  • Kevin Moore
  • Johnnie Birch
  • Peter Huene

Discussions

Announcement: Next hybrid CG meeting, dates in Oct 2023 [Deepti Gandluri, 5 mins]

DG: Next hybrid meeting will be held on Oct 11th, 12th in Munich. Registration form, agenda, and other details will be shared as we get closer to the date. Please email the chairs if you have any questions, or need an invitation letter for travel requirements.

Announcement: Github roles and permissions cleanup (see issue #1215 and issue #1233) [Derek Schuff, 5 mins]

DS: We will be cleaning up some of the permissions for the WebAssembly github org, the details are in the linked issues. Right now, the permissions are inconsistent and some folks have access are no longer working on Wasm for several years. This is an administrative for making permissions and access cleaner, and more secure. No changes to how proposals are advanced, or should not affect what we do in the CG on a regular basi.

Compile time imports and string builtin API [Ryan Hunt, 25 mins]

RH presenting slides

CW: Are the encodings going to close over the JS side, or would Wasm know about them?

RH: Getting to it in a couple of slides

CW: I would expect there to be.. From a Wasm point of view, is memory a first class argument that you have to pass in?

RH: yes, concretely how this could be done is you’d have JS code that instantiates your module (like today), that would create the actual memory object, it could be imported into your module and have this memory type, and when you invoke it would be a first-class value.

CW: Does this require first class imports?

RH: No, just type imports

BT: You skipped over type imports, how does it work exactly?

RH: with type imports, in that proposal there’s an idea where you have an extension to import types from the JS API. I just called it builtin type but there’s some design work to do. You create a module with a builtin type and in the API youd have string.type. And it would only be values that are JS string primitives, so everything would be typed by importing that. In this example this iw aht you could do if you haed ad type for webassembly.memory, not a first-class memory.

CW: How do you call that from within Wasm? RH: I’ve seen this before in JS glue code, you can export your memory. Maybe in this case it requires your memory to be instantiated before th emodule, then you’d use global.get

CW: it seems this would require you to instantiate the memory first, and then import that and close over the memory

RH: you wouldn’t have to close over it but I h tink you do have to export it

BT: The WASI was is that you export the memory, and then WASI knows how to bind it, emscripten does something similar, where it binds it later, do you can’t do this in a way that you define it and then bind it later

CW: making that fast in the engine is going to take a lot of specila magic, right? To generate the code that knows where to find the memory you’d need to use lots of indirections, or do special magic at compilation time, right?

TL: If you declare your memory in the module and then export it, the JS code then writes a mutable global, and then it writes it back through an externref, you don’t need a type check on that because you use it as an imported typeimport, then you know that the engine knows its a first class reference to a memory without memoryref

CW: every access to memory through this f unction has to go th rough that externref

LW: Presumably there is a first stage where it all uses externref (and there’s extra runtime type-checking overhead)

TL: but you don’t even need ot tpyecheck that because you have this typed import, so you just have a reference that the engine knows is a first-class reference to memory, even though we haven’t added memoryref to wasm.

CW: im just wondering, can an engine make this as fast a a proper instruction that can get directly to the webassembly memory?

KW: : It would be nice if these builtins were applicable to any memory in a multi-memory module, and ideally without the instantiating code needing to know a priori how many memories are declared in the module.

AB: There have been discussions about switching WASI from memory export to memory import: https://github.com/WebAssembly/WASI/issues/502

RH: There would be loading a data pointer and a bounds check, so there is some indirection, but there are other ways of doing this, where a UTF8 string doesn’t close on the memory

BT: type imports are a thing we’ve kicked around for 5 years but haven’t gotten a design that has moved forward. Is there a dynamically typed version of this API that would have dynamic type checks at the boundaries?

RH: Open question about what happens if you have this without type imports, and everything is externref, is this forwards compatible, and would it have sufficient performance? Not a clear answer to that

CW: at some point the question is, how much extra do you get from even having compile time imports as opposed to just regular imports? RH: Depends on the nature of the type check, if ou have a cheap type check, a quick type check and an OOL fall back code, it might not be too much overhead, it’s not a certainty, we would have to pin something down

TL: with the GC MVP being almost done, I think we could actuallly move pretty quickly on a type imports proposal as a follow up. But as we finish GC we could actually give it attention and move it pretty quickly.

BT: Do you mean that just CPUs are freed up?

TL: yeah I just mean brainpower to think about it.

AR: there’s certainly a dependence on the funcref proposal, which we’ve tied to GC, since it defines what typed refs are. So it would depend on that.

KW (chat): (It would also be super-nice if the builtins included a community-maintained Wasm implementation that could be provided as an import, so that the non-Web engines can just compile that module and link it, rather than every non-Web engine having to maintain their own implementation in host code.)

RH: Going to get through the last sets of slides: thoughts on stringref

BT: There’s a lot of complexity in string implementations, and when you consider that against what’s already in Wasm, and then adding additional complexity of hiding the representations in the engine, its a complexity that keeps mushrooming, this is a good step forward to keep it out of the core engine

NF (chat): one could imagine specializing for Reflect.get builtins and all that similar to the string operations, which is neat

AW: I have a couple of comments, one is the necessity of builtins for shareability, it would be nice if we could get the functionality without adding a new thing, then we can do this by passing by reference. Any functionality that has the right shape, there are a lot of things that aren’t in a good shape. You can still use the func.bind, even without builtins.

Another: a thought experiment: if you remove string.concat and equality from stringref, you remove a source of nondeterminism, but these will certainly be added back in as builtins and be recognized by entines, so languages will have expectations about the complexity. If we had a string type that worked on web and non-JS, then this ability to specialize performance of strings wrt what web engines do is a risk anyway, standardized or not. Just based on what source languages start to encode in their engines, just a few comments

RH: on the first part, could we reuse the existing definitions of JS global space: i’m open to finding a way, i agree adding a new thing has risk and would be good to avoid that. Details are unclear. The second thing, a quick clarification question: you’re saying there would still be a stringref type but you’d import the functions that run on it?

AW: That would be the thought experiment, if the objection to stringref is the unclear cost of concat etc. you would still incur the cost of this anyway,

RH: yeah, I think if you wend down that road of a stringref type, with the mutation ops are builtins you end up in the same long-term result. Probably worse, since they aren’t instructions but imports. My hope with going to import a string type, if you compile for the web and want to reuse the JS string type, that you know what cost you need and will pay it.

But when you’re compiling for something else, you can use whatever is the cost effective thing on that platform so you don’t incur the cost of it if you don’t need it.

AK: wanted to respond to the first point on the last slide about motivation for stringref. There are really 2 motivations. My presentation focused on the urgency of sharing strings across the boundary. As ben said there is a huge amount of code in engines to get good string performance. If we don’t provide that in the platform, each toolchain will have to do that in userspace. There are conceptual advantages to that. E.g. kotlin/dart/java have all expressed interest not just for interop but internally.

The Kotlin folks have been thinking about how their engines would run on non-web engines, stringref is a way to deal with that complexity, there are a lot of things you might want to do where the language uses UTF8 as its encoding. There are 2 big performance benefits. One is interchange and the other is simplicity, where they can lean on the engine and still get good performance.

RH: That’s an interesting experiment, what is the peak performance you cna get with that compared to Wasm GC? It will be performant but at a big cost of complexity, but it won’t be universal across languages, stringrefs are the lowest common denominator. Maybe this is where you get into the philosophy, should we have big common runtimes that everyone tries to use, but I think it would be better not do.

AR: These two different use cases are also in conflict with each other, one discussion was should string ref allow eqref, or be a subtype of it, many languages would want that, but that would run into problems in JS where you don’t want to inherently add it. These are all in conflict, so its not clear what to do. The proposal has to be clear about the scope, JS API seems to be the right choice in taking this forward

BT: I would say that in some sense wasm has been an exercise in exposing those things from below in the machine to code above in making things useful and standardized, and this has some flavor in taking a thing in the web and moving it down. This is maybe the pointo where we should explore what are the mechanisms where we talk about external things. This is an opportunity to get it right. There will be other kinds of embenddeings, e.g. databases, where want to expose things for databases, and we’ll want to get that fast, so type imports could be a way to get there.

AK: I’m interested in stepping aside JS sharing, interested in getting thoughts about Wasm engines provide a stringref functionality. Would be interested in toolchain folks - not as an interchange format, but providing a functionality. GC is an example of us providing something that’s not at the machine level that people need. Comparing strings to databases, there’s a long gap between how fundamental those two are. Toolchainss are a bit part of Wasm development, and is an important part of the discussion.

RH: [portability and polyfills slide] one thing, I have one extra slide, not sure if its relevant. It comes up, Is it desirable to.. If a toolchain wants to use string ref on the web. It can be sort of portable across web and non-web environments because its in the platform. A way we can make it better, IMO, I think the web environment is unique and difficult enough to target, its seems inevitable that people will want to have different code and bindings to target the rich environment. There’s been concerns about toolchains haveing web/non-web environments, I think its inevitable. It’s possible that to ease the transition for some toolchains, what if we create a polyfill module built on GC that’s compatible with the builtin API. if you had a tool that could inline this into something that imports the builtin API it would be very good, and allow compatibility between the different environments.

SJG (chat): I would hate to see web and non-web Wasm ecosystems diverge.

BT(chat):I honestly think that Wasm will give rise to endless ecosystems, and that is a good thing (TM)

KW: I like the idea that there would be standardized set of imports that engine can agree on. How easy would it be to maintain a standardized module that implements these functions,? Rather than all the engines having to maintain a set of functions etc. then non-web engines could just link to it instead of working with their complexity

RH: I think I agree with the idea that it would be desirable if someone would create a polyfill that would emulate these web builtin primitives. It's an open question if this is something non-web environments would want to have themselves. IMO it would be better to have this in toolchains that they would bake in rather than having all the environments/engines have.

KW: The idea is that it would be a valid implementation, if the browser wants to provide an implementation backed by its native implementation thats fine, for an engine that doesn’t have its native implementation then it would just use the module, and it would be normal linking. This does create constraints on the Wasm API, having the constraints of what’s implemented, it would be good to have it upfront.

RH: going down this road i’d be interested in seeing, is there a benefit for non web engines to have and add to the user code, vs users bringing it themselves. If it’s part of the embedding layer that runtimes provide, I don’t think that’s good, I think it’s better for tools to bring it.

KW: I agree, the point would be to free the non-web engines from doing anything

BT(chat): A benefit of the “wasm polyfill” module is that its performance characteristics are transparent.

SJG (chat): I agree, but they should share a common chirality so the DNA is compatible. This is more of a feeling than a concrete objection. I like the idea of Wasm polyfills as a mechanism for portable behavior

NF (chat): and it can have breaking changes managed by semver or whatever, updates that are independent from the engine itself, etc because it is all under control of the toolchain

AW(chat): prevents you from using built-in intl though

BT(chat): couldn’t that intl also be specified too?

AK: Agree that at the interface layer we see web/non-web do different things. Strings show up in the source language without any compute interface. It’s a slippery slope the other way. In the compiled to JS space, languages had to deal with not having 64-bit integers, I feel like imposing sort of thing on languages that will be targeting wasm in the future, for a language that does want to strings. My worry is that people will depend on that for more than interchange, it would be nice if there was a polyfill of that, if the polyfill has to be performant then that creates a high bar. Wasm in isolation doesn’t provide everything folks need, so don’t understand the argument for keeping Wasm pure, strings are more fundamental than that. There’s going to be some difference in web/non-web. Strings doesn’t seem like the right place to make the cut

AR: I think underlying this argument is the assumption that an interface like that is equally useful for JS usage as it is for internal runtimes. I think that is fundamentally not true. If languages didn’t care about web interop, this interface would be suboptimal. As ryan pointed out the web brings a lot of problems that you only need to solve if you are there. For internal use you’d do something different. A lot of languages would need something different anyway, since they have different constraints. So there are different assumptions about how broad an audience this is useful to.

Chris Woods: yes, I think andreas had a good point. From a non-web point of view there are many use cases where we don’t use strings. In the web interfaces, there are But in industrial code it’s usually numeric, and I”m concerned about the growing size of the runtime to support these new features.

CW: this is also true of GC. How do you feel about that?

CW2: I have concerns about that too, but we should take that offline due to time constraints.

AW(chat): the question is, does the app have to ship intl? if strings are a pure toolchain consideration then yes, right? certainly on the web that sounds suboptimal :)

BT(chat): A wasm polyfill is also a good executable specification :-)

KW(chat): Naively I think the Wasm polyfill could be a lot simpler than an in-browser engine, because it wouldn't be burdened with UTF-16-related baggage and could just use UTF-8 internally.

AK(chat): But if source languages depend on UTF-16 semantics, as Java and Kotlin do, they'll bring that requirement along with them

KW(chat): It wasn't my understanding that anything in today's proposed API would guarantee UTF-16 semantics on the API boundary... maybe I missed that.

DG: next steps: probably the next time to talk about this live would be a month from now. Probably we’ll request a quick 5 minutes from Adam and Ryan to start that out. In the meantime, more async and offline discussion. Maybe we need a more centralized way to track this.

DG: one other thing, we’ve been discussing how we post notes. There’s a link to the doc in the chat, if you want to correct the representation of what’s been said you can edit the doc and I”ll post the note at the end of the week.

Closure