CG-04-23.md

May 31, 2024 · View on GitHub

WebAssembly logo

Agenda for the April 23rd video call of WebAssembly's Community Group

Where: zoom.us
When: April 23rd, 16:00-17:00 UTC (09:00-10:00 PDT, 18:00-19:00 CEST)
Location: link on calendar invite

Registration

None required if you've attended before. Send an email to the acting WebAssembly CG chair to sign up if it's your first time. The meeting is open to CG members only.

Logistics

The meeting will be on a zoom.us video conference. Installation is required, see the calendar invite.

Agenda items

Opening, welcome and roll call
1. Opening of the meeting
2. Introduction of attendees
Find volunteers for note taking (acting chair to volunteer)
Proposals and discussions
1. Announcement: registration for June CG meeting at CMU is live: https://forms.gle/ahNN9e7Nwc8W9PtQ7
2. custom-page-sizes proposal (Nick Fitzgerald, 30 minutes)
  - update
  - discussion of exact constraints for valid page sizes
  - vote for phase 2
3. poll: add table64 instructions to memory64 proposal (Sam Clegg / Conrad Watt, 10 minutes)
Closure

Agenda items for future meetings

None

Meeting Notes

Announcement: registration for June CG meeting at CMU is live: https://forms.gle/ahNN9e7Nwc8W9PtQ7

TL: CMU, Pittsburgh, registration deadline May 24th

BT: If you’re interested in giving a talk at research day email Ben

custom-page-sizes proposal (Nick Fitzgerald, 30 minutes)

NF Presenting

Slides: custom-page-sizes update

CW: when we have the “shared” annotation, where does that go in the text format relative to the page size?

NF: Haven’t done these changes on top of the threads proposal, can add an issue to the repository

CW: yeah it shouldn’t be hard to work out, I just realized we hadn’t thought about it yet.

Toolchain integration slide: CW: when you say “linking modules here” what kind of linking do you mean?

NF: Not talking about Wasm instantiating linking - so toolchain linking and not runtime linking. Linking multiple .o files

KW: from the consumer, is the demand for finer granularity for growing memory, or from the producer is the desire for memories that are smaller than one page?

NH: It’s mostly smaller than one page, it is also smaller than two pages and larger than one page

KW: Is there a desire for being able to grow those memories?

NH: Haven’t seen a lot of interest in growing. Maybe you’re getting at an alternative design for a fixed size memory, working with the existing framework for the spec and implementations we have now, it would be easier to tweak the existing rules instead of adding

KW: I wasn’t sure. It seems form the consumer perspective that some use of memory control might be adequate

NH: what kind of use were you thinking?

KW: if you have one page where only part of it could be read or written, you could use that

BV: this doesn’t seem like it would help much with embedded systems, which is what this proposal is designed to target

PP: []

NH: []

DG: runtimes usually try to reserve the max up front if they can, to try to avoid moving. But it’s not the best fit for embedded, trying to fit based on available resources.

KW: allowing the max size in byte or small page size units seems like it might be easier than trying to change the whole page size.

PP: also what is the chance that someone who wants less than one page would want 64 bit pointers?

TT: What I’ve got from the discussions is can we shrink the RAM if we need to? That was the original motivation of it - how do we not use 64-bit RAM? Is it possible to keep it orthogonal? If you don’t have the possibility of shrinking the RAM, otherwise the discussion points here seem orthogonal

CW: i think there’s a question of, if we weren’t pursuing this at all, an adjust doing memory protection, would we want a feature where we’d declare some subset of the memory not readable/writable, would we want this kind of feature too?

KW: What about a byte granularity max?

CW: I think it kind of ends up being the same.

BT (chat): +1 for the composability argument of memory control and custom page sizes

NF: I also want to point out that as far as I know, the memory control proposal does not make things inaccessible, it just zeroes them out. It maintains read/write access to the discarded memory. That won't be usable to implement small memories.

BV: yeah that’s the discard component of memory control. I think a custom page size composes well with the memory features. I don’t see a lot of conflict there, and I also don’t see many use cases for that in the embedded case

KW(in chat): From my point of view, it does seem simpler for the consumer and toolchains if there's simply a static "byte-granularity max" on memories vs. if custom page sizes have to be plumbed through the object format and supported by every runtime (especially through memory.grow, page sizes less than the platform page size, etc.).

NH: Moving onto discussion

JK (chat) is there any use case for byte-granularity memory size control? What if the page size choice was "either exactly 64KiB or exactly 1KiB"?

CW: In favor of option 2 as the most conservative option, that can be extended in the future. If it turns out there’s an embedded runtime that really needs specific page sizes, we can vote on that when we need

FM(chat): +1 for conrad's perspective. Also, +1 for simply specifying max memory size in bytes.

DS: I Like that option as well - how would toolchains choose between different page sizes, and different users + applications would make different choices even for the same use case, and we have to deal with them. Having just a defined option would make things simpler

BT: Generality unlocks usecases we haven’t thought of it yet - it may not be easy to do in toolchains, but not all toolchains are the same. They may not have a narrow hardware in mind, there are x86 platforms with 4k pages upfront. Also engine work: implementing the general thing is more work up front but even the lightweight spec process means more work in engines.

BV: PArticular difficulty with arbitrary page sizes are lower than the platform page sizes that how do we support them? 1 is a special case where you can decide not to do virtual memory tricks, but more than that, what should engines do?

NF: FWIW, that's what I intended to do with wasmtime. We support running without virtual memory guard pages, even when we can. For most page sizes, we would just fall back to that.

BT: that’s what I expected, that the implementation burden for arbitrary page sizes: below some arbitrary size, you just don’t use bounds checks anymore

CW: That’s one of the scenarios im worried about, where the engine uses the smallest page size for doing virtual memory tricks, will lead to mid sized pages not being able to optimize, this will lead to a mudddled ecosystem

BV(chat): +1 to the concern of unclear performance

TL: I agree that option 2 makes it clear: you can be small or fast

NF: given the way the discussion is going, we should probably move forward with option (2), we don’t really seem to have consensus for option1

CW: I’m In favor of arranging of the spec so that we’re able to make minimal edits when adding custom page sizes

NF: yeah expect it won’t need much additional engineering work for runtimes either at that point. It’s more of an eco

JK: Is there any use case for bytewise control over the memory size? Don't want to check for bounds in the middle of an i32 load. If the minimum size were the size of a v128, that would be convenient.

NH: There was some discussion about that before, we can make the minimum page size the largest load or store page size. There’s no guarantee that it would stay that way of course, if we have flexible vectors or something.

JK: so make the minimum size 1KB, does that block any of the use cases?

TT: From my point of view, going down to 1 doesn’t make sense, i was thinking of 1k, 4k, 16k as a good stage, not to have an arbitrary number of log2, just have customizable settings that you can choose from that you can then make reasonable choices about performance

CW: I think if we had, say, 1k, 4k, 16k, i would still have most of the same concerns about unclear performance since we’d have systems that might make different decisions about performance at the intermediate sizes.

NF: What does it actually get us in terms of simplifying engine implementations? My understanding is that then you have to do a full bounds check if you can’t do virtual memory tricks

As soon as you can't rely on virtual memory at all, you have to do a full bounds check that includes the size of the access, so it doesn't matter in practice whether the bounds are aligned.

JK: it does matter. If you have a decently aligned memory size you can check the address as being less than the limit, whereas if it could end at any byte then you have to add the size and then check the bounds, its another check on every load.

NH: Doesn’t that rely on having guard pages though?

CW: Are you saying that you have to do a second more precise check?

JK: no, if your memory can end anywhere you have to do addition arithmetic then you have to check the end of the access as opposed to the start

TL: but not all accesses are aligned, you could still have i32 loads fall off the end.

CW: You could even have this today, where you can have an i64 load at the end of the page

JK: you’re right, for unaligned loads that problem exists already.

NF: if you have at least one guard page, you can use that and rely on that for overflow. But if you don’t you still have to do the full check whether it’s 1 byte or 1k. So in that case I’d prefer 1, since what benefit is there from the coarser granularity there.

CW: I think maybe calling it a "one-byte page size" sounds silly. I've been thinking about it like just turning on explicit bounds checks.

NH: It’s more of a how do we fit into the existing spec

DG: a question about that: it’s probably true of wasmtime would want this to fit well into the existing framework. Is that also true of other engines that are targeting embedded systems?

NF: I can't really speak for other runtimes

KW(chat): From my point of view, it does seem simpler for the consumer and toolchains if there's simply a static "byte-granularity max" on memories vs. if custom page sizes have to be plumbed through the object format and supported by every runtime (especially through memory.grow, page sizes less than the platform page size, etc.).

DG: curious if there are cases where we want the higher customizability. I was wondering if there was more that we should to in terms of outreach to find those.

KW(chat): What is the right procedural mechanism to discuss the merits of "1-byte page sizes" vs. "1-byte-granularity limits"? Another proposal? Right now?

CW: there are at least 2 asking for more discussion about whether we should think of 1 byte pages vs granularity limits. Should we still go to phase 2 with this unresolved?

TL: relative to the size of the proposal it seems like a big enough deal that maybe we should resolve it before going to phase 2. I wouldn’t want it to drag it out, maybe we could timebox it to a couple of weeks or something.

NH: FWIW, it comes down to one line of code in the spec + validator, the biggest change is making the page size anything other than the constant size it is right now. The question is important, but it’s not the

DS: We should agree on the mechanism of the proposal before the vote to phase 2 - it seems like we are making progress so I’ll agree with Thomas that we can wait for a couple of weeks

JK: one other point to raise: maybe a way to get the concerns of different engines and use cases accounted for, is spec it more flexibly: maybe a way to allow the size to be rounded up somehow. This would make it more agreeable if we could keep doing what we’re doing. And guard pages should be orthogonal, it could be an explicit mechanism rather than piggybacking on weird memory sizes that make guard pages impossible.

FM(chat): is there any use case for byte-granularity memory size control? What if the page size choice was "either exactly 64KiB or exactly 1KiB"?

NH: I have no qualms with about some kind of hinting proposal - it is a fact of the way things work. That if you have a page size, wa

Regarding rounding up, that would require introducing new instructions

KW: The question is whether OOB access would be a deterministic trap. If I can hint that I only use the first N bytes of memory and after that it might or might not be a trap, then that would allow all the desired implementation strategies.

CW: I’d be pretty uncomfortable sneaking this kind of nondeterminism into the proposal.

NF: in general we do try to minimize nondeterminsm, and then … []

KW: Ok, I withdraw this remark.

DG: we are running short on time, we should work offline and schedule more time to discuss in a meeting

NF: can we do it in a month?

poll: add table64 instructions to memory64 proposal (Sam Clegg / Conrad Watt, 10 minutes)

SC: we took a straw poll on GH, it seems worth doing a real poll in this meeting

Poll: Inclusion of table64 in the memory64 proposal

SF: 2 F: 14 N: 9 A: 0 SA: 0

CW: Ryan gave a pretty good comment in the linked issue on how we should support table64

2 questions; the first was should table.grow return i32 or i64; I think i64 to match what we do for memory. The second was JS api access of 64 bit tables - 64 bit ints or implementation limit 53 bit ints? I think we should also do whatever we do for 64 bit memory JS api