CG-02-28.md

March 1, 2023 · View on GitHub

WebAssembly logo

Agenda for the February 28th video call of WebAssembly's Community Group

Where: zoom.us
When: February 28th, 5pm-6pm UTC (9am-10am Pacific Time)
Location: link on calendar invite

Registration

None required if you've attended before. Send an email to the acting WebAssembly CG chair to sign up if it's your first time. The meeting is open to CG members only.

Logistics

The meeting will be on a zoom.us video conference. Installation is required, see the calendar invite.

Agenda items

Opening, welcome and roll call
1. Opening of the meeting
2. Introduction of attendees
Find volunteers for note taking (acting chair to volunteer)
Proposals and discussions
1. Type system properties (see https://github.com/WebAssembly/function-references/pull/91) [Andreas Rossberg, 25 min]
Closure

Agenda items for future meetings

None

Meeting Notes

Attendees

Conrad Watt
Deepti Gandluri
Derek Schuff
Yuri Iozzelli
Jeff Charles
Jackson Hong
Ben Titzer
Yury Delendik
Ben Green
Thomas LIvely
Alex Crichton
Paolo Severini
Ilya Rezvov
Adam Klein
Alon Zakai
Daniel Phillips
Saul Cabrera
Ryan Hunt
David Piepgrass
Francis McCabe
Daniel Hillerström
Chris Woods
Andreas Rossberg
Jakob Kummerow
Sean Jensen-Grey
Marat Dukhan
Shoab Kamil
Zalim Bashorov
Rich Winterton
Heejin Ahn
Emanuel Ziegler
Brendan Dahl
Andrew Brown
Chris Fallin
Rich Winterton
Shravan Narayan
Sergey Rubanov
Asumu Takikawa
Petr Penzin
Jlbirch
Nabeel Al-Shamma
Kevin Moore

Discussions

Type system properties (see https://github.com/WebAssembly/function-references/pull/91 )

AR Presenting slides

(Principal types for wasm examples slide):

BT: Would it make sense to quantify type level parameters?

AR: yeah this is a meta-level thing. I’ll point to where I wrote it more precisely, it’s in the appendix of the fucref spec draft, more high level here. But you enrich the type grammar with these variables and quantify on the meta level rather than in every instruction.

(forward principle types slide): FM: The principal type property, there’s a related one which is what people were debating about, which is whether the type of an instruction (sequence) was inferrable.

AR: I think that’s the same one; that’s basically what principal types mean, for every unit of syntactic composition, you can infer a type for that unit without relying on the context. In wasm we don’t have expressions, just instructions that can be composed more liberally. This forward thing corresponds more to what you have with expressions, where the things on the left are like the children. Is there something you’d describe as inferrable other than this?

FM: I need to take it offline, need to think more

AR: this is the property used in meta theory about type inference, that every expression has a principal type and you can infer them.

FM: I thought every type was a supertype of itself?

AR: Reflexivity is included here

FM: so every type has a supertype

AR: yes but not necessarily a common supertype between 2 types, the interesting thing is when the two types are not the same

BT: on unreachable typing: this is one of those cases where type variables could be introduced?

AR: This has nothing to do with type variables, it only says there is some type. If you care about what are all the possible types of t*.. but this is independent of that. If you want to represent the principal split type, I haven’t thought about that yet, but you may need variables to represent them all.

PP: wrt typed references: we’re not talking only about really complex types like structs?

AR: The reference types, they could be anything. In the func ref proposal they can denote funcs, in the GC proposals they could be structs, arrays, etc, in the typed continuations, they could denote continuations etc., so more general

PP: so this applies to those proposals as well

FM: are you saying that the principal type property moves from desirable to essential?

AR: We should document it, that is the outcome we had, the desirable properties are the ones we have, we could drop that, but at least we have to think carefully of why you would need that

TL: I think this is a good direction to go in, good to have formal documented properties that are consistent with the discussions we’ve had, so it will be obvious which future instructions need annotations, for example. We also decided that this is the conservative approach, and can be revisited but this makes it explicit that we have this property now, written down, and if we wanted to reduce the number of type annotations, this is what we would have to give up. And it’s also great that we have the principal forward types property written down, since that’s the essential part.

AR: for each one in the desirable category, we know what would break them. For principal types, what would break it is omitting annotations. For GLBs it would e.g. be multiple supertypes, you would need to go full intersection types to fix that. Decomposition you would lose if you weaken the type system of the unreachable code.

TL: as far as use of GLBs in the tools, you’re right that they’re generally less important but with contravariant function parameters, you want GLBs in some form.

AR: I agree that you’re bound to run into a wall with using LUBs there if you don’t have the GLB property

TL: Is there any way to weaken the current GLB property without LUBs? Or do they imply each other?

AR: technically they currently don’t imply each other, since we have explicitly declared subtypes. When you have contravariance it is in a declared relation. The subtype relation only depends on what you’ve declared. But you might want to declare it, and certain subtypes you wouldn't be able to declare anymore in the presence of contravariance if you don’t have GLBs

BT: I don’t understand what contravariance has to do with that, isn’t oit checking is the type variance makes sense?

AR: no if you need to synthesize function types, and you need to synthesize the LUB of 2 function types, you need to synthesize the GLB of their parameter types

BT: I see

TL: Back when I implemented LUBs, I thought it would be complicated because of the contravariance, but it turned out to be simple because of supertype hierarchy so that makes sense

BT: so other than computational complexity, do any of these properties break down when we have multiple declared supertypes?

AR: The GLBs, unless you introduce them to

TL: the LUBs too, right? How would you maintain LUBs if you have multiple declared supertypes?

AR: I think you’re right, they would break as well.

TL: I’m glad we have these properties written down so we don’t break them by accident

AR: the brute force way to preserve them is union and intersection types so you can write the GLB or LUB of anything, but then you get into expensive and maybe even undecidable type system. It's unclear whether there’s a sweet spot where you get all the advantages without breaking anything

TL: For composition/decomposition you didn’t go into much detail, does the formalism in the spec include when the right hand sequence consumes values before the LH sequence?

AR: yeah that’s totally fine, it applies to any sequence with any input/output type.

TL: in that case the type in the middle doesn’t exactly match, the output of the LHS is a suffix if the input from the RHS?

AR: If the whole sequence has an input/output type and that consumes something that if the right consumes something that the left doesn’t touch, the left type just passes it through and ignores it. There are these frame-like subtyping rules where t1* -> t2* can always be weakened to (t* t1*) -> (t* t2*) for any t* on both sides. It’s basically a supertype of the more precise one, it loses the information that t* isn't used

TL: how does this interact/what is the status of the proposal to simplify dead code validation?

AR: That proposal would break decomposition, as for the status Conrad?

CW: the status is essentially when some critical mass of people decide that they’re fed up with thinking about the current system, we can push it forward, I think i’m not going to speculatively push it

TL: I think there isn’t a critical mass of people that think about typing unreachable code

CW: a lot of implementations that allow things that are not technically spec-conformant, e.g. webkit and Binaryen.

BT: Do you have those examples? Would be goot to add tho the testsuite

AR: We already have quite a few tests that check this, so do implementations disable them?

MD:: I’m not sure this applies here but one case is in benchmarks where the sequence serves as a sink for the value and needs to accept any value

BT: I think for that you should import a function that imports that type, and then it won’t get dead code eliminated

AR: If you don’t feed the value into that function it could be dropped or optimized away, so it has to get an input?

MD: The producer of the value can be dropped

AR: since this is only relevant for types that are not bottom, it’s independent, because why would you use unreachable code in benchmarks? So it’s not really related to unreachable code. As long as we keep the property that any type of a value can be a parameter of a function it should be ok. Do you have an example where this breaks?

MD: IT triggers internal compiler errors at times

AR: Interesting, would like to see an example

BT: I think all the engines pass the test suite as it is, but I don’t have 100% confidence that it’s exactly what’s intended by the spec. So it would be good to improve that so we can align on what’s driven by the tests.

AR: The negative unreached-invalid test checks for unreachable code, not sure about the exact tests in there, but there are tests there that test them, and I would be surprised if engines pass it without implementing

TL: that would be great, if it does cause production engines to start failing the test suite then we could get more eyes on it, then we could talk more concretely

CW: If they potentially already have bandwidths

AR: We also want to make sure the proposal introduces no new problems. I doubt it atually makes implementations simpler or less error-prone

CR: I agree, the point of this proposal was not to be saving lines of code in implementation or making it faster, just to make it simpler to think about.

DG: maybe we should present about what the state of this is?

CW: would be happy to do that when we have a free slot.

TL: I could also use a refresher on what the current rules are.

CW: Happy to lump that in, you can’t understand the motivation without all the context

TL: yeah I think I understand the new proposal, but not the status quo

AR: The status quo is that there is no special rule in the types system, the implementations needs a flag but the type system doesn’t care. I would say that the new version makes it more complicated

CW: It conceptually simplifies people trying to simplify linking their type systems

BT: I would also say that we should have a status quo bias, unless there’s a clear benefit in one direction we shouldn’t make a change. All implementations have special cases, so it seem unlikely to make a large difference to implementations.

CW: Jakob has a comment in chat

JK (chat): As far as implementations are concerned, we spend as much time getting unreachable code validation right as we spend on reachable code, and we are unhappy with this distribution. So to simplify implementations, it'd be great to actually drop all differences between reachable and unreachable validation rules.

CW: we do have to make sure existing code keeps validating, so it’s best effort on that

AR: My concern is that you have far more places where you have to make a case distinction, and easy to make mistakes, whereas with the current status there are places where we have to do something but probably fewer because you don’t have two modes

CW: My impression is that a lot of implementations do it now by having a conditional flag based on whether the current code is reachable. My hope was to make that unnecessary. They may still have something like that but, hopefully it would still be simpler.

AR: Basically you would still have that flag, but you would use it differently.

BT: Little bit of complexity left over, because we introduced type annotations, implementations may have a little bit of cruft around whether they represent the spec as is in addition to the proposals in flight

AR: one of the problems we had was the interaction between the lack of type annotations and unreachable code, it made some of the cases harder to determine. I’m not sure that’s easier under the new proposal