CG-02-28.md
March 1, 2023 · View on GitHub

Agenda for the February 28th video call of WebAssembly's Community Group
- Where: zoom.us
- When: February 28th, 5pm-6pm UTC (9am-10am Pacific Time)
- Location: link on calendar invite
Registration
None required if you've attended before. Send an email to the acting WebAssembly CG chair to sign up if it's your first time. The meeting is open to CG members only.
Logistics
The meeting will be on a zoom.us video conference. Installation is required, see the calendar invite.
Agenda items
- Opening, welcome and roll call
- Opening of the meeting
- Introduction of attendees
- Find volunteers for note taking (acting chair to volunteer)
- Proposals and discussions
- Type system properties (see https://github.com/WebAssembly/function-references/pull/91) [Andreas Rossberg, 25 min]
- Closure
Agenda items for future meetings
None
Schedule constraints
None
Meeting Notes
Attendees
- Conrad Watt
- Deepti Gandluri
- Derek Schuff
- Yuri Iozzelli
- Jeff Charles
- Jackson Hong
- Ben Titzer
- Yury Delendik
- Ben Green
- Thomas LIvely
- Alex Crichton
- Paolo Severini
- Ilya Rezvov
- Adam Klein
- Alon Zakai
- Daniel Phillips
- Saul Cabrera
- Ryan Hunt
- David Piepgrass
- Francis McCabe
- Daniel Hillerström
- Chris Woods
- Andreas Rossberg
- Jakob Kummerow
- Sean Jensen-Grey
- Marat Dukhan
- Shoab Kamil
- Zalim Bashorov
- Rich Winterton
- Heejin Ahn
- Emanuel Ziegler
- Brendan Dahl
- Andrew Brown
- Chris Fallin
- Rich Winterton
- Shravan Narayan
- Sergey Rubanov
- Asumu Takikawa
- Petr Penzin
- Jlbirch
- Nabeel Al-Shamma
- Kevin Moore
Discussions
Type system properties (see https://github.com/WebAssembly/function-references/pull/91 )
AR Presenting slides
(Principal types for wasm examples slide):
BT: Would it make sense to quantify type level parameters?
AR: yeah this is a meta-level thing. I’ll point to where I wrote it more precisely, it’s in the appendix of the fucref spec draft, more high level here. But you enrich the type grammar with these variables and quantify on the meta level rather than in every instruction.
(forward principle types slide): FM: The principal type property, there’s a related one which is what people were debating about, which is whether the type of an instruction (sequence) was inferrable.
AR: I think that’s the same one; that’s basically what principal types mean, for every unit of syntactic composition, you can infer a type for that unit without relying on the context. In wasm we don’t have expressions, just instructions that can be composed more liberally. This forward thing corresponds more to what you have with expressions, where the things on the left are like the children. Is there something you’d describe as inferrable other than this?
FM: I need to take it offline, need to think more
AR: this is the property used in meta theory about type inference, that every expression has a principal type and you can infer them.
FM: I thought every type was a supertype of itself?
AR: Reflexivity is included here
FM: so every type has a supertype
AR: yes but not necessarily a common supertype between 2 types, the interesting thing is when the two types are not the same
BT: on unreachable typing: this is one of those cases where type variables could be introduced?
AR: This has nothing to do with type variables, it only says there is some type. If you care about what are all the possible types of t*.. but this is independent of that. If you want to represent the principal split type, I haven’t thought about that yet, but you may need variables to represent them all.
PP: wrt typed references: we’re not talking only about really complex types like structs?
AR: The reference types, they could be anything. In the func ref proposal they can denote funcs, in the GC proposals they could be structs, arrays, etc, in the typed continuations, they could denote continuations etc., so more general
PP: so this applies to those proposals as well
FM: are you saying that the principal type property moves from desirable to essential?
AR: We should document it, that is the outcome we had, the desirable properties are the ones we have, we could drop that, but at least we have to think carefully of why you would need that
TL: I think this is a good direction to go in, good to have formal documented properties that are consistent with the discussions we’ve had, so it will be obvious which future instructions need annotations, for example. We also decided that this is the conservative approach, and can be revisited but this makes it explicit that we have this property now, written down, and if we wanted to reduce the number of type annotations, this is what we would have to give up. And it’s also great that we have the principal forward types property written down, since that’s the essential part.
AR: for each one in the desirable category, we know what would break them. For principal types, what would break it is omitting annotations. For GLBs it would e.g. be multiple supertypes, you would need to go full intersection types to fix that. Decomposition you would lose if you weaken the type system of the unreachable code.
TL: as far as use of GLBs in the tools, you’re right that they’re generally less important but with contravariant function parameters, you want GLBs in some form.
AR: I agree that you’re bound to run into a wall with using LUBs there if you don’t have the GLB property
TL: Is there any way to weaken the current GLB property without LUBs? Or do they imply each other?
AR: technically they currently don’t imply each other, since we have explicitly declared subtypes. When you have contravariance it is in a declared relation. The subtype relation only depends on what you’ve declared. But you might want to declare it, and certain subtypes you wouldn't be able to declare anymore in the presence of contravariance if you don’t have GLBs
BT: I don’t understand what contravariance has to do with that, isn’t oit checking is the type variance makes sense?
AR: no if you need to synthesize function types, and you need to synthesize the LUB of 2 function types, you need to synthesize the GLB of their parameter types
BT: I see
TL: Back when I implemented LUBs, I thought it would be complicated because of the contravariance, but it turned out to be simple because of supertype hierarchy so that makes sense
BT: so other than computational complexity, do any of these properties break down when we have multiple declared supertypes?
AR: The GLBs, unless you introduce them to
TL: the LUBs too, right? How would you maintain LUBs if you have multiple declared supertypes?
AR: I think you’re right, they would break as well.
TL: I’m glad we have these properties written down so we don’t break them by accident
AR: the brute force way to preserve them is union and intersection types so you can write the GLB or LUB of anything, but then you get into expensive and maybe even undecidable type system. It's unclear whether there’s a sweet spot where you get all the advantages without breaking anything
TL: For composition/decomposition you didn’t go into much detail, does the formalism in the spec include when the right hand sequence consumes values before the LH sequence?
AR: yeah that’s totally fine, it applies to any sequence with any input/output type.
TL: in that case the type in the middle doesn’t exactly match, the output of the LHS is a suffix if the input from the RHS?
AR: If the whole sequence has an input/output type and that consumes something that if the right consumes something that the left doesn’t touch, the left type just passes it through and ignores it. There are these frame-like subtyping rules where t1* -> t2* can always be weakened to (t* t1*) -> (t* t2*) for any t* on both sides. It’s basically a supertype of the more precise one, it loses the information that t* isn't used
TL: how does this interact/what is the status of the proposal to simplify dead code validation?
AR: That proposal would break decomposition, as for the status Conrad?
CW: the status is essentially when some critical mass of people decide that they’re fed up with thinking about the current system, we can push it forward, I think i’m not going to speculatively push it
TL: I think there isn’t a critical mass of people that think about typing unreachable code
CW: a lot of implementations that allow things that are not technically spec-conformant, e.g. webkit and Binaryen.
BT: Do you have those examples? Would be goot to add tho the testsuite
AR: We already have quite a few tests that check this, so do implementations disable them?
MD:: I’m not sure this applies here but one case is in benchmarks where the sequence serves as a sink for the value and needs to accept any value
BT: I think for that you should import a function that imports that type, and then it won’t get dead code eliminated
AR: If you don’t feed the value into that function it could be dropped or optimized away, so it has to get an input?
MD: The producer of the value can be dropped
AR: since this is only relevant for types that are not bottom, it’s independent, because why would you use unreachable code in benchmarks? So it’s not really related to unreachable code. As long as we keep the property that any type of a value can be a parameter of a function it should be ok. Do you have an example where this breaks?
MD: IT triggers internal compiler errors at times
AR: Interesting, would like to see an example
BT: I think all the engines pass the test suite as it is, but I don’t have 100% confidence that it’s exactly what’s intended by the spec. So it would be good to improve that so we can align on what’s driven by the tests.
AR: The negative unreached-invalid test checks for unreachable code, not sure about the exact tests in there, but there are tests there that test them, and I would be surprised if engines pass it without implementing
TL: that would be great, if it does cause production engines to start failing the test suite then we could get more eyes on it, then we could talk more concretely
CW: If they potentially already have bandwidths
AR: We also want to make sure the proposal introduces no new problems. I doubt it atually makes implementations simpler or less error-prone
CR: I agree, the point of this proposal was not to be saving lines of code in implementation or making it faster, just to make it simpler to think about.
DG: maybe we should present about what the state of this is?
CW: would be happy to do that when we have a free slot.
TL: I could also use a refresher on what the current rules are.
CW: Happy to lump that in, you can’t understand the motivation without all the context
TL: yeah I think I understand the new proposal, but not the status quo
AR: The status quo is that there is no special rule in the types system, the implementations needs a flag but the type system doesn’t care. I would say that the new version makes it more complicated
CW: It conceptually simplifies people trying to simplify linking their type systems
BT: I would also say that we should have a status quo bias, unless there’s a clear benefit in one direction we shouldn’t make a change. All implementations have special cases, so it seem unlikely to make a large difference to implementations.
CW: Jakob has a comment in chat
JK (chat): As far as implementations are concerned, we spend as much time getting unreachable code validation right as we spend on reachable code, and we are unhappy with this distribution. So to simplify implementations, it'd be great to actually drop all differences between reachable and unreachable validation rules.
CW: we do have to make sure existing code keeps validating, so it’s best effort on that
AR: My concern is that you have far more places where you have to make a case distinction, and easy to make mistakes, whereas with the current status there are places where we have to do something but probably fewer because you don’t have two modes
CW: My impression is that a lot of implementations do it now by having a conditional flag based on whether the current code is reachable. My hope was to make that unnecessary. They may still have something like that but, hopefully it would still be simpler.
AR: Basically you would still have that flag, but you would use it differently.
BT: Little bit of complexity left over, because we introduced type annotations, implementations may have a little bit of cruft around whether they represent the spec as is in addition to the proposals in flight
AR: one of the problems we had was the interaction between the lack of type annotations and unreachable code, it made some of the cases harder to determine. I’m not sure that’s easier under the new proposal