[PATCH] D104945: [WebAssembly] Added initial type checker to MC Assembler

Wouter van Oortmerssen via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 28 15:38:43 PDT 2021


aardappel marked 10 inline comments as done.
aardappel added inline comments.


================
Comment at: llvm/include/llvm/CodeGen/MachineModuleInfoImpls.h:113
+
+  StringSet<> MachineSymbolsUsed;
+};
----------------
sbc100 wrote:
> The classes above seem to track sets of `MCSymbol*` (see SymbolListTy).  Would that make sense here too (rather than StringRef)?
The pass that uses this runs before AsmPrinter/MCLower, thus MCContext etc is not available yet.


================
Comment at: llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp:248
+  std::map<unsigned, TypeStack> TypedLabels;
+  std::vector<wasm::ValType> LocalTypes;
+  std::vector<wasm::ValType> ReturnTypes;
----------------
sbc100 wrote:
> Why not SmallVector like above?  
There's only ever 1 instance of these per assembler, so there is no allocation cost. But will change them to SmallVector anyway.


================
Comment at: llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp:269
+    if (BufferName == "<inline asm>")
+      SkipTypeCheck = true;
   }
----------------
sbc100 wrote:
> Is this a permanent caveat or should we have a TODO here? 
This is intended to be fairly permanent. The assembler is run on the inline asm contents without any context of a function, or globals, or anything, which breaks everything this type checker does. We'd need to somehow recover all that information. I made inline asm "work" in the past because, why not, but I am not sure there's any legit uses for it in Wasm.


================
Comment at: llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp:1195
+    return false;
+  }
+
----------------
sbc100 wrote:
> All this new code here makes me wonder if the typechecker should be its own class with its own state.   Maybe not worth it?
Yup, that could probably be done? Lemme give that a go..


================
Comment at: llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp:1361
+  assert(false && "mnemonic not found");
+  return StringRef();
+}
----------------
sbc100 wrote:
> I surprised this functionality doesn't exist somewhere else already..
Nope.. I definitely looked.

This stuff is really ugly though, it does a linear search thru the table, and then the type checker above does linear if-checks against the resulting string.

Would be awesome to not do this.

We could check directly against the opcode, but there's one opcode per type (8 or so of them), and then of course occasionally we make new types. So this would make for a very fast but very fragile switch-case above.

3rd option is to generate some new table thru tablegen that has the exact mapping we need.

Opinions?


================
Comment at: llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp:366
+  for (const auto &F : M) {
+    if (F.hasAddressTaken()) {
+      MCSymbolWasm *FunctionTable =
----------------
sbc100 wrote:
> The old `HasAddressTakenFunction` was only set if `!F.isIntrinsic()` too.. is this change OK/indended?
not intended, good catch!


================
Comment at: llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp:528
+  AsmPrinter::emitLinkage(GV, Sym);
+  // This gets called before the function label and type are emitted.
+  // We use it to emit signatures of external functions.
----------------
sbc100 wrote:
> So other AsmPrinters use this print the linkage of a single symbol, but we are co-opting to print the all signatures?   
Yes, it was difficult to find a point where to emit this, since the constructor is executed way before anything is run, there's no `onStartOfFile` or similar, so this is the earliest part of emitting the first function which seemed most suitable.


================
Comment at: llvm/lib/Target/WebAssembly/WebAssemblyInstrTable.td:16
 multiclass TABLE<WebAssemblyRegClass rt> {
-  defm TABLE_GET_#rt : I<(outs rt:$res), (ins table32_op:$table),
+  defm TABLE_GET_#rt : I<(outs rt:$res), (ins table32_op:$table, I32:$i),
                          (outs), (ins table32_op:$table),
----------------
sbc100 wrote:
> I don't really grok this part of the change.   @pmatos will know more.   Can we split part out since it feels like its own bugfix.   How was it not breaking stuff?  (Or was it?)
It wasn't breaking much, since this was so far only used in asm. I'd rather keep this all in one place, showing the things the type-checker found.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104945/new/

https://reviews.llvm.org/D104945



More information about the llvm-commits mailing list