[PATCH] D44329: [WebAssembly] Added initial AsmParser implementation.

Mon Mar 12 17:01:10 PDT 2018

aardappel marked 2 inline comments as done.
aardappel added inline comments.

================
Comment at: lib/MC/MCExpr.cpp:402
 }

 bool MCExpr::evaluateAsAbsolute(int64_t &Res, const MCAssembler &Asm) const {
   return evaluateAsAbsolute(Res, &Asm, nullptr, nullptr);
----------------
dschuff wrote:
> Wow, is it just me or did the diff engine really screw this file up :/
whatever diff this uses seems generally much worse than github/gerrit/critique..

================
Comment at: lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp:34
+
+#define DEBUG_TYPE "wasm-asmparser"
+
----------------
jgravelle-google wrote:
> Nit: I feel this should be "wasm-asm-parser"? The other targets define this as:
> 
> 
> > lib/Target/ARM/AsmParser/ARMAsmParser.cpp:#define DEBUG_TYPE "asm-parser"
> > lib/Target/Mips/AsmParser/MipsAsmParser.cpp:#define DEBUG_TYPE "mips-asm-parser"
> > lib/Target/AVR/AsmParser/AVRAsmParser.cpp:#define DEBUG_TYPE "avr-asm-parser"
> > lib/Target/Hexagon/AsmParser/HexagonAsmParser.cpp:#define DEBUG_TYPE "mcasmparser"
> 
> 
> 
> so it probably doesn't really matter.
I'll change it anyway, seems more logical to have a `-`

================
Comment at: lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp:215
+        .Case("f64", WebAssembly::F64_0)
+        .Case("i8x16", WebAssembly::V16I8_0)
+        .Case("i16x8", WebAssembly::V8I16_0)
----------------
dschuff wrote:
> There are a couple of confusing things to me about this. First, the SIMD proposal has only one value type ([[ https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#simd-value-type | v128 ]]) and the wasm backend has a corresponding register class ([[ https://github.com/llvm-mirror/llvm/blob/master/lib/Target/WebAssembly/WebAssemblyRegisterInfo.td#L61 | V128 ]]) into which all the SIMD MVTs (v4f32 et al) can go. But in the assembly language the scalar instructions use the value type name in their mnemonic and the SIMD instructions use the lane division interpretation names. So that's a bit of inherent weirdness that will have to be reflected in this code somewhere. But the other thing is that this function is named `ParseRegType` which to me would indicate the type of register (i.e. register class which in C++ code would be e.g. `WebAssembly::I32RegClass`), but what it returns is actually a physical register number. What we maybe actually want is to reflect either a wasm value type (although it looks like that's maybe not actually even reflected directly in the MIR, just in the binary format) or an MVT (e.g. `MVT::i32` or `MVT::v4f32`). 
> 
> 
> ... ok actually looking at the way these reg types are used I now actually understand the comment you wrote in `WebAssemblyOperand::RegOp`.  Not sure I have a better idea right now. I guess it's because the generated matcher expects the reg operand to be one of the physical registers and validates that?
Yes, this was originally to please the generated matcher. Though looking at the code, it seems the different v128 layouts are only needed for `emitParam` / `emitLocal`, whereas the matcher would be fine with generic registers. I can try splitting these up and reverting to a single physical register type, and see if it still works.

================
Comment at: lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp:249
+        return Error("Cannot parse stack register index: ", StackRegTok);
+      unsigned RegNo = static_cast<unsigned>(ParsedRegNo);
+      if (StackOp == "push") {
----------------
dschuff wrote:
> Up until now, the stack "reg" numbers on the "$push" and "$pop" tokens were really just annotations for readability, (because of course the operands push and pop on the stack in a defined order). So far nothing has actually depended on those notations, and this seems to be adding that dependence. Not that we absolutely can't do that; but it would be nice if we didn't have to. It looks like it might make this code more awkward though, because it appears that we are just getting the lexer tokens in order, and we'd maybe have to buffer them or something before processing them in right-to-left order. Am I understanding this correctly?
The generated matcher does need registers in pretty much the same locations as these, so thats why it seemed natural to follow along. Certainly the number part is non-essential, since I am essentially re-using the assignment from the disassembler, though to accept code that does not have these numbers, I presume I would have to track a stack to able to assign them myself. That's not hard, but also not entirely trivial, so that may be a nice thing to do for a followup. In fact, I think it be nice to be able to not require these at all, and completely skip them on parse, and have a mode (initially for the non-elf path) that makes the dissassembler not output them either. Opinions welcome.

================
Comment at: lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp:291
+    Parser.Lex();
+    return false;
+  }
----------------
aardappel wrote:
> dschuff wrote:
> > jgravelle-google wrote:
> > > Wait, is truthiness here used to represent failure? That's... confusing.
> > > Oh, that's overridden from `MCTargetAsmParser::ParseInstruction`, cool.
> > There are a lot of places in LLVM backends that return true for failure and false for success ¯\_(ツ)_/¯
> > edit: actually there doesn't appear to be a failure path in this function, but its caller below clearly expects that possibility?
> Yup, just following along with what LLVM does here.
I'm guessing this originally had code that could fail parsing in it.. I'll make it void.

https://reviews.llvm.org/D44329