[PATCH] D44329: [WebAssembly] Added initial AsmParser implementation.

Jacob Gravelle via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 15 15:27:57 PDT 2018


jgravelle-google added inline comments.


================
Comment at: lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp:249
+        return Error("Cannot parse stack register index: ", StackRegTok);
+      unsigned RegNo = static_cast<unsigned>(ParsedRegNo);
+      if (StackOp == "push") {
----------------
aardappel wrote:
> dschuff wrote:
> > aardappel wrote:
> > > dschuff wrote:
> > > > Up until now, the stack "reg" numbers on the "$push" and "$pop" tokens were really just annotations for readability, (because of course the operands push and pop on the stack in a defined order). So far nothing has actually depended on those notations, and this seems to be adding that dependence. Not that we absolutely can't do that; but it would be nice if we didn't have to. It looks like it might make this code more awkward though, because it appears that we are just getting the lexer tokens in order, and we'd maybe have to buffer them or something before processing them in right-to-left order. Am I understanding this correctly?
> > > The generated matcher does need registers in pretty much the same locations as these, so thats why it seemed natural to follow along. Certainly the number part is non-essential, since I am essentially re-using the assignment from the disassembler, though to accept code that does not have these numbers, I presume I would have to track a stack to able to assign them myself. That's not hard, but also not entirely trivial, so that may be a nice thing to do for a followup. In fact, I think it be nice to be able to not require these at all, and completely skip them on parse, and have a mode (initially for the non-elf path) that makes the dissassembler not output them either. Opinions welcome.
> > Yeah, that was sort of what I was getting at. s2wasm ignores the numbers entirely and tracks the stack itself. I do think it would be nice to not need them at all.
> > 
> > I think It sort of comes down to what we want the assembly language to look like. The non-ELF path will soon be the only path, so we don't need to worry about it for the long term. I think my preference for the language would be to keep it simpler and not require the numbers. This would require the assembler to track the stack, but even if it used the numbers, it would probably need to validate them anyway so overall it wouldn't be much different. I do actually like having the numbers as an annotation though (it makes the asm easier to read), and there is a way to attach annotations to assembly output, so maybe we should use it for that.
> agreed. I think it should be an option for the dissassembler (I personally would prefer to read the code without them), and the assemble should permit but ignore them.
> 
> Tracking the stack myself may be a bit more interesting than usual, since the current assembler code has no idea of instruction semantics, it leaves all of that to the table-gen matcher. So I'd need to at least dip into the table-gen to extract information on instruction signature if I  want to avoid hard-coding any of that.
To throw my own opinions in the mix,
Design goals I think are important:
1) the .s syntax should be easy to read
2) it should support inline asm fairly easily

To that end we could take and existing (wasm-elf) .s:

```
i32.const $push3=, 0
i32.load  $push2=, __stack_pointer($pop3)
i32.const $push4=, 16
i32.sub   $1=, $pop2, $pop4
```

and write it as
```
i32.const 0
i32.load  offset=__stack_pointer
i32.const 16
i32.sub
set_local 1
```
which is generally identical to .wat. We (in principle) can do this because the push/pop annotations are essentially cosmetic: we know them statically on a per-opcode basis. (This is almost certainly less the case for function calls.)
This should be the easiest thing to write inline asm for, and map most directly to the official text format, which should help the learning curve.

As a secondary idea, we keep the numbers, but observe that we can reuse virtual registers / stack slots after popping. Then the number refers to the depth of the stack at any given point. This also serves as a proxy for register pressure in the engines.
```
i32.const $push0=, 0
i32.load  $push0=, __stack_pointer($pop0)
i32.const $push1=, 16
i32.sub   $1=, $pop0, $pop1
```



https://reviews.llvm.org/D44329





More information about the llvm-commits mailing list