[PATCH] D62669: [ARM] Set up infrastructure for MVE vector instructions.

Wed Jun 12 02:28:39 PDT 2019

simon_tatham marked an inline comment as done.
simon_tatham added inline comments.

================
Comment at: llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp:6457
       Mask >>= 1;
-      if (ITMask[i - 1] == 't')
+      if (ITMask[i - 1] == (Mnemonic.startswith("vp") ? 'e' : 't'))
         Mask |= 8;
----------------
ostannard wrote:
> Why is this inverted compared to IT? Is the format of these masks written down anywhere?
Recording the result of our investigation just now:

It's IT, not VPT, that is doing something strange. The `ARMOperand` coming out of the operand parsing stage has the IT mask in one format; it is translated directly into an `MCOperand` storing the mask in the same format; but then `processInstruction` translates it into an `MCOperand` in a different format. For example, if you feed the instruction `itete eq` into `llvm-mc -debug -triple=thumbv8a -show-inst`, you see
```Parsed as: <MCInst #2921 t2IT <MCOperand Imm:0> <MCOperand Imm:5>>
        itete   eq                      @ <MCInst #2921 t2IT
                                        @  <MCOperand Imm:0>
                                        @  <MCOperand Imm:11>>
```
in which you can see that the mask is originally parsed into an `MCOperand` with value 5, and `processInstruction` later converts that into 11, without changing the type of structure it's stored in. So that `MCOperand` has different semantics depending on which part of the assembly pipeline you're in!

Apparently the reason for having two formats in the case of IT is because the transient initial format is what has to be passed to `startExplicitITBlock` in order to have the assembler statically track the IT block state and validate it against the suffixes on the next few instructions. So `processInstruction` receives the IT mask in the right format to pass to that function, and then transforms it into the second format.

The VPT code here is avoiding this complication, by using the same format for the mask throughout, in both the `ARMOperand` and `MCOperand` structures. So `vpstete`, for example, would have the mask value 11 from beginning to end of the parsing process. (The //architectural// encoding is different, but that's taken care of by `getVPTMaskOpValue` at the point of translation from `MCOperand` to an instruction encoding, which seems to me the sensible place to do it.)

I think it's very confusing for the immediate value in an `MCOperand` to have different semantics at different points in the code, so I propose to make `IT` work more like `VPT` rather than vice versa: I think //all// masks in operand structures should be stored in the same format, and `processInstruction` will have to do a local translation //into// the format that `startExplicitITBlock` expects.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62669/new/

https://reviews.llvm.org/D62669