[PATCH] D144002: [RISCV] Add vendor-defined XTheadMemPair (two-GPR Memory Operations) extension

Wed Feb 15 07:55:35 PST 2023

mtsamis marked an inline comment as done.
mtsamis added inline comments.

================
Comment at: llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp:2672
+    // The last operand must be constant 3 or 4 depending on the data width.
+    if (IsWordOp && Inst.getOperand(4).getImm() != 3) {
+      SMLoc Loc = Operands.back()->getStartLoc();
----------------
philipp.tomsich wrote:
> mtsamis wrote:
> > craig.topper wrote:
> > > craig.topper wrote:
> > > > Why does this 3rd operand exist at all?
> > > Oops, I meant fifth operand.
> > Although I also don't see why this required, I implemented it as per the specification
> The problem really is the Specification on this one:
>    th.ldd rd1,rd2,(rs1),#imm2,4
> appears to have 4 as an operand (even though it is the only permissible value and does not get encoded explicitly).
> 
> However, looking at it from another angle the ",4" is part of the operation mnemonic (or an %_ term when expressing the pattern in GNU notation):
>    th.ldd %1,%2,(%0),%3,4
> Given that the 4 is not encoded, I am leaning with Craig at the moment and consider it as part of the mnemonic (i.e., a split off fragment of 'th.ldd'+',4').
> 
> I would much rather have seen this specified as
>     th.ldd rd1,rd2,#offset(rs1)
> where offset would have a constraint to have its lower 4 bits cleared.
> 
> This would be a much more meaningful assembler syntax.
> Can we have this as a pseudo (that also becomes the preferred disassembly)?
> I'll volunteer to send out a PR against the T-Head specification to add that additional assembler-pseudo under the th.ldd definition…
>From what I understand, even if we have an additional assembler-pseudo for a nicer syntax we would still have to support the current one from the specification. In that case, adding the 'nice' syntax won't make the implementation better because we'll have to support the other one as well. The "4" is indeed not encoded and could be part of the assembler string as + ", 4" but from what I understood when implementing it, this doesn't work great with the current infrastructure. You can also check this discussion about implicit operands from which I got ideas how to properly implent them: [[ https://discourse.llvm.org/t/mc-layer-parsing-disassembling-implicit-operands/49816 | mc-layer-parsing-disassembling-implicit-operands ]]. The constant assembler string makes the assembler not able to properly parse the instruction and also makes it hard to get proper error messages as in this implementation.

Tl;dr My personal conclusion is that if we have to support the current syntax then we don't benefit from creating an alternative nicer syntax.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144002/new/

https://reviews.llvm.org/D144002