[PATCH] D129637: [AMDGPU][MC][GFX11] Correct disassembly of *_e64_dpp opcodes which support op_sel

Wed Jul 13 08:21:17 PDT 2022

dp added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp:782
+             AMDGPU::getNamedOperandIdx(Opc, AMDGPU::OpName::op_sel) != -1) {
+    insertNamedMCOperand(MI, MCOperand::createImm(getOpSel(MI)),
+                         AMDGPU::OpName::op_sel);
----------------
Joe_Nash wrote:
> Petar.Avramovic wrote:
> > You don't need to calculate constant to put in opsel operand in disassembler, opsel bits are already set in src_modifiers, inst_printer does not access op_sel operand. You can simply use 0.
> > You only need to add op_sel operand so that other two operands dpp8 and fi are at correct index.
> > 
> > AsmParser uses op_sel operand to parse opsel and then uses that operand to set opsel bit in src_modifiers. Because Srco operands are parsed first, opsel is parsed after all src operands. After opsel bits in src modifiers are set (e.g. after cvtVOP3OpSel or cvtVOP3P) constant value inside opsel operand could be set to 0 since it would not be used any more.
> It is technically true setting op_sel operand to 0 will not affect the disassembler output. However, it will put the operands in an inconsistent state with respect to each other, and for instance when you look at this in the debugger it is very confusing. So while it can save some time/complexity, I think its probably worth it to make all operands correct. 
Looks like you are right, `op_sel=0` would suffice for correct printing. I did not know about that, thanks.

================
Comment at: llvm/lib/Target/AMDGPU/VOPInstructions.td:1289
     : VOP3_DPP8<op, opName, ps.Pfl> {
+  let VOP3_OPSEL = ps.Pfl.HasOpSel;
   let hasSideEffects = ps.hasSideEffects;
----------------
Petar.Avramovic wrote:
> Is this to allow printing of opsel[3] (SISrcMods::DST_OP_SEL in src modifiers)?
> btw opsel is not properly parsed for dot8/dot16 (cvtVOP3DPP)
Correct. WIthout this change `op_sel` is printed without dst bit.

I know that there are some issues with `v_dot2_bf16_bf16_e64_dpp`, `v_dot2_f16_f16_e64_dpp`, `v_dot2_f32_f16_e64_dpp` and `v_dot2_f32_bf16_e64_dpp`. My tests fail for these opcodes, but I do not know the root cause.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129637/new/

https://reviews.llvm.org/D129637