[PATCH] D129084: [AMDGPU] gfx11 Fix disassembler for VOP3 dpp8
Joe Nash via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 11 11:14:48 PDT 2022
Joe_Nash added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp:8805
cvtVOP3P(Inst, Operands, OptionalIdx);
+ else if (Desc.TSFlags & SIInstrFlags::VOP3)
+ cvtVOP3P(Inst, Operands, OptionalIdx);
----------------
The conditional check here should really be checking for is DOT2_BF16_BF16_e64_dpp_gfx11 || DOT2_BF16_BF16_e64_dpp8_gfx11 || DOT2_F16_F16 ....
There is a similar conditional check in the Disassembler where we need to call convertVOP3PDPP.
Options: 1) check if Opc = the specific dot instructions 2) add a tablegen mapping helper table to check if something is one of the relevant instructions 3)
Since we are relying on checking the presence of operands in cvtVOP3P anyway, maybe we can unconditionally call cvtVOP3P here, and rename the function to cvtVOPModsHelper or something.
================
Comment at: llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp:763
+ AMDGPU::getNamedOperandIdx(Opc, AMDGPU::OpName::op_sel) != -1)
+ insertNamedMCOperand(MI, MCOperand::createImm(0), AMDGPU::OpName::op_sel);
} else {
----------------
This will set op_sel to the wrong value if it was input as 1. We need to call convertVOP3PDPPInst to copy the operand correctly, or something equivalent.
However, that defect does not appear to effect the output, because the op_sel operand does not set any bits in the output machine code.
================
Comment at: llvm/lib/Target/AMDGPU/VOP3Instructions.td:705
+ // Type calculations workaround for bf16.
+ let HasSrc0Mods = 1;
+ let HasSrc1Mods = 1;
----------------
DOT2_BF16_BF16_e64_dpp should disallow modifiers on src0 and src1, though DOT2_BF16_BF16_e64 allows them. They should allow op_sel on dst, but this may be an issue since the bit for that is in src0 modifiers.
DOT2_F16_F16_e64_dpp should allow abs and neg modifiers on all operands, but op_sel only on dst and src2.
Setting HasSrcMods effects codegen, and dpp16. Can you please include the codegen test I proposed or some variation on it? Can you also include asm and disasm tests for dpp16?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D129084/new/
https://reviews.llvm.org/D129084
More information about the llvm-commits
mailing list