[llvm] [AMDGPU] Fix encoding of VOP3P dpp on GFX11 and GFX12 (PR #82710)
Stanislav Mekhanoshin via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 23 12:45:19 PST 2024
rampitec wrote:
> I don't think dpp is a special case. This is how it should work for both vop3p and vop3p with dpp encoding:
>
> https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna3-shader-instruction-set-architecture-feb-2023_0.pdf
>
> From the table in packed math section 7.5 of the programming guide: "If either the source operand or destination operand is 32bits, the corresponding OPSEL bit must set to zero." and "If either the source operand or destination operand is 32bits or is a constant, the corresponding OPSEL_HI bit must set to zero". For v_dot2_F32_f16 and v_dot2_f32_bf16, src2 is an f32. **Therefore the OPSEL_HI_2 bit of these should be 0**.
>
> For op_sel and op_sel_hi on src0 and src1, the only description I see of what those bits do for these instructions is the one for inline constants in section 7.5.1. DOT2_F32_BF16 says it ignores OPSEL, so it doesn't matter. DOT2_F32_F16 says "use FP32 inline, supports OPSEL". So I think the baseline vop3p behavior you have enabled here for those operands is correct.
It is the same as SP3, and it is the same as base VOP3P without DPP. It is just that the DPP encoding was missing these bits.
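
For illustration only, the OPSEL_HI rule quoted above can be sketched as a small standalone check. This is not code from the PR or from LLVM: `SrcInfo`, `opSelHiMustBeZeroMask` and `opSelHiIsValid` are hypothetical names, and the sketch assumes the rule is applied per source operand (it does not model the destination-operand part of the quoted sentence).

```cpp
#include <array>
#include <cstdint>

// Hypothetical description of one VOP3P source operand (not an LLVM type).
struct SrcInfo {
  unsigned Bits;   // operand width in bits, e.g. 16 or 32
  bool IsConstant; // true if the operand is a constant
};

// Bit I of the result is set when OPSEL_HI bit I must be zero, i.e. when
// source I is 32 bits wide or is a constant (assumed per-source reading
// of the quoted rule).
inline std::uint8_t opSelHiMustBeZeroMask(const std::array<SrcInfo, 3> &Srcs) {
  std::uint8_t Mask = 0;
  for (unsigned I = 0; I < Srcs.size(); ++I)
    if (Srcs[I].Bits == 32 || Srcs[I].IsConstant)
      Mask |= static_cast<std::uint8_t>(1u << I);
  return Mask;
}

// True when the given 3-bit OPSEL_HI value sets none of the forbidden bits.
inline bool opSelHiIsValid(std::uint8_t OpSelHi,
                           const std::array<SrcInfo, 3> &Srcs) {
  return (OpSelHi & opSelHiMustBeZeroMask(Srcs)) == 0;
}
```

With two 16-bit sources and an f32 src2, as described for v_dot2_f32_f16, only bit 2 ends up in the must-be-zero mask under this reading, which matches the conclusion that OPSEL_HI_2 should be 0.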
https://github.com/llvm/llvm-project/pull/82710