[llvm] [AMDGPU] Fix encoding of VOP3P dpp on GFX11 and GFX12 (PR #82710)

Fri Feb 23 08:48:45 PST 2024

Sisyph wrote:

I don't think dpp is a special case. This is how it should work for both vop3p and vop3p with dpp encoding:

https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna3-shader-instruction-set-architecture-feb-2023_0.pdf

From the packed math table in section 7.5 of the programming guide:
"If either the source operand or destination operand is 32bits, the corresponding OPSEL bit must set to zero."
"If either the source operand or destination operand is 32bits or is a constant, the corresponding OPSEL_HI bit must set to zero"
For v_dot2_f32_f16 and v_dot2_f32_bf16, src2 is an f32. **Therefore the OPSEL_HI_2 bit of these should be 0**.
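
To make the rule concrete, here is a rough sketch of how I read it. This is not code from the patch or from LLVM: `SrcInfo`, `OpSelBits` and `legalizeOpSel` are hypothetical names made up for illustration. It only models the source-operand side of the quoted rule, and leaves out how the destination clause applies.

```cpp
#include <array>
#include <cstdint>

// Hypothetical description of one VOP3P source operand (not an LLVM type).
struct SrcInfo {
  bool Is32Bit;    // operand is 32 bits wide rather than packed 16-bit
  bool IsConstant; // operand is a constant
};

// Bit i of each mask applies to src i (src0 = bit 0, src2 = bit 2).
struct OpSelBits {
  uint8_t OpSel;
  uint8_t OpSelHi;
};

// Clear the bits that the quoted section 7.5 rule says must be zero,
// looking only at the source operands (assumption: the destination
// clause is deliberately not modelled here).
inline OpSelBits legalizeOpSel(OpSelBits Requested,
                               const std::array<SrcInfo, 3> &Srcs) {
  OpSelBits Out = Requested;
  for (unsigned I = 0; I < Srcs.size(); ++I) {
    const uint8_t Bit = static_cast<uint8_t>(1u << I);
    // A 32-bit source forces its OPSEL bit to zero.
    if (Srcs[I].Is32Bit)
      Out.OpSel &= static_cast<uint8_t>(~Bit);
    // A 32-bit or constant source forces its OPSEL_HI bit to zero.
    if (Srcs[I].Is32Bit || Srcs[I].IsConstant)
      Out.OpSelHi &= static_cast<uint8_t>(~Bit);
  }
  return Out;
}
```

For v_dot2_f32_f16 with packed f16 src0/src1 and an f32 src2, a requested op_sel_hi of 0b111 comes out as 0b011 under this sketch, i.e. only OPSEL_HI_2 is cleared.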

As for op_sel and op_sel_hi on src0 and src1, the only description I can find of what those bits do for these instructions is the one for inline constants in section 7.5.1.
DOT2_F32_BF16 says it ignores OPSEL, so it doesn't matter.
DOT2_F32_F16 says "use FP32 inline, supports OPSEL".
So I think the baseline vop3p behavior you have enabled here for those operands is correct.

https://github.com/llvm/llvm-project/pull/82710