[llvm] [AMDGPU] Fix folding of v2i16/v2f16 splat imms (PR #72709)

Stanislav Mekhanoshin via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 27 12:29:40 PST 2023


rampitec wrote:

> > > We can use inline constants with packed 16-bit operands, but these should use op_sel.
> > 
> > 
> > The docs say that there are special rules for this for f16/bf16 dot2 instructions - see the RDNA3 Instruction Set Architecture Reference Guide section 7.2.1. "Non-Standard Uses of Operand Fields" subsection "Inline constants with DOT2_F16_F16 and DOT2_BF16_BF16".
> > So does this code need to treat f16/bf16 dot2 instructions differently from other packed instructions?
> 
> Deciding whether a constant is legal this way requires knowing both the instruction and the operand number, and the folding infrastructure as a whole does not have that information. Moreover, it does not help with the current constant bus violation: it is still a violation because an inline constant is not used.
> 
> Once this is fixed, one could potentially use this peculiarity (on gfx11 only), although I do not see why one would. Using op_sel does not make anything worse, so I see no reason to distinguish instructions and their operand numbers for that purpose.

In addition, the ISA reference is weird here. It says that for src2 OPSEL shall be used to control replication, but src2 is not packed; the instructions only read the low 16 bits of it.

Anyhow, to me these are two orthogonal issues: automatic replication of the low bits, and emission of an invalid 32-bit literal in the belief that it is an inline immediate.
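
To illustrate why these are separate questions, here is a minimal standalone sketch. It is not the code from the PR and does not use any LLVM API; the helper names are hypothetical, and it assumes the integer inline constant range is -16..64. It only shows that "the low half is an inline constant" and "the whole 32-bit pattern is an inline constant" can give different answers for the same splat value.

```cpp
#include <cstdint>

// Hypothetical helper: true if both 16-bit halves of a packed value match.
constexpr bool isSplat16(uint32_t Packed) {
  return (Packed & 0xffffu) == (Packed >> 16);
}

// Assumption: integer inline constants cover the range -16..64.
constexpr bool isInlineInt(int64_t V) { return V >= -16 && V <= 64; }

// Question 1: is the low 16-bit half, taken alone, an inline constant?
// Replication into the high half would then be left to op_sel.
constexpr bool lowHalfIsInline(uint32_t Packed) {
  return isInlineInt(static_cast<int16_t>(Packed & 0xffffu));
}

// Question 2: is the full 32-bit pattern itself an inline constant?
// If not, emitting it as-is means emitting a 32-bit literal.
constexpr bool wholeValueIsInline(uint32_t Packed) {
  return isInlineInt(static_cast<int32_t>(Packed));
}
```

For example, 0x00010001 is a splat of 1: its low half passes the first check, but the full 32-bit pattern (65537) fails the second, so treating it as an inline immediate without op_sel would really emit a literal.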

https://github.com/llvm/llvm-project/pull/72709

