[clang] [llvm] [clang-tools-extra] [AMDGPU] Fix folding of v2i16/v2f16 splat imms (PR #72709)

Stanislav Mekhanoshin via cfe-commits cfe-commits at lists.llvm.org
Tue Nov 28 12:27:36 PST 2023


rampitec wrote:

After some digging, I believe that with this bug fixed we are fine now. Since we pass all bf16 inputs as i16, we can only inline small integers, and the inline integer 1 should behave the same as a 1 in an input register. We are missing a potential optimization, though: if we knew the operand was really bf16, we could fold 'i16 0x3f80' as the inline constant 1.0, and a pair of these as 1.0 with opsel.
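
To make the bit patterns concrete, here is a minimal Python sketch. It is not code from the PR, and the helper names `bf16_bits_to_float` and `splat_half` are hypothetical. It relies only on bf16 being the upper 16 bits of an IEEE-754 float32, and shows why 'i16 0x3f80' corresponds to 1.0 when the operand is read as bf16, and how a packed pair of identical halves can be recognized as a splat.

```python
import struct


def bf16_bits_to_float(bits: int) -> float:
    """Interpret a 16-bit pattern as bfloat16 by widening it to float32."""
    # bf16 is the upper 16 bits of a float32, so shift left by 16 and reinterpret.
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


def splat_half(packed: int):
    """Return the 16-bit value if both halves of a packed 32-bit value match, else None."""
    lo = packed & 0xFFFF
    hi = (packed >> 16) & 0xFFFF
    return lo if lo == hi else None


print(bf16_bits_to_float(0x3F80))   # 1.0
print(hex(splat_half(0x3F803F80)))  # 0x3f80
```

The integer 1 (bit pattern 0x0001) is a different value when read as bf16, which is why the integer inline constant and the floating-point inline constant 1.0 are not interchangeable here.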

https://github.com/llvm/llvm-project/pull/72709

