[llvm] [AMDGPU] Don't fold an i64 immediate value if it can't be replicated from its lower 32-bit (PR #168458)
Stanislav Mekhanoshin via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 18 11:50:20 PST 2025
================
@@ -1420,6 +1420,13 @@ class GCNSubtarget final : public AMDGPUGenSubtargetInfo,
/// \returns true if the target has instructions with xf32 format support.
bool hasXF32Insts() const { return HasXF32Insts; }
+ /// \returns true if the target has packed f32 instructions that only read 32
+ /// bits from a scalar operand (SGPR or literal) and replicate the bits to
+ /// both channels.
+ bool hasPKF32InstsReplicatingLow32BitsOfScalarInput() const {
+ return getGeneration() == GFX12 && GFX1250Insts;
----------------
rampitec wrote:
Checking GFX1250Insts alone is sufficient; the getGeneration() == GFX12 comparison is redundant.
https://github.com/llvm/llvm-project/pull/168458
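
For readers following along, here is a minimal, self-contained sketch of the idea under review. It is not the code from the patch. It assumes that "can be replicated from its lower 32 bits" means the high and low 32-bit halves of the immediate are identical, which is only inferred from the PR title and the doc comment quoted above. MockSubtarget, isReplicatedFromLow32 and mayFoldImm64 are hypothetical names introduced for illustration, and the predicate is written the way the review comment suggests, testing only GFX1250Insts.

```cpp
#include <cstdint>

// Hypothetical helper (not from the patch): true when both 32-bit halves of
// Imm are identical, i.e. replicating the low half to both channels would
// reproduce the full 64-bit value.
constexpr bool isReplicatedFromLow32(uint64_t Imm) {
  const uint32_t Lo = static_cast<uint32_t>(Imm);
  const uint32_t Hi = static_cast<uint32_t>(Imm >> 32);
  return Lo == Hi;
}

// Stand-in for GCNSubtarget that models only the single flag discussed here.
struct MockSubtarget {
  bool GFX1250Insts = false;

  // Simplified as suggested in the review: no generation comparison.
  bool hasPKF32InstsReplicatingLow32BitsOfScalarInput() const {
    return GFX1250Insts;
  }
};

// Hypothetical fold guard. It models only this one restriction; a real
// folding pass has many other legality checks that are omitted here.
inline bool mayFoldImm64(const MockSubtarget &ST, uint64_t Imm) {
  if (!ST.hasPKF32InstsReplicatingLow32BitsOfScalarInput())
    return true;
  return isReplicatedFromLow32(Imm);
}

// Two copies of the f32 bit pattern for 1.0 replicate cleanly.
static_assert(isReplicatedFromLow32(0x3F8000003F800000ULL),
              "identical halves are replicable");
// 1.0 in the low half only would be altered by replication.
static_assert(!isReplicatedFromLow32(0x000000003F800000ULL),
              "differing halves are not replicable");
```

Under these assumptions, an immediate such as 0x000000003F800000 would be rejected on a target with the replicating behaviour, because the hardware would read only the low half and produce 1.0 in both channels instead of {1.0, 0.0}.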