[llvm] [AMDGPU] Don't fold an i64 immediate value if it can't be replicated from its lower 32-bit (PR #168458)

Mon Nov 17 15:51:10 PST 2025

================
@@ -766,6 +771,34 @@ static void appendFoldCandidate(SmallVectorImpl<FoldCandidate> &FoldList,
                       FoldCandidate(MI, OpNo, FoldOp, Commuted, ShrinkOp));
 }
 
+// Returns true if the instruction is a packed f32 instruction that only reads
+// 32 bits from a scalar operand (SGPR or literal) and replicates the bits to
+// both channels.
+static bool isPKF32Instr(const GCNSubtarget *ST, MachineInstr *MI) {
+  if (!ST->hasPKF32Insts())
+    return false;
+  switch (MI->getOpcode()) {
+  case AMDGPU::V_PK_ADD_F32:
----------------
shiltian wrote:

But not all `v2f32` instructions have this issue. Only those with `OPF_PK_F32` flag.

https://github.com/llvm/llvm-project/pull/168458