[llvm] [AMDGPU][AMDGPULateCodeGenPrepare] Combine scalarized selects back into vector selects (PR #173990)

Tue Dec 30 08:02:20 PST 2025

================
@@ -551,6 +593,113 @@ bool AMDGPULateCodeGenPrepare::visitLoadInst(LoadInst &LI) {
   return true;
 }
 
+bool AMDGPULateCodeGenPrepare::tryCombineSelectsFromBitcast(BitCastInst &BC) {
+  auto *SrcVecTy = dyn_cast<FixedVectorType>(BC.getSrcTy());
+  auto *DstVecTy = dyn_cast<FixedVectorType>(BC.getDestTy());
+  if (!SrcVecTy || !DstVecTy)
+    return false;
+
+  // Must be: bitcast <N x i32> to <M x i8>
+  if (!SrcVecTy->getElementType()->isIntegerTy(32) ||
+      !DstVecTy->getElementType()->isIntegerTy(8))
----------------
PankajDwivedi-25 wrote:

> Is it possible to have a `<4 x float>` bitcasted to `<16 x i8>` ?

I think so, yes. It should work with minimal changes to the present code; I just limited it to the i32-to-i8 case.
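
To make the difference concrete, here is a minimal standalone sketch of the type check, not the code in the patch. `VecTy`, `ElemKind`, `isI32ToI8Bitcast`, and `is32BitElemToI8Bitcast` are hypothetical names that model a fixed vector type without the LLVM headers. The relaxed form assumes the only change needed is to accept any 32-bit source element; whether the rest of the combine handles float sources unchanged would still need checking.

```cpp
// Hypothetical stand-in for a fixed vector type; this is NOT the LLVM API.
enum class ElemKind { Integer, Float };

struct VecTy {
  ElemKind Kind;      // element kind (integer or floating point)
  unsigned ElemBits;  // width of one element in bits
  unsigned NumElems;  // number of elements in the vector

  constexpr unsigned totalBits() const { return ElemBits * NumElems; }
};

// Models the present restriction: bitcast <N x i32> to <M x i8>.
// A valid IR bitcast already guarantees equal total widths; this model has
// no verifier, so the widths are compared explicitly.
constexpr bool isI32ToI8Bitcast(const VecTy &Src, const VecTy &Dst) {
  return Src.Kind == ElemKind::Integer && Src.ElemBits == 32 &&
         Dst.Kind == ElemKind::Integer && Dst.ElemBits == 8 &&
         Src.totalBits() == Dst.totalBits();
}

// Assumed relaxation: accept any 32-bit source element (i32 or float),
// still requiring an i8 destination element.
constexpr bool is32BitElemToI8Bitcast(const VecTy &Src, const VecTy &Dst) {
  return Src.ElemBits == 32 &&
         Dst.Kind == ElemKind::Integer && Dst.ElemBits == 8 &&
         Src.totalBits() == Dst.totalBits();
}

// Example types used to compare the two predicates.
constexpr VecTy V4F32{ElemKind::Float, 32, 4};
constexpr VecTy V4I32{ElemKind::Integer, 32, 4};
constexpr VecTy V16I8{ElemKind::Integer, 8, 16};

static_assert(isI32ToI8Bitcast(V4I32, V16I8), "i32 source is accepted today");
static_assert(!isI32ToI8Bitcast(V4F32, V16I8), "float source is rejected today");
static_assert(is32BitElemToI8Bitcast(V4F32, V16I8),
              "float source passes the relaxed check");
```

In this model, `<4 x float>` to `<16 x i8>` fails the present check only because of the integer requirement on the source element, which is why the change looks small.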

https://github.com/llvm/llvm-project/pull/173990