[llvm] [AMDGPU][SDAG] Legalise v2i32 or/xor/and instructions to make use of 64-bit wide instructions (PR #140694)

Sat Aug 16 02:30:50 PDT 2025

================
@@ -4048,6 +4048,59 @@ SDValue AMDGPUTargetLowering::splitBinaryBitConstantOpImpl(
   return DAG.getNode(ISD::BITCAST, SL, MVT::i64, Vec);
 }
 
+// Part of the shift combines is to optimise for the case where its possible
+// to reduce e.g shl64 to shl32 if shift range is [63-32]. This
+// transforms: DST = shl i64 X, Y to [0, srl i32 X, (Y & 31) ]. The
+// '&' is then elided by ISel. The vector code for this was being
+// completely scalarised by the vector legalizer, but when v2i32 is
+// legal the vector legaliser only partially scalarises the
+// vector operations and the and is not elided. This function
----------------
chrisjbris wrote:

I'm beginning to think that scalarising the AND instruction like this may not be the most intuitive or sensible approach and that it might be worth finding some other means of ensuring it is elided, as intended by the shift optimisation.

https://github.com/llvm/llvm-project/pull/140694