[llvm] [AMDGPU] Allow folding of non-subregs through REG_SEQUENCE (PR #151033)

Tue Oct 14 18:20:59 PDT 2025

================
@@ -1465,6 +1477,33 @@ void SIFoldOperandsImpl::foldOperand(
       return;
   }
 
+  if (!FoldingImmLike && OpToFold.isReg() && ST->needsAlignedVGPRs()) {
----------------
JoshHuttonCode wrote:

Due to [the way that subregisters are handled](https://github.com/llvm/llvm-project/pull/151033/files#diff-c043c2654ff1fd5fcac955ed8478d5a2966760bf806a7fa5162923dc9e0ea131R1224-R1225) when folding the source of a copy through REG_SEQUENCE, we only ever try to fold if the subregister operand in the REG_SEQUENCE exactly matches the subregister of the use of the REG_SEQUENCE.

To make sure I understand: are the high 32 bits of the 64-bit read required to be undef, or are they just discarded? If they are required to be undef, then I am not sure we should be folding into these instructions at all. If they are not, then I think we would never fold the 32-bit operand into the 64-bit operand (except maybe with special-casing of the REG_SEQUENCE recursion for these instructions, along with special-casing of the register class constraining).
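
To illustrate the exact-match rule I am describing, here is a toy model, not the actual `SIFoldOperands` code. `SeqInput`, `SubIdx`, and `findFoldSource` are hypothetical names invented for this sketch, and the subregister indices are simplified stand-ins for the real AMDGPU ones.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Simplified, hypothetical subregister indices (not the real AMDGPU enum).
enum SubIdx : unsigned { NoSub = 0, Sub0, Sub1, Sub0_Sub1 };

// Hypothetical stand-in for one (source, subreg index) pair of a REG_SEQUENCE.
struct SeqInput {
  std::string SrcReg; // value placed into this lane of the result
  SubIdx Idx;         // subregister of the result that it defines
};

// Return the source to fold only when a REG_SEQUENCE input defines exactly
// the subregister that the use reads; otherwise report that no fold happens.
std::optional<std::string> findFoldSource(const std::vector<SeqInput> &Inputs,
                                          SubIdx UseIdx) {
  // This sketch only models uses that read through a subregister index.
  assert(UseIdx != NoSub && "toy model expects a subregister use");
  for (const SeqInput &In : Inputs)
    if (In.Idx == UseIdx)
      return In.SrcReg;
  return std::nullopt;
}
```

Under this model, for a REG_SEQUENCE built from `%a` at `Sub0` and `%b` at `Sub1`, a use reading `Sub0` folds to `%a`, while a 64-bit use reading `Sub0_Sub1` matches neither input and is left alone.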

https://github.com/llvm/llvm-project/pull/151033