[llvm] [AMDGPU] Allow folding of non-subregs through REG_SEQUENCE (PR #151033)
Josh Hutton via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 14 18:20:59 PDT 2025
================
@@ -1465,6 +1477,33 @@ void SIFoldOperandsImpl::foldOperand(
return;
}
+ if (!FoldingImmLike && OpToFold.isReg() && ST->needsAlignedVGPRs()) {
----------------
JoshHuttonCode wrote:
Due to [the way that subregisters are handled](https://github.com/llvm/llvm-project/pull/151033/files#diff-c043c2654ff1fd5fcac955ed8478d5a2966760bf806a7fa5162923dc9e0ea131R1224-R1225) when folding the source of a copy through REG_SEQUENCE, we only ever try to fold if the subregister operand in the REG_SEQUENCE exactly matches the subregister of the use of the REG_SEQUENCE.
To make sure I understand: are the high 32 bits of the 64-bit read required to be undef, or are they just discarded? If they are required to be undef, then I am not sure we should be folding into these instructions anyway. If they are not, then I think we would never fold the 32-bit operand into the 64-bit operand (except perhaps by special-casing the REG_SEQUENCE recursion for these instructions, along with special-casing the register class constraining).
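To illustrate the exact-match rule described above, here is a rough sketch. It is a toy model, not the actual SIFoldOperands code: `SeqInput`, `foldThroughRegSequence`, and the numeric subregister indices are hypothetical names and values chosen only for illustration.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one (register, subreg index) input pair of a
// REG_SEQUENCE. This is not LLVM's MachineOperand API.
struct SeqInput {
  std::string Reg;    // source register feeding the REG_SEQUENCE
  unsigned SubRegIdx; // index it is inserted at (illustrative numbering)
};

// Returns the source register that a use reading `UseSubRegIdx` of the
// REG_SEQUENCE result could be folded to. Only an exact index match counts;
// a use that reads a wider or different subregister finds no candidate.
std::optional<std::string>
foldThroughRegSequence(const std::vector<SeqInput> &Inputs,
                       unsigned UseSubRegIdx) {
  for (const SeqInput &In : Inputs)
    if (In.SubRegIdx == UseSubRegIdx)
      return In.Reg;
  return std::nullopt;
}
```

Under this model, a 64-bit use of a REG_SEQUENCE built from 32-bit inputs has no exactly matching input, so no fold is attempted for it.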
https://github.com/llvm/llvm-project/pull/151033