[llvm] [AMDGPU] SelectionDAG support for vector type set 0 to multiple sgpr64 (PR #128017)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 21 05:59:12 PST 2025


arsenm wrote:

> > You also have a number of regressions in the VALU cases. Should do something about those (which may mean not doing this for VALU values).
> 
> Looking into it

I suspect several things are going on here. Some of them should be improved by a set of patches I'm working on, which is taking significantly longer than I expected. Related to that, I've been thinking about other ways we can improve immediate folding.

I've been working on enabling MachineCSE to work for subregister extracts, and on improving folds of copies through subregisters. In doing this, I see a number of places where we get better coalescing and re-assembled materialization of 64-bit inline immediates.

I've been thinking we should teach foldImmediate and/or SIFoldOperands to reconstruct 64-bit moves. For example, foldImmediate can see that the use instruction is a reg_sequence with the same register used twice, and replace it with an s_mov_b64. It also currently skips any constants with multiple uses, but the correct heuristic is probably more refined (e.g. only skip multiple-use constants that are not inline immediates).
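
A toy sketch of the idea above, not the actual LLVM code. The names `isInlineInt64`, `combineSameHalf`, `tryReconstructMov64`, and `shouldFoldImmediate` are hypothetical, no real MachineInstr or TargetInstrInfo API is used, and only the integer inline-constant range (-16..64) is modeled; floating-point inline constants are ignored. It assumes a reg_sequence whose two 32-bit inputs are the same register, defined by a 32-bit immediate move, and conservatively reconstructs a 64-bit move only when the combined value is itself an inline constant.

```cpp
#include <cstdint>
#include <optional>

// Assumption: integer inline constants only; FP inline constants are ignored.
bool isInlineInt64(int64_t V) { return V >= -16 && V <= 64; }

// 64-bit value formed when both 32-bit halves of a reg_sequence read the
// same register holding Imm.
int64_t combineSameHalf(int32_t Imm) {
  uint64_t Half = static_cast<uint32_t>(Imm);
  return static_cast<int64_t>((Half << 32) | Half);
}

// If the combined value is an inline constant, return the immediate that a
// single 64-bit move could materialize; otherwise leave things untouched.
std::optional<int64_t> tryReconstructMov64(int32_t Imm) {
  int64_t V = combineSameHalf(Imm);
  if (!isInlineInt64(V))
    return std::nullopt;
  return V;
}

// Possible refinement of the use-count heuristic: always fold inline
// immediates, but fold other constants only when there is a single user.
bool shouldFoldImmediate(int64_t Imm, unsigned NumUserInsts) {
  return isInlineInt64(Imm) || NumUserInsts <= 1;
}
```

Under this model only 0 and -1 survive the same-register case, since any other 32-bit value duplicated into both halves falls outside the inline range.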




https://github.com/llvm/llvm-project/pull/128017

