[llvm] [AMDGPU] SelectionDAG support for vector type set 0 to multiple sgpr64 (PR #128017)
Janek van Oirschot via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 10 07:15:31 PDT 2025
JanekvO wrote:
> I've been looking at resolving the regression, but I'm having a hard time finding a good alternative to MachineCSE. I tried to convince the complex pattern to explicitly associate the added `v_mov_b32 0` with a vector element of `i64 = buildvector i32 0, i32 0`, where possible. Handling this in any of the constant-folding mechanisms (peepholeopt, SIFoldOperands) is a bit awkward, because the fold may be between two `set 0` expressions that don't share a common ancestor expression. A blanket ban on any VGPR destination isn't preferred either, as I have a follow-up patch that helps fold those into `v_mov_b64 0`, where applicable.
Are there any ideas for alternatives to implementing subreg awareness in MachineCSE to address the regression of an additional VGPR use in cases like `fcanonicalize.ll`? I cannot seem to find an appropriate alternative that avoids it. I'm wondering whether it's worth continuing this PR if the regression cannot be fixed; would it be an acceptable regression?
https://github.com/llvm/llvm-project/pull/128017