[llvm] [AMDGPU] Fold multiple aligned v_mov_b32 to v_mov_b64 on gfx942 (PR #138843)

Mon Aug 11 06:58:11 PDT 2025

JanekvO wrote:

> We probably should be selecting these cases directly to the pseudo up-front. It's easier to fold subregister uses of the wider def than this

I've been trying to do that in #145052, but one case I could not cover there is non-power-of-2 vector sizes that leave a non-64-bit remainder. For example, v3i32 should become a 64-bit mov plus a 32-bit mov, which this PR should allow, unlike #145052.
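
To illustrate the remainder case, here is a minimal sketch of the width split only. `MovWidth` and `splitIntoMovs` are hypothetical names for illustration, not LLVM API and not the code in either PR, and the sketch ignores the register alignment requirement from the PR title.

```cpp
#include <cassert>
#include <vector>

// Hypothetical illustration only: neither name exists in LLVM.
enum class MovWidth { B64, B32 };

// Split a run of numDwords consecutive 32-bit moves into as many 64-bit
// moves as fit, plus one trailing 32-bit move when the count is odd.
// Assumption: register alignment is not modelled here.
std::vector<MovWidth> splitIntoMovs(unsigned numDwords) {
  assert(numDwords > 0 && "expected at least one 32-bit element");
  std::vector<MovWidth> movs(numDwords / 2, MovWidth::B64);
  if (numDwords % 2 != 0)
    movs.push_back(MovWidth::B32); // the non-64-bit remainder
  return movs;
}
```

With this split, `splitIntoMovs(3)` (the v3i32 case) gives one 64-bit mov followed by one 32-bit mov, while `splitIntoMovs(4)` gives two 64-bit movs and no remainder.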

https://github.com/llvm/llvm-project/pull/138843