[all-commits] [llvm/llvm-project] 4e1bdb: AMDGPU: Custom lower 32-bit element shuffles
Matt Arsenault via All-commits
all-commits at lists.llvm.org
Tue Jan 21 09:01:07 PST 2025
Branch: refs/heads/users/arsenm/custom-lower-vector-shuffle-i32-elements
Home: https://github.com/llvm/llvm-project
Commit: 4e1bdb492246adacd90c42b0590613e040268e1a
https://github.com/llvm/llvm-project/commit/4e1bdb492246adacd90c42b0590613e040268e1a
Author: Matt Arsenault <Matthew.Arsenault at amd.com>
Date: 2025-01-21 (Tue, 21 Jan 2025)
Changed paths:
M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
Log Message:
-----------
AMDGPU: Custom lower 32-bit element shuffles
This is so we can try to make use of v_pk_mov_b32 when available.
Note that this currently has little observable effect: the combiner
undoes the common extract-of-shuffle pattern. The lack of test
changes should demonstrate that this change is at least minimally
correct.
We should probably try to make better use of wider extracts in
even-aligned cases, but I'm trying to avoid some really ugly
regalloc regressions in some MFMA tests. The DAG scheduler ends
up doing a worse job if we use vector extracts, resulting
in a failure to do 3-address conversion of MFMAs.
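
For readers unfamiliar with the "even-aligned" distinction above, here is a
minimal standalone sketch of how a shuffle mask over 32-bit elements might be
split into two-element (64-bit) pieces. It is a toy model, not the code in
SIISelLowering.cpp: the names PairPiece and splitShuffleMask are hypothetical,
and the mask is assumed to index a single flat source vector with an even
number of elements. A piece counts as even-aligned when its two lanes read
source elements 2k and 2k+1, which is the case a wider extract could cover;
any other piece needs its two lanes selected individually.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of one two-element (64-bit) piece of the result.
struct PairPiece {
  int lo;           // source element for the low 32 bits (-1 = undef)
  int hi;           // source element for the high 32 bits (-1 = undef)
  bool wideExtract; // true if lo/hi are an even-aligned adjacent pair
};

// Split a shuffle mask over 32-bit elements into two-element pieces.
// Assumption: indices refer to one flat source vector; -1 marks an
// undefined lane. An odd-length mask gets an undef high lane at the end.
std::vector<PairPiece> splitShuffleMask(const std::vector<int> &mask) {
  std::vector<PairPiece> pieces;
  for (std::size_t i = 0; i < mask.size(); i += 2) {
    int lo = mask[i];
    int hi = (i + 1 < mask.size()) ? mask[i + 1] : -1;
    // Even-aligned: the pair reads source elements 2k and 2k+1 in order.
    bool wide = lo >= 0 && lo % 2 == 0 && hi == lo + 1;
    pieces.push_back({lo, hi, wide});
  }
  return pieces;
}
```

With the mask {0, 1, 3, 2, 5, -1}, only the first piece (0, 1) is classified
as even-aligned; (3, 2) is a swapped pair and (5, -1) has an undef lane, so
both fall back to per-element selection in this model.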