[all-commits] [llvm/llvm-project] 088db8: [RISCV] Merge shuffle sources if lanes are disjoin...
Luke Lau via All-commits
all-commits at lists.llvm.org
Wed Dec 11 21:41:49 PST 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 088db868f3370ffe01c9750f75732679efecd1fe
https://github.com/llvm/llvm-project/commit/088db868f3370ffe01c9750f75732679efecd1fe
Author: Luke Lau <luke at igalia.com>
Date: 2024-12-12 (Thu, 12 Dec 2024)
Changed paths:
M llvm/lib/Target/RISCV/RISCVISelLowering.cpp
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-shuffles.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-interleaved-access.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-deinterleave.ll
Log Message:
-----------
[RISCV] Merge shuffle sources if lanes are disjoint (#119401)
In x264, there's a few kernels with shuffles like this:
%41 = add nsw <16 x i32> %39, %40
%42 = sub nsw <16 x i32> %39, %40
%43 = shufflevector <16 x i32> %41, <16 x i32> %42, <16 x i32> <i32 11,
i32 15, i32 7, i32 3, i32 26, i32 30, i32 22, i32 18, i32 9, i32 13, i32
5, i32 1, i32 24, i32 28, i32 20, i32 16>
Because this is a complex two-source shuffle, this will get lowered as
two vrgather.vvs that are blended together.
vadd.vv v20, v16, v12
vsub.vv v12, v16, v12
vrgatherei16.vv v24, v20, v10
vrgatherei16.vv v24, v12, v16, v0.t
However the indices coming from each source are disjoint, so we can
blend the two together and perform a single source shuffle instead:
%41 = add nsw <16 x i32> %39, %40
%42 = sub nsw <16 x i32> %39, %40
%43 = select <0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,1> %41, %42
%44 = shufflevector <16 x i32> %43, <16 x i32> poison, <16 x i32> <i32
11, i32 15, i32 7, i32 3, i32 10, i32 14, i32 6, i32 2, i32 9, i32 13,
i32 5, i32 1, i32 8, i32 12, i32 4, i32 0>
The select will likely get merged into the preceding instruction, and
then we only have to do one vrgather.vv:
vadd.vv v20, v16, v12
vsub.vv v20, v16, v12, v0.t
vrgatherei16.vv v24, v20, v10
This patch bails if either of the sources are a broadcast/splat/identity
shuffle, since that will usually already have some sort of cheaper
lowering.
This improves performance on 525.x264_r by 4.12% with -O3 -flto
-march=rva22u64_v on the spacemit-x60.
To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications
More information about the All-commits
mailing list