[PATCH] D138353: [Passes][VectorCombine] enable early run generally and try load folds
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Nov 19 08:15:39 PST 2022
spatel added inline comments.
================
Comment at: llvm/lib/Passes/PassBuilderPipelines.cpp:618-619
- // The matrix extension can introduce large vector operations early, which can
- // benefit from running vector-combine early on.
- if (EnableMatrix)
-   FPM.addPass(VectorCombinePass(/*TryEarlyFoldsOnly=*/true));
+ // Try vectorization/scalarization transforms that are likely to be reduced by
+ // GVN and InstCombine.
+ FPM.addPass(VectorCombinePass(/*TryEarlyFoldsOnly=*/true));
----------------
lebedev.ri wrote:
> What does "reduced" here mean? "obscured"?
No, it was supposed to mean "enable more folds": the output of these transforms is likely to be simplified further by GVN and InstCombine.
In the motivating example from #17113, we have:
```
%2 = load float, ptr %0, align 16
%3 = insertelement <4 x float> undef, float %2, i64 0
%4 = getelementptr inbounds [4 x float], ptr %0, i64 0, i64 1
%5 = load float, ptr %4, align 4
```
VectorCombine can widen the first load (with legality/profitability constraints):
```
%2 = load <4 x float>, ptr %0, align 16
%3 = shufflevector <4 x float> %2, <4 x float> poison, <4 x i32> <i32 0, i32 undef, i32 undef, i32 undef>
%4 = getelementptr inbounds [4 x float], ptr %0, i64 0, i64 1
%5 = load float, ptr %4, align 4
```
GVN then replaces the redundant second load with bit operations on the widened load:
```
%2 = load <4 x float>, ptr %0, align 16
%3 = shufflevector <4 x float> %2, <4 x float> poison, <4 x i32> <i32 0, i32 undef, i32 undef, i32 undef>
%4 = getelementptr inbounds [4 x float], ptr %0, i64 0, i64 1
%5 = bitcast <4 x float> %2 to i128
%6 = lshr i128 %5, 32
%7 = trunc i128 %6 to i32
%8 = bitcast i32 %7 to float
```
And then InstCombine manages to remove all of those extra instructions.
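To illustrate why that cleanup is possible: on a little-endian target, the bitcast/lshr/trunc/bitcast chain that GVN emits just reads lane 1 of the wide vector, so it should be expressible as a plain element extract. Below is a minimal host-side C++ sketch of that equivalence. It is not LLVM code; the name `lane1ViaShift` is hypothetical, the i128 is modeled by its low 64 bits only, and it assumes a little-endian host with 32-bit `float`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Models the GVN output above:
//   bitcast <4 x float> to i128, lshr 32, trunc to i32, bitcast to float.
// Assumption: little-endian host, so lanes 0 and 1 occupy the low 64 bits
// of the 128-bit value. The upper 64 bits never reach the result.
float lane1ViaShift(const float (&vec)[4]) {
  static_assert(sizeof(float) == 4, "sketch expects a 32-bit float");

  // Runtime check of the little-endian assumption.
  const std::uint16_t probe = 1;
  unsigned char firstByte;
  std::memcpy(&firstByte, &probe, 1);
  assert(firstByte == 1 && "sketch assumes a little-endian host");

  // "bitcast": reinterpret the first two lanes as bits 0..63 of the i128.
  std::uint64_t lowHalf;
  std::memcpy(&lowHalf, vec, sizeof(lowHalf));

  // "lshr 32" followed by "trunc to i32" keeps bits 32..63.
  const std::uint32_t bits = static_cast<std::uint32_t>(lowHalf >> 32);

  // "bitcast i32 to float".
  float result;
  std::memcpy(&result, &bits, sizeof(result));
  return result;
}
```

For any input, `lane1ViaShift(v)` carries the same bits as `v[1]`, which is the property that lets the four-instruction chain collapse to a single lane read.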
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D138353/new/
https://reviews.llvm.org/D138353