[PATCH] D111800: [VectorCombine] Add option to only run scalarization transforms.
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 14 13:10:15 PDT 2021
spatel added a comment.
In D111800#3064694 <https://reviews.llvm.org/D111800#3064694>, @fhahn wrote:
> In D111800#3064573 <https://reviews.llvm.org/D111800#3064573>, @spatel wrote:
>
>> What if we just did better in VectorCombine?
>>
>> We'd need to chain a bunch of combines together on paper or just implement this:
>> https://llvm.org/PR52178
>> ...to know if it gets us to the minimal set of shuffles in IR and/or codegen for the 'hadd' example, but it might be enough?
>
> Yes, I think that would get us a bit further, especially on the ARM64 test case. For X86 the shuffles/add chains are a bit more difficult to tackle: it converts the 4 scalar adds to 4 vector adds which each process a single lane. I'm not sure if we will be able to cover this in VectorCombine.
I drafted a patch for PR52178, and I see that we get the first pair folded, but we're stuck on the next pair:
  define <4 x float> @reverse_hadd_v4f32(<4 x float> %a, <4 x float> %b) local_unnamed_addr #0 {
    %shift = shufflevector <4 x float> %a, <4 x float> poison, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
    %shift1 = shufflevector <4 x float> %a, <4 x float> poison, <4 x i32> <i32 undef, i32 undef, i32 3, i32 undef>
    %1 = shufflevector <4 x float> %shift, <4 x float> %shift1, <4 x i32> <i32 undef, i32 undef, i32 6, i32 0>
    %2 = shufflevector <4 x float> %a, <4 x float> poison, <4 x i32> <i32 undef, i32 undef, i32 2, i32 0>
    %3 = fadd <4 x float> %1, %2
    %shift2 = shufflevector <4 x float> %b, <4 x float> poison, <4 x i32> <i32 undef, i32 0, i32 undef, i32 undef>
    %4 = fadd <4 x float> %shift2, %b
    %5 = shufflevector <4 x float> %3, <4 x float> %4, <4 x i32> <i32 undef, i32 5, i32 2, i32 3>
    %shift3 = shufflevector <4 x float> %b, <4 x float> poison, <4 x i32> <i32 undef, i32 undef, i32 3, i32 undef>
    %6 = fadd <4 x float> %shift3, %b
    %7 = shufflevector <4 x float> %5, <4 x float> %6, <4 x i32> <i32 6, i32 1, i32 2, i32 3>
    ret <4 x float> %7
  }
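For anyone who wants to trace the lanes by hand, here is a rough Python model of the IR above. It is only a sketch and not LLVM code: the helper names are made up for illustration, and undef/poison lanes are modeled as None, which is a simplification of the real semantics. Under those assumptions, it shows that the two 'a' sums come from a single fadd (%3), while the two 'b' sums still need separate fadds (%4 and %6).

```python
# Lane-by-lane model of the IR above. Assumption: None stands in for
# undef/poison lanes; this is a simplification, not LLVM's exact semantics.

def shufflevector(v1, v2, mask):
    """Pick lanes from the concatenation v1 + v2; a None mask entry yields an undefined lane."""
    both = list(v1) + list(v2)
    return [None if m is None else both[m] for m in mask]

def fadd(x, y):
    """Lane-wise add; a lane is undefined if either input lane is undefined."""
    return [None if p is None or q is None else p + q for p, q in zip(x, y)]

def reverse_hadd_v4f32(a, b):
    U = None
    poison = [U] * 4
    shift = shufflevector(a, poison, [1, U, U, U])
    shift1 = shufflevector(a, poison, [U, U, 3, U])
    v1 = shufflevector(shift, shift1, [U, U, 6, 0])
    v2 = shufflevector(a, poison, [U, U, 2, 0])
    v3 = fadd(v1, v2)  # both 'a' sums in one fadd: lanes 2 and 3
    shift2 = shufflevector(b, poison, [U, 0, U, U])
    v4 = fadd(shift2, b)  # b0 + b1, only lane 1 is defined
    v5 = shufflevector(v3, v4, [U, 5, 2, 3])
    shift3 = shufflevector(b, poison, [U, U, 3, U])
    v6 = fadd(shift3, b)  # b3 + b2, only lane 2 is defined
    return shufflevector(v5, v6, [6, 1, 2, 3])
```

With inputs a = [a0, a1, a2, a3] and b = [b0, b1, b2, b3], the model returns [b2+b3, b0+b1, a2+a3, a0+a1].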
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D111800/new/
https://reviews.llvm.org/D111800