[PATCH] D127119: [SLP]Fix undef handling in gather function.
Alexey Bataev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 7 12:47:17 PDT 2022
ABataev added a comment.
In D127119#3564531 <https://reviews.llvm.org/D127119#3564531>, @hvdijk wrote:
> In D127119#3564522 <https://reviews.llvm.org/D127119#3564522>, @ABataev wrote:
>
>> In D127119#3564461 <https://reviews.llvm.org/D127119#3564461>, @hvdijk wrote:
>>
>>> Comparing the changes to the tests in this diff to those in D127073 <https://reviews.llvm.org/D127073>, I am seeing a number of tests where we have more shufflevectors, and none where we have fewer. Are there improvements that are not as obvious to see?
>>
>> Yes, there are. These extra shuffles caused by changes in performExtractsShuffleAction() and in IsIdenticalOrLessDefined lambda, these changes (they treat UndefMaskElem as possible poison) increase number of shuffles. Without them, there are less shuffles, these extra changes are required for correct handling of UndefMaskElem as posion.
>
> That suggests that D127073 <https://reviews.llvm.org/D127073> still results in incorrect code in some cases. I was under the impression that it was already correct, just not optimal. Can you point to specific tests where you believe D127073 <https://reviews.llvm.org/D127073> results in wrong code?
>
> Taking a random example
>
> --- a/llvm/test/Transforms/SLPVectorizer/X86/insert-shuffle.ll
> +++ b/llvm/test/Transforms/SLPVectorizer/X86/insert-shuffle.ll
> @@ -19,9 +19,11 @@ define { <2 x float>, <2 x float> } @foo(%struct.sw* %v) {
> ; CHECK-NEXT: [[TMP8:%.*]] = fadd <4 x float> [[TMP7]], undef
> ; CHECK-NEXT: [[TMP9:%.*]] = fadd <4 x float> [[TMP8]], undef
> ; CHECK-NEXT: [[TMP10:%.*]] = shufflevector <4 x float> [[TMP9]], <4 x float> poison, <2 x i32> <i32 0, i32 1>
> -; CHECK-NEXT: [[TMP11:%.*]] = shufflevector <4 x float> [[TMP9]], <4 x float> poison, <2 x i32> <i32 2, i32 3>
> -; CHECK-NEXT: [[INS1:%.*]] = insertvalue { <2 x float>, <2 x float> } undef, <2 x float> [[TMP10]], 0
> -; CHECK-NEXT: [[INS2:%.*]] = insertvalue { <2 x float>, <2 x float> } [[INS1]], <2 x float> [[TMP11]], 1
> +; CHECK-NEXT: [[TMP11:%.*]] = shufflevector <2 x float> undef, <2 x float> [[TMP10]], <2 x i32> <i32 2, i32 3>
> +; CHECK-NEXT: [[TMP12:%.*]] = shufflevector <4 x float> [[TMP9]], <4 x float> poison, <2 x i32> <i32 2, i32 3>
> +; CHECK-NEXT: [[TMP13:%.*]] = shufflevector <2 x float> undef, <2 x float> [[TMP12]], <2 x i32> <i32 2, i32 3>
> +; CHECK-NEXT: [[INS1:%.*]] = insertvalue { <2 x float>, <2 x float> } undef, <2 x float> [[TMP11]], 0
> +; CHECK-NEXT: [[INS2:%.*]] = insertvalue { <2 x float>, <2 x float> } [[INS1]], <2 x float> [[TMP13]], 1
> ; CHECK-NEXT: ret { <2 x float>, <2 x float> } [[INS2]]
> ;
> entry:
>
> The extra shufflevectors here, TMP11 and TMP13, are identity shuffles that should not have been generated, they do not change the handling of undef/poison.
Yes, I see some extra shuffles, that can be removed safely. Working on the improvements.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D127119/new/
https://reviews.llvm.org/D127119
More information about the llvm-commits
mailing list