[PATCH] D105730: [SLP] match logical and/or as reduction candidates
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Jul 10 06:00:22 PDT 2021
spatel added inline comments.
================
Comment at: llvm/test/Transforms/PhaseOrdering/X86/vector-reductions-logical.ll:13
+; CHECK-NEXT: [[TMP2:%.*]] = bitcast <4 x i1> [[TMP1]] to i4
+; CHECK-NEXT: [[TMP3:%.*]] = icmp eq i4 [[TMP2]], -1
+; CHECK-NEXT: br i1 [[TMP3]], label [[COMMON_RET:%.*]], label [[LOR_LHS_FALSE:%.*]]
----------------
RKSimon wrote:
> It doesn't have to be part of this, but should we be trying to fold these patterns to a reduction intrinsic?
>
> ```
> ; CHECK-NEXT: [[TMP0:%.*]] = fcmp olt <4 x float> [[T:%.*]], zeroinitializer
> ; CHECK-NEXT: [[TMP1:%.*]] = freeze <4 x i1> [[TMP0]]
> ; CHECK-NEXT: [[TMP3:%.*]] = call i1 @llvm.vector.reduce.and.v4i1(<4 x i1> [[TMP1]])
> ```
We are forming a reduction intrinsic in SLP, as the SLP-only tests show.
This test runs at -O2, so a subsequent InstCombine run turns the intrinsic into bitcast+icmp via:
https://github.com/llvm/llvm-project/blob/d919bca87556548555af0a7aa1239ea64ba4f3e8/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp#L1966
I still need to check what difference (if any) that makes for codegen.
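For reference, the equivalence behind that fold can be checked with a small standalone model. This is only a sketch, not the InstCombine code: the helper names (`reduceAnd`, `packLanes`, `bitcastAndCompare`, `foldMatchesReduction`) are hypothetical, and it assumes lane i of the `<4 x i1>` lands in bit i of the `i4`. Since the compare is against all-ones, the result does not depend on that bit order.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Lanes = std::array<bool, 4>;

// Models llvm.vector.reduce.and on <4 x i1>: true only if every lane is true.
bool reduceAnd(const Lanes &lanes) {
  bool result = true;
  for (bool lane : lanes)
    result = result && lane;
  return result;
}

// Models `bitcast <4 x i1> to i4` (assumption: lane i becomes bit i).
std::uint8_t packLanes(const Lanes &lanes) {
  std::uint8_t bits = 0;
  for (unsigned i = 0; i < lanes.size(); ++i)
    if (lanes[i])
      bits |= static_cast<std::uint8_t>(1u << i);
  return bits;
}

// Models `icmp eq i4 %bits, -1`: as an i4, -1 is the all-ones pattern 0b1111.
bool bitcastAndCompare(const Lanes &lanes) {
  return packLanes(lanes) == 0xF;
}

// Exhaustively compares both forms over all 16 possible lane combinations.
bool foldMatchesReduction() {
  for (unsigned mask = 0; mask < 16; ++mask) {
    Lanes lanes{};
    for (unsigned i = 0; i < lanes.size(); ++i)
      lanes[i] = ((mask >> i) & 1u) != 0;
    if (reduceAnd(lanes) != bitcastAndCompare(lanes))
      return false;
  }
  return true;
}
```

Calling `foldMatchesReduction()` walks every input, so a `true` result means the two forms agree at the IR level; it says nothing about which form gives better codegen.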
================
Comment at: llvm/test/Transforms/PhaseOrdering/X86/vector-reductions-logical.ll:448
+; CHECK-NEXT: [[CMP20:%.*]] = icmp sgt i32 [[TMP0]], 255
+; CHECK-NEXT: [[OR_COND6:%.*]] = select i1 [[TMP10]], i1 true, i1 [[CMP20]]
+; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP3]], [[TMP2]]
----------------
RKSimon wrote:
> Any idea why we only match one of the reduction chains?
I haven't stepped through it yet. Previous patches made some adjustments to how the reduction ops are sorted, but I doubt those extended to creating multiple reductions and/or re-running the analysis after forming a reduction.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D105730/new/
https://reviews.llvm.org/D105730