[PATCH] D40645: [SCEV][NFC] Check NoWrap flags before lexicographical comparison of SCEVs
Maxim Kazantsev via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 16 23:35:36 PDT 2018
UPD: I've looked into the code and we seem to do it only for AddRecs. I don't mind if you revert the patch with checking in your test.
From: Maxim Kazantsev
Sent: Friday, August 17, 2018 1:32 PM
To: 'rtereshin at apple.com' <rtereshin at apple.com>
Cc: sanjoy at playingwithpointers.com; listmail at philipreames.com; javed.absar at arm.com; llvm-commits <llvm-commits at lists.llvm.org>; junbuml at codeaurora.org; wmi at google.com; marcello.maggioni at gmail.com; Justin Bogner <jbogner at apple.com>
Subject: RE: [PATCH] D40645: [SCEV][NFC] Check NoWrap flags before lexicographical comparison of SCEVs
Hi Roman,
Thanks for finding this issue. I think you are right, and the fact that we may end up assigining NoWrap flags after the SCEV was created is extremely annoying by many means. I think that the underlying problems against which it was made could've been fixed differently. We've put a lot of efforts to avoid creation of huge SCEVs.
As far as I'm aware, we only set NoWraps flag lately for AddRecs and never for Adds/Muls. If it is so, it still makes sense to do this to save some compile time for adds/muls. Please feel free to either prohibit this for AddRecs or just revert this patch with adding a TODO that it still is a valid thing for add/muls and checking in your example.
Thanks,
Max
From: rtereshin at apple.com<mailto:rtereshin at apple.com> [mailto:rtereshin at apple.com]
Sent: Friday, August 17, 2018 1:22 PM
To: Maxim Kazantsev <max.kazantsev at azul.com<mailto:max.kazantsev at azul.com>>
Cc: sanjoy at playingwithpointers.com<mailto:sanjoy at playingwithpointers.com>; listmail at philipreames.com<mailto:listmail at philipreames.com>; javed.absar at arm.com<mailto:javed.absar at arm.com>; llvm-commits <llvm-commits at lists.llvm.org<mailto:llvm-commits at lists.llvm.org>>; junbuml at codeaurora.org<mailto:junbuml at codeaurora.org>; wmi at google.com<mailto:wmi at google.com>; marcello.maggioni at gmail.com<mailto:marcello.maggioni at gmail.com>; Justin Bogner <jbogner at apple.com<mailto:jbogner at apple.com>>
Subject: Re: [PATCH] D40645: [SCEV][NFC] Check NoWrap flags before lexicographical comparison of SCEVs
Hi all,
I just hit a real-world scenario where the problem I've described in the previous comment actually occurs and causes 2 identical SCEV expressions to have different order of operands and therefore compare not equal, which in its turn prevents LoadStoreVectorizer from vectorizing a pair of consecutive loads.
With the patch reverted the issue goes away.
Please run
$ opt -load-store-vectorizer reproducer.ll -S -o -
to see the difference in vectorization.
If the following patch is also applied:
diff --git a/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp b/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
index e2ee87b2086..74be4d4c87b 100644
--- a/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
+++ b/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
@@ -445,6 +445,12 @@ bool Vectorizer::lookThroughComplexAddresses(Value *PtrA, Value *PtrB,
const SCEV *OffsetSCEVB = SE.getSCEV(OpB);
const SCEV *C = SE.getConstant(IdxDiff.trunc(BitWidth));
const SCEV *X = SE.getAddExpr(OffsetSCEVA, C);
+ if (OpA->getName() == "preheader.address0.idx" &&
+ OpB->getName() == "common.address.idx") {
+ OffsetSCEVA->dump();
+ X->dump();
+ OffsetSCEVB->dump();
+ }
return X == OffsetSCEVB;
}
you will see the following output as well:
(((56 + (zext i2 (trunc i32 %tmp2 to i2) to i32) + (4 * (zext i1 (trunc i32 (%tmp2 /u 4) to i1) to i32))<nuw><nsw>) * %tmp) + (4 * (zext i1 (trunc i32 (%tmp2 /u 2) to i1) to i32))<nuw><nsw>)
(1 + ((56 + (zext i2 (trunc i32 %tmp2 to i2) to i32) + (4 * (zext i1 (trunc i32 (%tmp2 /u 4) to i1) to i32))<nuw><nsw>) * %tmp) + (4 * (zext i1 (trunc i32 (%tmp2 /u 2) to i1) to i32))<nuw><nsw>)
(1 + (4 * (zext i1 (trunc i32 (%tmp2 /u 2) to i1) to i32))<nuw><nsw> + ((56 + (zext i2 (trunc i32 %tmp2 to i2) to i32) + (4 * (zext i1 (trunc i32 (%tmp2 /u 4) to i1) to i32))<nuw><nsw>) * %tmp))
As you can see, second and third SCEVAddExpr's are the same but sorted differently.
On a larger example (the source of the reproducer.ll attached, which is a bugpoint) I have seen even weirder behavior: adding a constant to an existing SCEV changes the order of the existing terms,
for instance, getAddExpr(1, ((A * B) + (C * D))) returns (1 + (C * D) + (A * B)).
Thanks,
Roman
target triple = "x86_64--"
@global_value0 = external constant i32
@global_value1 = external constant i32
@other_value = external global float
@a = external global float
@b = external global float
@c = external global float
@d = external global float
@plus1 = external global i32
@cnd = external global i8
; Function Attrs: nounwind
define void @main() local_unnamed_addr #0 {
entry:
%tmp = load i32, i32* @global_value0, !range !0
%tmp2 = load i32, i32* @global_value1
%and.i.i = and i32 %tmp2, 2
%add.nuw.nsw.i.i = add nuw nsw i32 %and.i.i, 0
%mul.i.i = shl nuw nsw i32 %add.nuw.nsw.i.i, 1
%and6.i.i = and i32 %tmp2, 3
%and9.i.i = and i32 %tmp2, 4
%add.nuw.nsw10.i.i = add nuw nsw i32 %and6.i.i, %and9.i.i
%conv3.i42.i = add nuw nsw i32 %mul.i.i, 1
%reass.add346.7 = add nuw nsw i32 %add.nuw.nsw10.i.i, 56
%reass.mul347.7 = mul nuw nsw i32 %tmp, %reass.add346.7
%add7.i.7 = add nuw nsw i32 %reass.mul347.7, 0
%preheader.address0.idx = add nuw nsw i32 %add7.i.7, %mul.i.i
%preheader.address0.idx.zext = zext i32 %preheader.address0.idx to i64
%preheader.load0.address = getelementptr inbounds float, float* @other_value, i64 %preheader.address0.idx.zext
%preheader.load0. = load float, float* %preheader.load0.address, align 4, !tbaa !1
%common.address.idx = add nuw nsw i32 %add7.i.7, %conv3.i42.i
%preheader.header.common.address.idx.zext = zext i32 %common.address.idx to i64
%preheader.load1.address = getelementptr inbounds float, float* @other_value, i64 %preheader.header.common.address.idx.zext
%preheader.load1. = load float, float* %preheader.load1.address, align 4, !tbaa !1
br label %for.body23
for.body23: ; preds = %for.body23, %entry
%loop.header.load0.address = getelementptr inbounds float, float* @other_value, i64 %preheader.header.common.address.idx.zext
%loop.header.load0. = load float, float* %loop.header.load0.address, align 4, !tbaa !1
%reass.mul343.7 = mul nuw nsw i32 %reass.add346.7, 72
%add7.i286.7.7 = add nuw nsw i32 %reass.mul343.7, 56
%add9.i288.7.7 = add nuw nsw i32 %add7.i286.7.7, %mul.i.i
%loop.header.address1.idx = add nuw nsw i32 %add9.i288.7.7, 1
%loop.header.address1.idx.zext = zext i32 %loop.header.address1.idx to i64
%loop.header.load1.address = getelementptr inbounds float, float* @other_value, i64 %loop.header.address1.idx.zext
%loop.header.load1. = load float, float* %loop.header.load1.address, align 4, !tbaa !1
store float %preheader.load0., float* @a, align 4, !tbaa !1
store float %preheader.load1., float* @b, align 4, !tbaa !1
store float %loop.header.load0., float* @c, align 4, !tbaa !1
store float %loop.header.load1., float* @d, align 4, !tbaa !1
%loaded.cnd = load i8, i8* @cnd
%condition = trunc i8 %loaded.cnd to i1
br i1 %condition, label %for.body23, label %exit
exit:
ret void
}
attributes #0 = { nounwind }
!0 = !{i32 0, i32 65536}
!1 = !{!2, !2, i64 0}
!2 = !{!"float", !3, i64 0}
!3 = !{!"omnipotent char", !4, i64 0}
!4 = !{!"Simple C++ TBAA"}
> On Aug 10, 2018, at 5:13 PM, Roman Tereshin via Phabricator <reviews at reviews.llvm.org<mailto:reviews at reviews.llvm.org>> wrote:
>
> rtereshin added inline comments.
> Herald added a subscriber: javed.absar.
>
>
> ================
> Comment at: llvm/trunk/lib/Analysis/ScalarEvolution.cpp:699
> + if (LA->getNoWrapFlags() != RA->getNoWrapFlags())
> + return (int)LA->getNoWrapFlags() - (int)RA->getNoWrapFlags();
> +
> ----------------
> Hi everyone,
>
> Unfortunately, wrapping flags are not a part of SCEV's identity (they do not participate in computing a hash value or in equality comparisons) and in fact they could be assigned after the fact w/o rebuilding a SCEV.
>
> At least, last time I checked it was that way. Grep for const_cast's to see quite a few of example, AFAIR all for AddRec's.
>
> So, if 2 expressions get built in 2 slightly different ways: one with flags set in the beginning, the other with the flags attached later on, we may end up with 2 expressions which are exactly the same but have their operands swapped in one of the commutative N-ary expressions, and at least one of them will have "sorted by complexity" invariant broken.
>
> 2 identical SCEV's won't compare equal by pointer comparison as they are supposed to.
>
> I didn't verify this with an experiment yet, please correct me if I'm wrong.
>
> Thanks,
> Roman
>
>
> Repository:
> rL LLVM
>
> https://reviews.llvm.org/D40645
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180817/885784f3/attachment.html>
More information about the llvm-commits
mailing list