[LLVMdev] Question about NoWrap flag for SCEVAddRecExpr
Gerolf Hoflehner
ghoflehner at apple.com
Wed Jun 10 16:29:05 PDT 2015
Hi Sanjoy,
Interesting. I took a closer look at SCEV when I saw Tong’s example. I’m curious about "[1]: It will try to prove nuw and nsw in cases where it is "easy", but not in this specific case.” Could you describe “easy” and “difficult” algorithmically?
Thanks
Gerolf
> On Jun 10, 2015, at 1:29 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>
> [+CC Andy]
>
>> Can anyone familiar with ScalarRevolution tell me whether this is an
>> expected behavior or a bug?
>
> Assuming you're talking about 2*k, this is a bug. ScalarEvolution
> should be able to prove that {0,+,4} is <nsw> and <nuw>.
>
> Part of the problem is that in this case ScalarEvolution does not try
> to prove that {0,+,4} is <nsw> when the expression is constructed
> (since proving that has non-trivial cost) [1]. To get ScalarEvolution
> to try to prove that {0,+,4} has no-wrap properties, the client needs
> to construct a sign-extend expression of {0,+,4}. SCEV will try to
> change a sext of an add-rec to an add-rec of sexts and try to prove
> no-wrap in the process [2].
>
> To easily do this from IR, you can just add a dummy sext instruction
> (like in the IR fragment below) and run
> 'opt -analyze -scalar-evolution -scalar-evolution' (just running SCEV
> won't dce the unused instruction):
>
> ; ModuleID = '<stdin>'
> target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
> target triple = "x86_64-apple-macosx10.10.0"
>
> @x = common global [1024 x float] zeroinitializer, align 16
> @y = common global [1024 x float] zeroinitializer, align 16
>
> ; Function Attrs: nounwind ssp uwtable
> define void @myloop1() {
> bb:
> br label %bb2
>
> bb1: ; preds = %bb2
> ret void
>
> bb2: ; preds = %bb2, %bb
> %k.01 = phi i64 [ 0, %bb ], [ %tmp15, %bb2 ]
> %tmp = shl nsw i64 %k.01, 1
> %dummy_sext = sext i64 %tmp to i128
> %tmp3 = getelementptr inbounds [1024 x float], [1024 x float]* @x,
> i64 0, i64 %tmp
> %tmp4 = load float, float* %tmp3, align 16, !tbaa !2
> %tmp5 = getelementptr inbounds [1024 x float], [1024 x float]* @y,
> i64 0, i64 %k.01
> %tmp6 = load float, float* %tmp5, align 8, !tbaa !2
> %tmp7 = fadd float %tmp4, %tmp6
> store float %tmp7, float* %tmp3, align 16, !tbaa !2
> %tmp8 = or i64 %k.01, 1
> %tmp9 = shl nsw i64 %tmp8, 1
> %tmp10 = getelementptr inbounds [1024 x float], [1024 x float]*
> @x, i64 0, i64 %tmp9
> %tmp11 = load float, float* %tmp10, align 8, !tbaa !2
> %tmp12 = getelementptr inbounds [1024 x float], [1024 x float]*
> @y, i64 0, i64 %tmp8
> %tmp13 = load float, float* %tmp12, align 4, !tbaa !2
> %tmp14 = fadd float %tmp11, %tmp13
> store float %tmp14, float* %tmp10, align 8, !tbaa !2
> %tmp15 = add nsw i64 %k.01, 2
> %exitcond.1 = icmp eq i64 %tmp15, 512
> br i1 %exitcond.1, label %bb1, label %bb2
> }
>
> !0 = !{i32 1, !"PIC Level", i32 2}
> !1 = !{!"clang version 3.7.0 (clang-stage2-configure-Rlto_build 239114)"}
> !2 = !{!3, !3, i64 0}
> !3 = !{!"float", !4, i64 0}
> !4 = !{!"omnipotent char", !5, i64 0}
> !5 = !{!"Simple C/C++ TBAA"}
>
>
> However, adding a dummy sext only proves <nuw> for {0,+,4} and not
> <nsw>. The problem is that when constructing sext({0,+,4}) SCEV
> realizes that since {0,+,4} is always positive, sext({0,+,4}) ==
> zext({0,+,4}); and to change a zext of an add-rec to an add-rec of
> zexts, SCEV only needs to prove <nuw> and not <nsw>.
>
>
> I wonder if it makes sense to add a hook to SCEV that gets it to try
> as hard as it can to prove <nuw> / <nsw> for specific add recurrences.
>
>
>
> [1]: It will try to prove nuw and nsw in cases where it is "easy", but
> not in this specific case.
>
> [2]: So a worthwhile project is to have the vectorizer construct sign
> extend expressions of add recurrences that it really cares about
> proving no-wrap of and then check the flags on the
> SCEVAddRecExpr. It may consume too much compile time, so there's
> a tricky trade-off here.
>
>
> -- Sanjoy
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
More information about the llvm-dev
mailing list