[llvm] r369789 - [InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null

Philip Reames via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 26 21:53:31 PDT 2019


Is there any chance <valid_pointer> is a inbounds geps? Without that, I 
can't see the connection to the patch.

Philip

On 8/26/19 5:05 PM, Jordan Rupprecht wrote:
> btw, we're seeing something that looks like miscompiles with this 
> patch (& after the fix in r369795). The code looks pretty well defined 
> (passes asan/msan/etc.).
>
> I'm trying to reduce it into something that actually triggers the bug, 
> but it vaguely looks like:
> struct Foo {
>  bool IsValid() const { return x; }
>  Foo(char *y): x(y) {}
>  char *x;
> }
>
> Foo get() {
>   return some_condition ? Foo(<valid pointer>) : Foo(nullptr);
> }
>
> void doSomething() {
>   Foo foo = get();
>   if (foo.IsValid()) { .. }
> }
>
> What I'm observing is get() go down the branch of some_condition being 
> true and returning Foo(<valid pointer>), but foo.IsValid() is 
> returning false.
>
> Given that context, is there anything you might see is a subtle bug 
> with this patch?
>
> On Fri, Aug 23, 2019 at 10:57 AM Philip Reames via llvm-commits 
> <llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>> wrote:
>
>     Author: reames
>     Date: Fri Aug 23 10:58:58 2019
>     New Revision: 369789
>
>     URL: http://llvm.org/viewvc/llvm-project?rev=369789&view=rev
>     Log:
>     [InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp
>     eq/ne P, null
>
>     This generalizes the isGEPKnownNonNull rule from ValueTracking to
>     apply when we do not know if the base is non-null, and thus need
>     to replace one condition with another.
>
>     The core notion is that since an inbounds GEP can only form null
>     if the base pointer is null and the offset is zero. However, if
>     the offset is non-zero, the the "inbounds" marker makes the result
>     poison. Thus, we're free to ignore the case where the offset is
>     non-zero. Similarly, there's no case under which a non-null base
>     can result in a null result without generating poison.
>
>     Differential Revision: https://reviews.llvm.org/D66608
>
>
>     Added:
>     llvm/trunk/test/Transforms/InstCombine/gep-inbounds-null.ll
>     Modified:
>     llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp
>
>     Modified:
>     llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp
>     URL:
>     http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp?rev=369789&r1=369788&r2=369789&view=diff
>     ==============================================================================
>     --- llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp
>     (original)
>     +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp
>     Fri Aug 23 10:58:58 2019
>     @@ -894,6 +894,27 @@ Instruction *InstCombiner::foldGEPICmp(G
>            Offset = EmitGEPOffset(GEPLHS);
>          return new ICmpInst(ICmpInst::getSignedPredicate(Cond), Offset,
>      Constant::getNullValue(Offset->getType()));
>     +  } else if (GEPLHS->isInBounds() && ICmpInst::isEquality(Cond) &&
>     +             GEPLHS->getType()->isPointerTy() && // TODO: extend
>     to vector geps
>     +             isa<Constant>(RHS) &&
>     cast<Constant>(RHS)->isNullValue() &&
>     +             !NullPointerIsDefined(I.getFunction(),
>     +  RHS->getType()->getPointerAddressSpace())) {
>     +    // For most address spaces, an allocation can't be placed at
>     null, but null
>     +    // itself is treated as a 0 size allocation in the in bounds
>     rules.  Thus,
>     +    // the only valid inbounds address derived from null, is null
>     itself.
>     +    // Thus, we have four cases to consider:
>     +    // 1) Base == nullptr, Offset == 0 -> inbounds, null
>     +    // 2) Base == nullptr, Offset != 0 -> poison as the result is
>     out of bounds
>     +    // 3) Base != nullptr, Offset == (-base) -> poison (crossing
>     allocations)
>     +    // 4) Base != nullptr, Offset != (-base) -> nonnull (and
>     possibly poison)
>     +    //
>     +    // (Note if we're indexing a type of size 0, that simply
>     collapses into one
>     +    //  of the buckets above.)
>     +    //
>     +    // In general, we're allowed to make values less poison (i.e.
>     remove
>     +    //   sources of full UB), so in this case, we just select
>     between the two
>     +    //   non-poison cases (1 and 4 above).
>     +    return new ICmpInst(Cond, GEPLHS->getPointerOperand(), RHS);
>        } else if (GEPOperator *GEPRHS = dyn_cast<GEPOperator>(RHS)) {
>          // If the base pointers are different, but the indices are
>     the same, just
>          // compare the base pointer.
>
>     Added: llvm/trunk/test/Transforms/InstCombine/gep-inbounds-null.ll
>     URL:
>     http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/gep-inbounds-null.ll?rev=369789&view=auto
>     ==============================================================================
>     --- llvm/trunk/test/Transforms/InstCombine/gep-inbounds-null.ll
>     (added)
>     +++ llvm/trunk/test/Transforms/InstCombine/gep-inbounds-null.ll
>     Fri Aug 23 10:58:58 2019
>     @@ -0,0 +1,184 @@
>     +; NOTE: Assertions have been autogenerated by
>     utils/update_test_checks.py
>     +; RUN: opt -S < %s -instcombine | FileCheck %s
>     +
>     +;; Start by showing the results of constant folding (which
>     doesn't use
>     +;; the poison implied by gep for the nonnull cases).
>     +
>     +define i1 @test_ne_constants_null() {
>     +; CHECK-LABEL: @test_ne_constants_null(
>     +; CHECK-NEXT:  entry:
>     +; CHECK-NEXT:    ret i1 false
>     +;
>     +entry:
>     +  %gep = getelementptr inbounds i8, i8* null, i64 0
>     +  %cnd = icmp ne i8* %gep, null
>     +  ret i1 %cnd
>     +}
>     +
>     +define i1 @test_ne_constants_nonnull() {
>     +; CHECK-LABEL: @test_ne_constants_nonnull(
>     +; CHECK-NEXT:  entry:
>     +; CHECK-NEXT:    ret i1 true
>     +;
>     +entry:
>     +  %gep = getelementptr inbounds i8, i8* null, i64 1
>     +  %cnd = icmp ne i8* %gep, null
>     +  ret i1 %cnd
>     +}
>     +
>     +define i1 @test_eq_constants_null() {
>     +; CHECK-LABEL: @test_eq_constants_null(
>     +; CHECK-NEXT:  entry:
>     +; CHECK-NEXT:    ret i1 true
>     +;
>     +entry:
>     +  %gep = getelementptr inbounds i8, i8* null, i64 0
>     +  %cnd = icmp eq i8* %gep, null
>     +  ret i1 %cnd
>     +}
>     +
>     +define i1 @test_eq_constants_nonnull() {
>     +; CHECK-LABEL: @test_eq_constants_nonnull(
>     +; CHECK-NEXT:  entry:
>     +; CHECK-NEXT:    ret i1 false
>     +;
>     +entry:
>     +  %gep = getelementptr inbounds i8, i8* null, i64 1
>     +  %cnd = icmp eq i8* %gep, null
>     +  ret i1 %cnd
>     +}
>     +
>     +;; Then show the results for non-constants.  These use the
>     inbounds provided
>     +;; UB fact to ignore the possible overflow cases.
>     +
>     +define i1 @test_ne(i8* %base, i64 %idx) {
>     +; CHECK-LABEL: @test_ne(
>     +; CHECK-NEXT:  entry:
>     +; CHECK-NEXT:    [[CND:%.*]] = icmp ne i8* [[BASE:%.*]], null
>     +; CHECK-NEXT:    ret i1 [[CND]]
>     +;
>     +entry:
>     +  %gep = getelementptr inbounds i8, i8* %base, i64 %idx
>     +  %cnd = icmp ne i8* %gep, null
>     +  ret i1 %cnd
>     +}
>     +
>     +define i1 @test_eq(i8* %base, i64 %idx) {
>     +; CHECK-LABEL: @test_eq(
>     +; CHECK-NEXT:  entry:
>     +; CHECK-NEXT:    [[CND:%.*]] = icmp eq i8* [[BASE:%.*]], null
>     +; CHECK-NEXT:    ret i1 [[CND]]
>     +;
>     +entry:
>     +  %gep = getelementptr inbounds i8, i8* %base, i64 %idx
>     +  %cnd = icmp eq i8* %gep, null
>     +  ret i1 %cnd
>     +}
>     +
>     +;; TODO: vectors not yet handled
>     +define <2 x i1> @test_vector_base(<2 x i8*> %base, i64 %idx) {
>     +; CHECK-LABEL: @test_vector_base(
>     +; CHECK-NEXT:  entry:
>     +; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds i8, <2 x
>     i8*> [[BASE:%.*]], i64 [[IDX:%.*]]
>     +; CHECK-NEXT:    [[CND:%.*]] = icmp eq <2 x i8*> [[GEP]],
>     zeroinitializer
>     +; CHECK-NEXT:    ret <2 x i1> [[CND]]
>     +;
>     +entry:
>     +  %gep = getelementptr inbounds i8, <2 x i8*> %base, i64 %idx
>     +  %cnd = icmp eq <2 x i8*> %gep, zeroinitializer
>     +  ret <2 x i1> %cnd
>     +}
>     +
>     +define <2 x i1> @test_vector_index(i8* %base, <2 x i64> %idx) {
>     +; CHECK-LABEL: @test_vector_index(
>     +; CHECK-NEXT:  entry:
>     +; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds i8, i8*
>     [[BASE:%.*]], <2 x i64> [[IDX:%.*]]
>     +; CHECK-NEXT:    [[CND:%.*]] = icmp eq <2 x i8*> [[GEP]],
>     zeroinitializer
>     +; CHECK-NEXT:    ret <2 x i1> [[CND]]
>     +;
>     +entry:
>     +  %gep = getelementptr inbounds i8, i8* %base, <2 x i64> %idx
>     +  %cnd = icmp eq <2 x i8*> %gep, zeroinitializer
>     +  ret <2 x i1> %cnd
>     +}
>     +
>     +;; These two show instsimplify's reasoning getting to the
>     non-zero offsets
>     +;; before instcombine does.
>     +
>     +define i1 @test_eq_pos_idx(i8* %base) {
>     +; CHECK-LABEL: @test_eq_pos_idx(
>     +; CHECK-NEXT:  entry:
>     +; CHECK-NEXT:    ret i1 false
>     +;
>     +entry:
>     +  %gep = getelementptr inbounds i8, i8* %base, i64 1
>     +  %cnd = icmp eq i8* %gep, null
>     +  ret i1 %cnd
>     +}
>     +
>     +define i1 @test_eq_neg_idx(i8* %base) {
>     +; CHECK-LABEL: @test_eq_neg_idx(
>     +; CHECK-NEXT:  entry:
>     +; CHECK-NEXT:    ret i1 false
>     +;
>     +entry:
>     +  %gep = getelementptr inbounds i8, i8* %base, i64 -1
>     +  %cnd = icmp eq i8* %gep, null
>     +  ret i1 %cnd
>     +}
>     +
>     +;; Show an example with a zero sized type since that's
>     +;; a cornercase which keeps getting mentioned.  The GEP
>     +;; produces %base regardless of the value of the index
>     +;; expression.
>     +define i1 @test_size0({}* %base, i64 %idx) {
>     +; CHECK-LABEL: @test_size0(
>     +; CHECK-NEXT:  entry:
>     +; CHECK-NEXT:    [[CND:%.*]] = icmp ne {}* [[BASE:%.*]], null
>     +; CHECK-NEXT:    ret i1 [[CND]]
>     +;
>     +entry:
>     +  %gep = getelementptr inbounds {}, {}* %base, i64 %idx
>     +  %cnd = icmp ne {}* %gep, null
>     +  ret i1 %cnd
>     +}
>     +define i1 @test_size0_nonzero_offset({}* %base) {
>     +; CHECK-LABEL: @test_size0_nonzero_offset(
>     +; CHECK-NEXT:  entry:
>     +; CHECK-NEXT:    [[CND:%.*]] = icmp ne {}* [[BASE:%.*]], null
>     +; CHECK-NEXT:    ret i1 [[CND]]
>     +;
>     +entry:
>     +  %gep = getelementptr inbounds {}, {}* %base, i64 15
>     +  %cnd = icmp ne {}* %gep, null
>     +  ret i1 %cnd
>     +}
>     +
>     +
>     +;; Finally, some negative tests for sanity checking.
>     +
>     +define i1 @neq_noinbounds(i8* %base, i64 %idx) {
>     +; CHECK-LABEL: @neq_noinbounds(
>     +; CHECK-NEXT:  entry:
>     +; CHECK-NEXT:    [[GEP:%.*]] = getelementptr i8, i8*
>     [[BASE:%.*]], i64 [[IDX:%.*]]
>     +; CHECK-NEXT:    [[CND:%.*]] = icmp ne i8* [[GEP]], null
>     +; CHECK-NEXT:    ret i1 [[CND]]
>     +;
>     +entry:
>     +  %gep = getelementptr i8, i8* %base, i64 %idx
>     +  %cnd = icmp ne i8* %gep, null
>     +  ret i1 %cnd
>     +}
>     +
>     +define i1 @neg_objectatnull(i8 addrspace(2)* %base, i64 %idx) {
>     +; CHECK-LABEL: @neg_objectatnull(
>     +; CHECK-NEXT:  entry:
>     +; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds i8, i8
>     addrspace(2)* [[BASE:%.*]], i64 [[IDX:%.*]]
>     +; CHECK-NEXT:    [[CND:%.*]] = icmp eq i8 addrspace(2)* [[GEP]], null
>     +; CHECK-NEXT:    ret i1 [[CND]]
>     +;
>     +entry:
>     +  %gep = getelementptr inbounds i8, i8 addrspace(2)* %base, i64 %idx
>     +  %cnd = icmp eq i8 addrspace(2)* %gep, null
>     +  ret i1 %cnd
>     +}
>
>
>     _______________________________________________
>     llvm-commits mailing list
>     llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>
>     https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190826/e80abfb6/attachment.html>


More information about the llvm-commits mailing list