[PATCH] D60047: [CaptureTracking] Don't let comparisons against null escape inbounds pointers

Sanjoy Das via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 7 16:46:29 PDT 2019


sanjoy added a comment.

In D60047#1494328 <https://reviews.llvm.org/D60047#1494328>, @jdoerfert wrote:

> To get the escape case you have to have guessed the correct offset from the initial pointer value to `null`.
>
> `foo(p_with_offset);` in `void bar_0()` would be UB if you would have chosen any offset not in {0,4} (both bytes). The first because the dereferenceability property is fulfilled, the second because it happens to result in a `null` pointer.


Agreed.

> You basically picked `offset = 0 - ptr` and then you can make it escape through the check. But if you can pick `offset` that way, you do not need the check since you have to know `ptr`.

Maybe we're disagreeing on what it means to "guess" a value.  I don't think transforming:

  void foo(int x) {
    print(x);
  }

to

  void foo(int x) {
    if (x == 42) {
      print(42);
    } else {
      print(x);
    }
  }

counts as "guessing" `x`.  You could call it value profiling plus speculative optimization but that's a standard optimization technique.

> I think that the "one past the end" pointer is `null` in this example is confusing and coincidental, e.g., shift the allocation 4 bytes down to make the valid offsets {0, 8} with everything else the same. The invariant in `foo` (first comment) only holds for the particular value of `ptr` in `bar_0` and can therefore not be exploited.

Maybe another way of thinking about this can be more productive.  If we focus on `bar_0` only:

  int** global_ptr;
  
  void foo(int32* /*deref_or_null(4)*/ ptr) {
    if (ptr == null) {
      int32* ptr_leaked = (int32*)(intptr_t)-4;
      *global_ptr = ptr_leaked;
    }
  }
  
  void bar_0() {
    int32* p = new int32;
    // p happens to numerically be -4 == 2^64-4
    int32* p_with_offset = p + 1; // non-inbounds GEP, evaluates to null
    foo(p_with_offset);
  
    // XXX
  
    print(p);
  }

then under the assumption that at runtime the program prints `2^64-4`.

1. Is the program well defined?
2. Has `p` escaped by the time execution reaches point `XXX`?

If (1) and (2) are true, given this change, will we correctly infer that these facts?
If at least one of (1) and (2) are false, then which one is false and why?
Or is the question ill posed in some way?

One mode of reasoning that could invalidate this example is that `p` *could have been* something other than `2^64-4` which would invoke UB.  This UB would both let us pretend that `foo` does not escape `p`, and also print `2^64-4` even though `p` was not `2^64-4` in the execution we chose.  This seems a bit subtle though, and I'm not entirely sure this is sound reasoning.

> The `bar_1` example just shows that picking an offset which will fulfill the dereferenceable property will not leak information (it cannot be null). Note that `p_with_offset` in `bar_0` does not fulfill the dereferenceability property!

Yes, but it does fulfill the "null" property.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60047/new/

https://reviews.llvm.org/D60047





More information about the llvm-commits mailing list