[PATCH] D60047: [CaptureTracking] Don't let comparisons against null escape inbounds pointers

Johannes Doerfert via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 7 18:16:00 PDT 2019


jdoerfert added a comment.

In D60047#1494356 <https://reviews.llvm.org/D60047#1494356>, @sanjoy wrote:

> In D60047#1494328 <https://reviews.llvm.org/D60047#1494328>, @jdoerfert wrote:
>
> > To get the escape case you have to have guessed the correct offset from the initial pointer value to `null`.
> >
> > `foo(p_with_offset);` in `void bar_0()` would be UB if you would have chosen any offset not in {0,4} (both bytes). The first because the dereferenceability property is fulfilled, the second because it happens to result in a `null` pointer.
>
>
> Agreed.
>
> > You basically picked `offset = 0 - ptr` and then you can make it escape through the check. But if you can pick `offset` that way, you do not need the check since you have to know `ptr`.
>
> Maybe we're disagreeing on what it means to "guess" a value.  I don't think transforming:
>
>   void foo(int x) {
>     print(x);
>   }
>
>
> to
>
>   void foo(int x) {
>     if (x == 42) {
>       print(42);
>     } else {
>       print(x);
>     }
>   }
>
>
> counts as "guessing" `x`.  You could call it value profiling plus speculative optimization but that's a standard optimization technique.


This is for sure OK, no questions asked. You can even compare it against any other value (constant or not) and specialize afterwards.
For our discussion, only the comparison against `null` is interesting though.

>> I think that the "one past the end" pointer is `null` in this example is confusing and coincidental, e.g., shift the allocation 4 bytes down to make the valid offsets {0, 8} with everything else the same. The invariant in `foo` (first comment) only holds for the particular value of `ptr` in `bar_0` and can therefore not be exploited.
> 
> Maybe another way of thinking about this can be more productive.  If we focus on `bar_0` only:
> 
>   int** global_ptr;
>   
>   void foo(int32* /*deref_or_null(4)*/ ptr) {
>     if (ptr == null) {
>       int32* ptr_leaked = (int32*)(intptr_t)-4;
>       *global_ptr = ptr_leaked;
>     }
>   }
>   
>   void bar_0() {
>     int32* p = new int32;
>     // p happens to numerically be -4 == 2^64-4
>     int32* p_with_offset = p + 1; // non-inbounds GEP, evaluates to null
>     foo(p_with_offset);
>   
>     // XXX
>   
>     print(p);
>   }
> 
> 
> then under the assumption that at runtime the program prints `2^64-4`.
> 
> 1. Is the program well defined?
> 2. Has `p` escaped by the time execution reaches point `XXX`?

Under the assumption, I'd say yes and yes.

> If (1) and (2) are true, given this change, will we correctly infer that these facts?
>  If at least one of (1) and (2) are false, then which one is false and why?
>  Or is the question ill posed in some way?

The problem is the assumption. It encodes the offset from `null` (here 4) that makes the program well defined. (The same offset as hard coded in the program.)
If we start without the assumption, the answers change to:

1. No, except if `p` happens to be ` 2^64-4`.
2. No, because we could simply use `2^64-4` as a "copy" of `p` at `XXX` with exactly the same chance of having a conforming program (run). Thus running it and letting the "escape" happen through the `null` comparison is no different from "guessing" the correct offset from `p` to `null`. (I think I do a really bad job at explaining what I mean...)



> One mode of reasoning that could invalidate this example is that `p` *could have been* something other than `2^64-4` which would invoke UB.  This UB would both let us pretend that `foo` does not escape `p`, and also print `2^64-4` even though `p` was not `2^64-4` in the execution we chose.  This seems a bit subtle though, and I'm not entirely sure this is sound reasoning.

I don't think so, `p` has a value depending on which UB is caused or not.

>> The `bar_1` example just shows that picking an offset which will fulfill the dereferenceable property will not leak information (it cannot be null). Note that `p_with_offset` in `bar_0` does not fulfill the dereferenceability property!
> 
> Yes, but it does fulfill the "null" property.

Agreed, but again, only because the hardcoded offset (4bytes) matches the runtime offset from `0`.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60047/new/

https://reviews.llvm.org/D60047





More information about the llvm-commits mailing list