[llvm-dev] lifetime.start/end

Sun Jan 24 08:06:44 PST 2021

Hi all,

On 19.01.21 18:37, James Y Knight wrote:
> I haven't followed the whole discussion here, so I'm not sure if I've understood 
> the proposal correctly, but I'm a bit concerned.
> 
> In the IR generated by Clang, lifetime markers are inserted to denote the scope 
> of the variable's storage. As such, it ought to be entirely valid to coalesce 
> the variables x and y to a single address in the following example, /even 
> though/ the addresses both escape and are compared:
> 
> int *p_x, *p_y;
> int test() {
>    {
>      int x = 5;
>      p_x = &x;
>    }
>    {
>      int y = 5;
>      p_y = &y;
>    }
>    return p_x == p_y;
> }
> 
> Although we don't currently appear to do such coalescing, it would seem 
> unfortunate to define the IR so as to prohibit it.
> 
> Furthermore, it's not clear to me that the comparison at the end is actually 
> required to return a consistent answer. The deallocation of x and y magically 
> transforms all pointers to them into "invalid pointer values", the use of which 
> is implementation defined [https://wg21.link/basic.stc].

This is sometimes called "pointer zapping". C and C++ have "pointer zapping",
but to my knowledge nothing in the LLVM LangRef indicates that LLVM has it
(and the LLVM implementation is consistent with that). Indeed, with my Rust hat
on, I can say that if LLVM were to adopt "pointer zapping", that would be a
*serious* problem: safe Rust code may copy, compare, and cast raw pointers
after the memory they point to has been deallocated, so it would render safe
Rust unsafe, in ways that would be really hard to fix without a breaking
language change.

So if LLVM considers making use of "pointer zapping", I think that has to be 
opt-in from the frontend. And it would be a shame if programs like your example, 
when translated to Rust, could not be compiled in the same way (with x and y 
being in the same stack slot):

fn test() -> bool {
   let p_x;
   let p_y;
   { let x = 5; p_x = &x as *const _; }
   { let y = 5; p_y = &y as *const _; }
   p_x == p_y
}
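
To make the soundness concern concrete, here is a minimal sketch. The helper
`dangling` is made up for illustration and is not taken from any real codebase.
Everything in it is safe Rust: the pointer outlives the local it points to, and
its value is then copied, compared, and cast to an integer, but never
dereferenced. Under "pointer zapping", each of those uses would operate on an
invalid pointer value.

```rust
// Hypothetical helper (illustration only): returns a raw pointer to a local
// together with that pointer's address as an integer. No `unsafe` is needed,
// because the pointer is never dereferenced.
fn dangling() -> (*const i32, usize) {
    let x = 5;
    let p = &x as *const i32;
    (p, p as usize)
} // `x` is deallocated here, so the returned pointer dangles from now on

fn main() {
    let (p, addr) = dangling();
    // Copying, comparing, and casting the dangling pointer is all safe Rust.
    let q = p;
    assert!(p == q);
    assert_eq!(p as usize, addr);
    println!("dangling pointer value: {:p}", p);
}
```

The sketch deliberately only looks at a single allocation; it says nothing
about how pointers to two different dead locals compare.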

Kind regards,
Ralf

> I think it'd be OK for the implementation to say 
> that deallocation turns all such pointers into indeterminate values -- and, 
> thus, to end up with (p_x == p_y) being false, and yet, for a subsequent 
> `printf("%p %p\n", p_x, p_y);` to print the same value twice.
> 
> On Mon, Jan 18, 2021, 5:05 PM Johannes Doerfert via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> 
> 
>     On 1/12/21 4:11 PM, Michael Kruse wrote:
>      > On Tue, Jan 12, 2021 at 12:33, Ralf Jung via llvm-dev
>      > <llvm-dev at lists.llvm.org> wrote:
>      >> I hope this question has not been answered yet, but I don't see how that
>      >> fold could be legal. I asked the same question on Phabricator but it seems
>      >> you prefer to have the discussion here. Taking your example from there and
>      >> adjusting it:
>      >>
>      >> p = malloc(1)
>      >> q = malloc(1)
>      >> %c = icmp eq %p, %q
>      >> free(q)
>      >> free(p)
>      >>
>      >> I think there is a guarantee that c will always be "false". Every
>      >> operational model of allocation that I have ever seen will guarantee this,
>      >> and the same for every program logic that I can imagine. So if the compiler
>      >> may fold this to "false", then as far as I can see, pointer comparison
>      >> becomes entirely unpredictable. The only consistent model I can think of is
>      >> "icmp on pointers may spuriously return 'true' at any time", which I doubt
>      >> anyone wants. ;)
>      > In my understanding of
>      > https://timsong-cpp.github.io/cppwp/n3337/expr.rel#2.2 or
>      > http://eel.is/c++draft/expr.rel#4.3 the result of this is unspecified.
>      > While this does not necessarily extend to LLVM-IR, LLVM-IR usually
>      > assumes C/C++ semantics unless defined otherwise.
> 
>     I also thought the above is fine.
> 
> 
>      > Optimizations such as https://reviews.llvm.org/D53362 and
>      > https://reviews.llvm.org/D65408 assume the equivalence of alloca and
>      > malloc+free. I assume that such optimizations are the reason for this
>      > unspecifiedness, i.e. I think someone might want this. Johannes's
>      > proposal A1 however, seems to forbid StackColoring to exploit lifetime
>      > markers, which it currently does and I am not sure we can afford to do
>      > that.
> 
>     It's not that A1 "forbid[s] StackColoring to exploit lifetime markers", but
>     A1 says that you cannot use lifetime markers to argue about the address. So
>     you can still use them for reasoning but only as far as they tell you the
>     content is undef. If you want to do coalescing, you need to verify more
>     things, especially that the addresses are not (both) observed. If they are,
>     which they can be with lifetime markers in place as well, you end up with
>     inconsistent views.
> 
>     That said, if we want to preserve the property that you cannot access
>     outside of lifetime ranges, you could "fix" StackColoring by simply
>     verifying that one of the two allocas is not escaping. You'd still need to
>     fix InstCombine but the fix is the same. I'm not sure whether we want to
>     declare accesses outside of lifetime ranges UB. I imagine in practice this
>     makes little difference anyway, given that escaping uses are a problem on
>     their own.
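
The condition described in the quoted paragraphs above can be written down as
a toy predicate. This is only a sketch under stated assumptions: `Slot`,
`lifetimes_disjoint`, and `may_coalesce` are hypothetical names, "escapes" is
used as a stand-in for "the address may be observed", and none of this reflects
how StackColoring is actually implemented.

```rust
/// Toy model of a stack slot (hypothetical, for illustration only).
#[derive(Debug, Clone, Copy)]
struct Slot {
    start: u32,    // program point of the lifetime.start marker
    end: u32,      // program point of the lifetime.end marker
    escapes: bool, // assumption: true if the slot's address may be observed
}

/// Two slots have disjoint lifetimes if one ends before the other starts.
fn lifetimes_disjoint(a: Slot, b: Slot) -> bool {
    a.end <= b.start || b.end <= a.start
}

/// Coalescing is allowed only if the lifetimes are disjoint AND at most one
/// of the two addresses escapes, so no one can compare both addresses.
fn may_coalesce(a: Slot, b: Slot) -> bool {
    lifetimes_disjoint(a, b) && !(a.escapes && b.escapes)
}

fn main() {
    let x = Slot { start: 0, end: 2, escapes: true };
    let y = Slot { start: 2, end: 4, escapes: true };
    let z = Slot { start: 2, end: 4, escapes: false };
    println!("x/y: {}", may_coalesce(x, y)); // prints "x/y: false"
    println!("x/z: {}", may_coalesce(x, z)); // prints "x/z: true"
}
```

In this model the x/y pair from the C example earlier in the thread would not
be coalesced, because both addresses escape through p_x and p_y.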
> 
>     ~ Johannes
> 
> 
>      > Michael
> 

-- 
Website: https://people.mpi-sws.org/~jung/