[LLVMdev] readonly and infinite loops

Wed Jul 1 15:15:31 PDT 2015

On 30 June 2015 at 14:30, Nuno Lopes <nunoplopes at sapo.pt> wrote:

> Interesting.  Could you give an example why knowing a function will
>>> halt is
>>> essential for the heap-to-stack conversion?
>>>
>>
>> The key situation you need to establish in order to do heap-to-stack
>> conversion, is that you can see the calls to free (or realloc, etc.) along
>> all control-flow paths. If you can, then you can perform the conversion
>> (because any additional calls to free that you can't observe would lead to
>> a double free, and thus undefined behavior). Thus, if we have this
>> situation:
>>
>> void bar(int *a);
>> void foo() {
>>   int *a = (int*) malloc(sizeof(int)*40);
>>   bar(a);
>>   free(a);
>> }
>>
>> we can perform heap-to-stack conversion iff we know that bar(int *)
>> always returns normally. If it never returns (perhaps by looping
>> indefinitely) then it might capture the pointer, pass it off to some other
>> thread, and that other thread might call free() (or it might just call
>> free() itself before looping indefinitely). In short, we need to know
>> whether the call to free() after the call to bar() is dead. If we know that
>> it is reached, then we can perform heap-to-stack conversion. Also worth
>> noting is that because we unconditionally free(a) after the call to bar(a),
>> it would not be legal for bar(a) to call realloc on a (because if realloc
>> did reallocate the buffer we'd end up freeing it twice when bar(a) did
>> eventually return).
>>
>
> I see, thanks!
> Your argument is that knowing that bar returns implies that 'a' cannot be
> captured or reallocated, otherwise it would be UB.  Makes sense, yes.
>

I'm afraid it's worse than that.

void caller() {
  int *ptr = malloc(sizeof(int));
  callee(ptr);
  free(ptr);
}

void callee(int *ptr) {
  if (...) {
    free(ptr);
    log("get critical log message out to the humans, maybe my final message
will be the last hint they need to finally resolve the bugs inside myself");
  }
}

C and C++ do permit undefined behaviour to have interactions backwards in
time. In essence, knowing that UB must happen later can cause impossible
things to happen now. However, LLVM offers an implementation-defined
guarantee that no UB occurs until the earliest instruction where it occurs.

Your reasoning that we can't have a free in the callee because we'd hit a
double-free (and hence UB) in the caller is not sufficient to perform
heap-to-stack transform with that guarantee, because it will move the UB
sooner to the free() in the callee. That violates our implementation
guarantee that the log message will be emitted (because we'll have already
entered UB-land).

The alternatives are to define double-free as not entering full UB (similar
to poison, but we also have problems to solve with load, store, call,
branch on value computed through signed overflow, etc.), or to remove the
guarantee that the log message will be emitted.

Nick
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150701/c376590e/attachment.html>