[llvm-dev] [RFC] Stack overflow and optimizations

Mon Aug 12 14:34:01 PDT 2019

Am Sa., 10. Aug. 2019 um 11:54 Uhr schrieb Ralf Jung <post at ralfj.de>:
> Please keep in mind non-C/C++ frontends. For example, in Rust, we promise to
> avoid all undefined behavior in safe code. There is no reasonable compositional
> analysis that can statically detect stack overflows (I know safety-critical
> systems are subjected such analyses, but those could not reasonably be enforced
> on all Rust code -- most of them just forbid recursion, for example), so we
> defer to a dynamic check instead: we have a guard page for every thread, and we
> make sure that for big stack frames being pushed, every page covered by that
> frame is touched at least once (so that we will never skip the guard page).
>
> If LLVM considers stack overflows as UB, the guard page is not enough. Dynamic
> checks cannot guard against UB exploited by the compiler. So, in that case it
> would become basically impossible to compile a safe language to LLVM, as far as
> I can see.

I think this is more target-platform specific than input-language. My
motivation was to preserve Windows SEH behavior where one can
(reliably?) catch stack overflows using a __try __except handler. I
think its ABI even specifies that there must be stack probing. The
original version of https://reviews.llvm.org/D59978 would not consider
that such an asynchronous exception could be a cause for jumping to
the unwind handler with the argument that "LLVM does not model
asynchronous exceptions" (cite from
https://clang.llvm.org/docs/MSVCCompatibility.html) anyway.

As Nevin pointed out, other platforms not require stack probing, do
not have Structured Exception Handling, but may call the signal
handler (which we also do not model).

So I wrote this RFC to clarify which guarantees we are giving in case
of stack overflow, especially in situations where the compilation
target does not give such guarantees.

> > First: In some situations, we already assume that stack overflows are
> > unobservable behaviours. For instance, optimizations of stack
> > variables cause the stack frames to be smaller, such that the stack
> > overflow does not occur where it would have in an unoptimized program.
>
> That's fine. Basically, the abstract machine can be specified to
> *non-deterministically* trigger a stack overflow any time the stack grows. So
> essentially the running program has to be prepared that the stack might overflow
> any time, but at the same time cannot rely on *when* it overflows. This is a
> standard "trick" to formally handle resource exhaustion without having to
> specify exactly how much of that resource is available.
>
> This gives LLVM the freedom to grow or shrink stack frames. But this dos *not*
> mean that stack overflows are UB!

What is it then? What reference says it is/is not UB?

> > Also, we optimize the function
> > ```
> >  void overflow() {
> >   overflow();
> > }
> > ```
> >  to
> > ```
> > ; Function Attrs: nounwind readnone sspstrong uwtable
> > define dso_local void @"?overflow@@YAXXZ"() local_unnamed_addr #0 {
> > entry:
> >   ret void
> > }
> > ```
>
> This looks like an instance of the C++ forward progress assumption?
> (C AFAIK only has that assumption for [some] loops, not recursive functions, so
> arguably this is a miscompilation for C.)

If stack overflow is UB, then this is a valid optimization.
If the compiler is allowed to optimize the stack frame size to zero
(=> tail call), we would effectively get an infinite loop, also a
valid outcome of UB.

If it is not UB, what is the defined behaviour?

If this is an instance of the forward-progress assumption, we should
maybe discuss it here:
https://reviews.llvm.org/D65718

Should we have a similar document clarifying the behavior of stack overflows?

Michael