[LLVMdev] [RFC] Defining Infinite Loops

Thu Jul 16 01:52:09 PDT 2015

On Wed, Jul 15, 2015 at 11:00 PM, Chandler Carruth <chandlerc at google.com> wrote:
> FWIW, I'm very much in favor of having a firm and clear answer to these
> questions.
>
> I also agree that it is an absolute requirement that LLVM have *some*
> mechanism for supporting both languages with defined behavior for infinite
> loops and a language requirement that all loops terminate.
>
> However, I'd like to float an alternative approach. I've not spent a lot of
> time thinking about it, so I'm not sure its actually better. I'm wondering
> if you've already thought about it.
>
> What if we have an @llvm.noop.sideeffect() or some such which doesn't read
> or write memory in any way, but which a frontend can place inside a loop
> body to mark that its execution (perhaps infinitely) is in-and-of-itself a
> side effect of the program. We could then teach loop unrolling or the few
> other things that would care to collapse multiple ones into a single one,
> and not count them towards anything.
>
> I know an intrinsic is kind of awkward for this, but it seems like the least
> bad way we have to annotate a loop in a fully generic way. I'd somewhat like
> this to be a property of the *loop* and not of the function. And it really
> needs to be truly generic, handling unnatural CFGs and even recursion-loops.
> My only idea for how to accomplish that is an intrinsic to mark the dynamic
> path which if executed infinitely can't be eliminated.

I read this as: "the optimizer may legally remove only a finite number
of @llvm.noop.sideeffect calls from the program trace".  It gets
really interesting here since for typical managed language
implementations, application threads have to "poll for safepoints"
(check if there is a pending request to halt from the GC).  As of now
we insert these polls late (after most of the interesting
target-independent optimizations have run), but if we were to change
our strategy to insert these polls early and wanted LLVM to optimize
these, the optimization rules would be similar: "the optimizer may not
introduce an unbounded wait between two polls (i.e. it is okay to
remove polls for a provably finite loop, but an infinite loop must
keep them)" [1].  So a @llvm.noop.sideeffect-like concept may be
generally useful in LLVM.

However, I don't think we can get away with just @llvm.noop.sideeffect
-- a function level attribute is required to know when it is okay to
DCE calls to readnone/readonly functions.

> As for why I'm particularly interested in this being a property of the loop,
> consider if you were to have a mixture of Java and C++ code, all compiled to
> LLVM. How do you inline between them?

TBQH, I'd be okay losing some performance in this case.  And we can
always be aggressive about upgrading non-productive [2] functions to
productive if they only call productive functions and have no infinite
loops.

[1]: there are quality of implementation issues on how frequently an
application thread checks for a pending safepoint request

[2]: using the definition for "productive" == "this function produces some
output (volatile load/store, IO, etc.) in a finite amount of time"

-- Sanjoy