[LLVMdev] [RFC] Defining Infinite Loops

Thu Jul 16 11:08:51 PDT 2015

----- Original Message -----
> From: "Chandler Carruth" <chandlerc at google.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "LLVM Dev" <llvmdev at cs.uiuc.edu>
> Sent: Thursday, July 16, 2015 2:33:21 AM
> Subject: Re: [LLVMdev] [RFC] Defining Infinite Loops
> 
> 
> 
> 
> On Thu, Jul 16, 2015 at 12:27 AM Hal Finkel < hfinkel at anl.gov >
> wrote:
> 
> 
> ----- Original Message -----
> > From: "Chandler Carruth" < chandlerc at google.com >
> > To: "Hal Finkel" < hfinkel at anl.gov >, "LLVM Dev" <
> > llvmdev at cs.uiuc.edu >
> > Sent: Thursday, July 16, 2015 1:00:05 AM
> > Subject: Re: [LLVMdev] [RFC] Defining Infinite Loops
> > 
> > 
> > FWIW, I'm very much in favor of having a firm and clear answer to
> > these questions.
> > 
> > I also agree that it is an absolute requirement that LLVM have
> > *some*
> > mechanism for supporting both languages with defined behavior for
> > infinite loops and a language requirement that all loops terminate.
> > 
> > 
> > However, I'd like to float an alternative approach. I've not spent
> > a
> > lot of time thinking about it, so I'm not sure its actually better.
> > I'm wondering if you've already thought about it.
> > 
> > 
> > What if we have an @llvm.noop.sideeffect() or some such which
> > doesn't
> > read or write memory in any way, but which a frontend can place
> > inside a loop body to mark that its execution (perhaps infinitely)
> > is in-and-of-itself a side effect of the program. We could then
> > teach loop unrolling or the few other things that would care to
> > collapse multiple ones into a single one, and not count them
> > towards
> > anything.
> > 
> > 
> > I know an intrinsic is kind of awkward for this, but it seems like
> > the least bad way we have to annotate a loop in a fully generic
> > way.
> > I'd somewhat like this to be a property of the *loop* and not of
> > the
> > function. And it really needs to be truly generic, handling
> > unnatural CFGs and even recursion-loops. My only idea for how to
> > accomplish that is an intrinsic to mark the dynamic path which if
> > executed infinitely can't be eliminated.
> 
> My largest concern is that the frontend would need to add these
> things all over the place, not just before the loop backedges. For
> one thing, if the language has gotos, where should they be inserted?
> 
> 
> The target of every goto.
> 
> 
> For computed goto, very label whose address is taken.
> 
> 
> This at least doesn't seem that bad to me.
> 
> 
> Before every branch will be conservatively correct, but I'm worried
> that will unnecessarily block optimizations. They'd also be needed
> at the entry to every function.
> 
> 
> Only external, address taken, or internal-and-recursively-called
> functions. All of which we already have some blockers to
> optimization, so this part i'm not worried about.
> 
> 
> On the other hand, maybe if we added an optimization that removed
> these things along any control-flow path that already had any other
> side effect, it might not be too bad?
> 
> 
> 
> Absolutely, much like lower-expect, we'd need to make sure that easy
> cases were folded quickly in the optimizer so this didn't get out of
> hand.
> 
> 
> 
> > 
> > 
> > As for why I'm particularly interested in this being a property of
> > the loop, consider if you were to have a mixture of Java and C++
> > code, all compiled to LLVM. How do you inline between them?
> > 
> 
> You add the attribute to the caller.
> 
> 
> This has the really unfortunate side effect of pessimizing code
> during cross language optimizations.
> 
> 
> FWIW, I suspect I might care a lot about this particular case
> (because I believe that Fortran has defined behavior for infinite
> loops).
> 
> 
> 
> Yea, you could argue that C does too, which is one reason why I'm so
> interested in this being done really well even in an LTO situation.
> 
> 
> I think it would be really useful to not have this cross between
> adjacent loops after inlining when they come from different source
> languages, and it would be nice for it to not apply to nested loops
> when those nested loops were inlined from a language without this
> guarantee.
> 
> 
> But I'm still not convinced that the noise of the intrinsic is
> *definitely* worth it. I come from the background of the C++ rules'
> rationale, and so I naturally see the languages that define this as
> giving up optimizations and so wanting to localize the impact of
> that... Not sure that's actually the right perspective though. ;]
> 

I'm leaning toward agreeing with you, primarily because I think it will more-naturally fit into the optimizer than the attribute. We need to check loops for side effects anyway (if we otherwise default to C++-like rules), and so this intrinsic will do the right thing without any special logic.

 -Hal

> 
> 
> -Hal
> 
> > 
> > Anyways, I've not spent a lot of time thinking about what this
> > might
> > break for languages that allow infinite loops. Maybe it doesn't
> > work
> > as well as I'd hope.
> > 
> > 
> > -Chandler
> > 
> > 
> > On Wed, Jul 15, 2015 at 9:16 PM Hal Finkel < hfinkel at anl.gov >
> > wrote:
> > 
> > 
> > Hello everyone,
> > 
> > The topic of whether or not LLVM allows for infinite loops has come
> > up a lot recently (several times this week already). Regarding
> > motivation, there are two important facts:
> > 
> > 1. Some languages, such as Java, have well-defined infinite loops.
> > See:
> > 
> > http://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.4.9
> > 
> > and:
> > 
> > http://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.4.2
> > 
> > and, as a community, it seems to be important for us to support
> > such
> > languages. That means that we must have a way, at the IR level, to
> > support and model infinite loops.
> > 
> > 2. Other languages, such a C and C++, allow us to assume that
> > otherwise-side-effect-free loops terminate, specifically, for C++,
> > 1.10p27 says:
> > 
> > The implementation may assume that any thread will eventually do
> > one
> > of the following:
> > - terminate
> > - make a call to a library I/O function
> > - access or modify a volatile object, or
> > - perform a synchronization operation or an atomic operation
> > 
> > [Note: This is intended to allow compiler transformations such as
> > removal of empty loops, even
> > when termination cannot be proven. — end note ]
> > 
> > and taking advantage of these guarantees is part of providing a
> > high-quality optimizer for C/C++ programs.
> > 
> > And this leaves us somewhat in a bind. To provide a high-quality
> > C/C++ optimizer, we want to take advantage of this guarantee, but
> > we
> > can't do so in a generic sense without harming our ability to serve
> > as a compiler for other languages.
> > 
> > In 2010, Nick proposed to add a 'halting' attribute that could be
> > added to functions to indicate that they would not execute
> > indefinitely (
> > http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20100705/103670.html
> > ). At the time that the patch was proposed, there were
> > infrastructure problems with inferring the attribute for functions
> > with loops (related to using function-level analysis passes from a
> > CGSCC pass), but hopefully those will be fixed with the new pass
> > manager. Regardless, however, such inference is much more powerful
> > if it can take advantage of the guarantees that C/C++ provide.
> > 
> > Thus, while I would eventually like a 'halting' attribute, or some
> > variant of that (including, for example, the lack of calls to
> > longjmp), I think that a first step is to provide an attribute that
> > Clang, and other frontends, can add when producing IR from sources
> > where the language provides C/C++-like guarantees on loop
> > termination. This attribute would indicate that the function will
> > not execute indefinitely without producing some
> > externally-observable side effect (calling an external function or
> > executing a volatile/atomic memory access). I could name this
> > attribute 'finite', but bikeshedding is welcome.
> > 
> > With such an attribute in place, we would be able to clarify our
> > overall position on infinite loops, be in a stronger position to
> > infer more specific function properties (like halting), and can put
> > in place generally-correct fixes to outstanding bugs (PR24078, for
> > example). I know there are some Clang users who want it to optimize
> > while honoring infinite loops, and I think adding this attribute
> > helps them as well (assuming we'd provide some non-default option
> > to
> > prevent Clang from adding it). Thoughts?
> > 
> > Thanks again,
> > Hal
> > 
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> > 
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> > 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory