[llvm-dev] [RFC] Introducing the maynotprogress IR attribute

Hal Finkel via llvm-dev llvm-dev at lists.llvm.org
Thu Sep 10 16:53:24 PDT 2020


On 9/10/20 5:26 PM, Johannes Doerfert wrote:
> Please scroll to the second to last paragraph for an action item!
>
>
> On 9/10/20 4:49 PM, James Y Knight wrote:
> > On Mon, Sep 7, 2020 at 5:17 PM Johannes Doerfert via llvm-dev <
> > llvm-dev at lists.llvm.org> wrote:
> >
> >> Metadata to encode what exactly?
> >>
> >> The idea is that loops which do not match the `maynotprogress` defined
> >> via the attribute can be annotated. Thus, if the function is not
> >> `maynotprogress` there is no need for the annotation. If the function
> >> is, loops that do not match `maynotprogress` would at some point be
> >> annotated, either by the frontend or inliner (= the two places that 
> add
> >> `maynotprogress`).
> >>
> >> You cannot use metadata in the other direction, thus to opt-in to
> >> `maynotprogress`. You can make `maynotprogress` the default and use
> >> metadata to opt-out, sure. In addition to the regressions we might 
> see,
> >> we basically end up with metadata on almost every C/C++ loop, 
> unsure if
> >> that is a good idea.
> >
> >
> > So, just to reiterate, the proposal is that Clang will look at each
> > function, and:
> > - if a function contains any loops which may not progress, will emit
> > "maynotprogress" attribute on the function definition,
>
> Yes. See also below.
>
> > and "mustprogress" metadata on each loop other than those that may be
> > infinite.
>
> The metadata is added to loops that have to make progress but are in
> a function in which loops do not have to make progress. It is not
> "needed" but an optimization.
>
> Example:
>
> void foo(int N) {
>   for(int i = 0; i < N; ++i) // <- loop.metadata
>     bar(i);
>   int M = N + 1;
>   while (N < M);             // <- loop.metadata, would be removed
>   if (N)
>     while (1);
>   while (N < M);             // <- loop.metadata, would be removed
> }
>
>
> > - otherwise, it will not emit the function attribute 
> "maynotprogress", and
> > will not emit "mustprogress" metadata on the loops.
>
> Yes. If there is no loop in the code with well-defined semantics if it
> is infinite, Clang won't do anything different.
>
> Loops with well-defined semantics if they are infinite are C loops with
> a constant conditional (=true), we could include all loops (C++,...)
> with a constant conditional while we are at it IMHO.
>
>
> > And then, whenever LLVM inlines code into another function:
> > - If the outer function has maynotprogress, and the inner does not: add
> > mustprogress metadata to all the loops (?) from the inner function.
>
> Right. We don't have the patch for this yet I think.
>
>
> > - If the outer function does not have maynotprogress, and the inner 
> does:
> > add maynotprogress to the outer function, and add mustprogress to 
> all loops
> > in the outer.
>
> Right. We have a patch for this.
>
>
> > (But, how do you identify the "loop" to put the metadata on?)
>
> For now, LoopInfo. There is no need to get them all, we can miss some.
> To be fair, we don't optimize "loops" that are not recognized by
> LoopInfo.
>
>
> > An alternative scheme suggested was to do without the function 
> attribute,
> > and for clang to emit "mustprogress" metadata on every loop which 
> has this
> > property.
>
> Right: We can make the universal default `maynotprogress`, so the loops
> are well-defined if they are infinite and do nothing. (That means calls
> that do not have side effects are then something that cannot be removed
> if they might loop indefinitely.) Then we must "opt-in" to all
> optimizations, e.g., loop-deletion and more importantly call deletion.
> There were ideas to use an attribute to "opt-in" and some to use
> metadata, or both.
>
>
> > Some potential reasons to want not to do the simpler scheme have been
> > expressed:
> > - Would deoptimize code emitted by previous frontends which don't know
> > about mustprogress.
> > - Concern that the loop metadata may be lost during other 
> transforms, thus
> > losing the optimization.
> >
> >
> > I'd like to raise another issue, that I haven't seen anyone bring up 
> yet --
> > the definition of a "loop" in the context of these "infinite loop"
> > optimizations does not seem to have been made clear, and, I think, is
> > problematic. In the current state of LLVM, I believe a candidate 
> "loop" to
> > delete can be derived from *any* control flow, it does not need to be
> > derived from an explicit source-language loop construct. Under this
> > proposal, that would still be the case. (unless the function was 
> annotated
> > with maynotprogress.)
> >
> > For C code, that is incorrect. C only has a must-progress requirement
> > explicitly for iteration statements ("for", "while", and "do"). (And of
> > those, only ones which do not have a constant controlling expression).
> > Quoting C11, 6.8.5 Iteration statements, p6, "An iteration statement 
> whose
> > controlling expression is not a constant expression, that performs no
> > input/output operations, does not access volatile objects, and 
> performs no
> > synchronization or atomic operations in its body, controlling 
> expression,
> > or (in the case of for statement) its expression-3, may be assumed 
> by the
> > implementation to terminate."
> >
> > So, compiling C code, ISTM that every C function needs to be
> > "maynotprogress", and *only* for "for", "while", or "do" loops 
> without a
> > constant loop-controlling-expression, can we add the "mustprogress"
> > metadata.
> >
> > On the other hand, in C++, it would appear that by [intro.progress] 
> there
> > is a guarantee of progress for *all* control flow in a function, 
> even if
> > you've built it out of explicit goto statements.
> >
> > That is, I believe in C++ this function can be optimized away, but 
> not in C.
> >   void loop_weirdly(int should_loop) {
> >       label:
> >       if (should_loop) goto label;
> >   }
> >
> > While this may be optimized away in both C and C++:
> >   void loop(int should_loop) {
> >     while (should_loop) ;
> >   }
>
> This example is great; the compiler council [0] is divided:
>
>  Clang + icc*: keep the loop in loop_weirdly but remove calls to it,
>                both in C and C++.
>           gcc: follows your interpretation, in C the termination is
>                well-defined behavior and both the loop in loop_weirdly
>                as well as a call of it are kept. In C++ both are
>                removed.


This seems like a reasonable interpretation of the wording of the 
specifications (although it would be nicer if C included this in its 
list of observable program behaviors to make things clearer).

  -Hal



> MSVC: plays it conservative and doesn't remove anything in
>                either mode.
>
> *icc13 removes the loop in loop_weirdly.
>
> I was going to provide options what this means for this RFC and existing
> optimizations we do right now, but I think it would be better if people
> would first chime in and we agree on what behavior is "correct".
>
>
> Thanks
>   Johannes
>
>
>
> [0] https://godbolt.org/z/KosGnE
>
>
> > Another question -- can we actually be sure that other control flow 
> won't
> > get merged into the loop -- is it possible for the "mustprogress" 
> metadata
> > to get applied to a wider scope than it should've been, after other 
> control
> > flow optimizations occur?
>
> I think we can be sure this is OK if all transformations behave as they
> already have to. While you can imagine a transformation changing the
> loop in different ways, let's say fusion, metadata cannot be kept if it
> is potentially invalidated. This is already true for all the other ones
> we have, e.g., to override the vectorizer legality checks.
>
>
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory



More information about the llvm-dev mailing list