[llvm-dev] [RFC] Introducing the maynotprogress IR attribute

Johannes Doerfert via llvm-dev llvm-dev at lists.llvm.org
Thu Sep 10 15:26:52 PDT 2020


Please scroll to the second to last paragraph for an action item!


On 9/10/20 4:49 PM, James Y Knight wrote:
 > On Mon, Sep 7, 2020 at 5:17 PM Johannes Doerfert via llvm-dev <
 > llvm-dev at lists.llvm.org> wrote:
 >
 >> Metadata to encode what exactly?
 >>
 >> The idea is that loops which do not match the `maynotprogress` defined
 >> via the attribute can be annotated. Thus, if the function is not
 >> `maynotprogress` there is no need for the annotation. If the function
 >> is, loops that do not match `maynotprogress` would at some point be
 >> annotated, either by the frontend or inliner (= the two places that add
 >> `maynotprogress`).
 >>
 >> You cannot use metadata in the other direction, thus to opt-in to
 >> `maynotprogress`. You can make `maynotprogress` the default and use
 >> metadata to opt-out, sure. In addition to the regressions we might see,
 >> we basically end up with metadata on almost every C/C++ loop, unsure if
 >> that is a good idea.
 >
 >
 > So, just to reiterate, the proposal is that Clang will look at each
 > function, and:
 > - if a function contains any loops which may not progress, will emit
 > "maynotprogress" attribute on the function definition,

Yes. See also below.

 > and "mustprogress" metadata on each loop other than those that may be
 > infinite.

The metadata is added to loops that have to make progress but are in
a function in which loops do not have to make progress. It is not
"needed" but an optimization.

Example:

void foo(int N) {
   for(int i = 0; i < N; ++i) // <- loop.metadata
     bar(i);
   int M = N + 1;
   while (N < M);             // <- loop.metadata, would be removed
   if (N)
     while (1);
   while (N < M);             // <- loop.metadata, would be removed
}


 > - otherwise, it will not emit the function attribute 
"maynotprogress", and
 > will not emit "mustprogress" metadata on the loops.

Yes. If there is no loop in the code with well-defined semantics if it
is infinite, Clang won't do anything different.

Loops with well-defined semantics if they are infinite are C loops with
a constant conditional (=true), we could include all loops (C++,...)
with a constant conditional while we are at it IMHO.


 > And then, whenever LLVM inlines code into another function:
 > - If the outer function has maynotprogress, and the inner does not: add
 > mustprogress metadata to all the loops (?) from the inner function.

Right. We don't have the patch for this yet I think.


 > - If the outer function does not have maynotprogress, and the inner does:
 > add maynotprogress to the outer function, and add mustprogress to all 
loops
 > in the outer.

Right. We have a patch for this.


 > (But, how do you identify the "loop" to put the metadata on?)

For now, LoopInfo. There is no need to get them all, we can miss some.
To be fair, we don't optimize "loops" that are not recognized by
LoopInfo.


 > An alternative scheme suggested was to do without the function attribute,
 > and for clang to emit "mustprogress" metadata on every loop which has 
this
 > property.

Right: We can make the universal default `maynotprogress`, so the loops
are well-defined if they are infinite and do nothing. (That means calls
that do not have side effects are then something that cannot be removed
if they might loop indefinitely.) Then we must "opt-in" to all
optimizations, e.g., loop-deletion and more importantly call deletion.
There were ideas to use an attribute to "opt-in" and some to use
metadata, or both.


 > Some potential reasons to want not to do the simpler scheme have been
 > expressed:
 > - Would deoptimize code emitted by previous frontends which don't know
 > about mustprogress.
 > - Concern that the loop metadata may be lost during other transforms, 
thus
 > losing the optimization.
 >
 >
 > I'd like to raise another issue, that I haven't seen anyone bring up 
yet --
 > the definition of a "loop" in the context of these "infinite loop"
 > optimizations does not seem to have been made clear, and, I think, is
 > problematic. In the current state of LLVM, I believe a candidate 
"loop" to
 > delete can be derived from *any* control flow, it does not need to be
 > derived from an explicit source-language loop construct. Under this
 > proposal, that would still be the case. (unless the function was 
annotated
 > with maynotprogress.)
 >
 > For C code, that is incorrect. C only has a must-progress requirement
 > explicitly for iteration statements ("for", "while", and "do"). (And of
 > those, only ones which do not have a constant controlling expression).
 > Quoting C11, 6.8.5 Iteration statements, p6, "An iteration statement 
whose
 > controlling expression is not a constant expression, that performs no
 > input/output operations, does not access volatile objects, and 
performs no
 > synchronization or atomic operations in its body, controlling expression,
 > or (in the case of for statement) its expression-3, may be assumed by the
 > implementation to terminate."
 >
 > So, compiling C code, ISTM that every C function needs to be
 > "maynotprogress", and *only* for "for", "while", or "do" loops without a
 > constant loop-controlling-expression, can we add the "mustprogress"
 > metadata.
 >
 > On the other hand, in C++, it would appear that by [intro.progress] there
 > is a guarantee of progress for *all* control flow in a function, even if
 > you've built it out of explicit goto statements.
 >
 > That is, I believe in C++ this function can be optimized away, but 
not in C.
 >   void loop_weirdly(int should_loop) {
 >       label:
 >       if (should_loop) goto label;
 >   }
 >
 > While this may be optimized away in both C and C++:
 >   void loop(int should_loop) {
 >     while (should_loop) ;
 >   }

This example is great; the compiler council [0] is divided:

  Clang + icc*: keep the loop in loop_weirdly but remove calls to it,
                both in C and C++.
           gcc: follows your interpretation, in C the termination is
                well-defined behavior and both the loop in loop_weirdly
                as well as a call of it are kept. In C++ both are
                removed.
          MSVC: plays it conservative and doesn't remove anything in
                either mode.

*icc13 removes the loop in loop_weirdly.

I was going to provide options what this means for this RFC and existing
optimizations we do right now, but I think it would be better if people
would first chime in and we agree on what behavior is "correct".


Thanks
   Johannes



[0] https://godbolt.org/z/KosGnE


 > Another question -- can we actually be sure that other control flow won't
 > get merged into the loop -- is it possible for the "mustprogress" 
metadata
 > to get applied to a wider scope than it should've been, after other 
control
 > flow optimizations occur?

I think we can be sure this is OK if all transformations behave as they
already have to. While you can imagine a transformation changing the
loop in different ways, let's say fusion, metadata cannot be kept if it
is potentially invalidated. This is already true for all the other ones
we have, e.g., to override the vectorizer legality checks.




More information about the llvm-dev mailing list