[PATCH] D85091: [Sema, CodeGen] Implement [[likely]] and [[unlikely]] in IfStmt

Mon Aug 17 07:59:40 PDT 2020

Quuxplusone added inline comments.

================
Comment at: clang/test/SemaCXX/attr-likelihood.cpp:101
+}
+#endif
----------------
aaron.ballman wrote:
> Mordante wrote:
> > aaron.ballman wrote:
> > > Mordante wrote:
> > > > Quuxplusone wrote:
> > > > > I'd like to see a case like `if (x) [[likely]] i=1;` just to prove that it works on statements that aren't blocks or empty statements. (My understanding is that this should already work with your current patch.)
> > > > > 
> > > > > I'd like to see a case like `if (x) { [[likely]] i=1; }` to prove that it works on arbitrary statements. This //should// have the same effect as `if (x) [[likely]] { i=1; }`. My understanding is that your current patch doesn't get us there //yet//. If it's unclear how we'd get there by proceeding along your current trajectory, then I would question whether we want to commit to this trajectory at all, yet.
> > > > I agree it would be a good idea to add a test like `if (x) [[likely]] i=1;` but I don't feel adding an additional test in this file proves anything. I'll add a test to `clang/test/CodeGenCXX/attr-likelihood-if-branch-weights.cpp` to prove it not only accepts the code but also honours the attribute. This is especially important due to your correct observation that `if (x) { [[likely]] i=1; }` doesn't have any effect. The code is accepted but the CodeGen won't honour the attribute.
> > > > 
> > > > I think we can use the current approach, but I need to adjust `getLikelihood`. Instead of only testing whether the current `Stmt` is an `AttributedStmt` it needs to iterate over all `Stmt`s and test them for the attribute. Obviously it should avoid looking into `Stmt`s that also use the attribute, e.g:
> > > > ```
> > > > if(b) {
> > > >     switch(c) {
> > > >         [[unlikely]] case 0: ... break; // The attribute "belongs" to the switch not to the if
> > > >         [[likely]] case 1: ... break; // The attribute "belongs" to the switch not to the if
> > > >     }
> > > > }
> > > > ```
> > > > 
> > > > This is especially important due to your correct observation that if (x) { [[likely]] i=1; } doesn't have any effect. 
> > > 
> > > This code should diagnose the attribute as being ignored.
> > What I understood from Arthur and rereading the standard this should work, but the initial version of the patch didn't handle this properly.
> > What I understood from Arthur and rereading the standard this should work, but the initial version of the patch didn't handle this properly.
> 
> My original belief was  that it's acceptable for the attribute to be placed there in terms of syntax, but the recommended practice doesn't apply to this case because there's no alternative paths of execution once you've entered the compound statement. That means:
> ```
> if (x) [[likely]] { i = 1; }
> // is not the same as
> if (x) { [[likely]] i = 1; }
> ```
> That said, I can squint at the words to get your interpretation and your interpretation matches what's in the original paper (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0479r2.html#examples).
> 
> Ultimately, I think we should strive for consistency between implementations, and I think we should file an issue with WG21 to clarify the wording once we figure out how implementations want to interpret questions like these.
I don't know WG21's actual rationale, but I can imagine a programmer wanting to annotate something like this:

    #define FATAL(msg) [[unlikely]] log_and_abort("Fatal error!", msg);
    ...
    auto packet = receive();
    if (packet.malformed()) {
        save_to_log(packet);
        FATAL("malformed packet");
    }
    ...

The "absolute unlikeliness" of the malformed-packet codepath is encapsulated within the `FATAL` macro so that the programmer doesn't have to tediously repeat `[[unlikely]]` at some branch arbitrarily far above the `FATAL` point. Compilers actually do this already, every time they see a `throw` — every `throw` is implicitly "unlikely" and taints its whole codepath. We want to do something similar for `[[unlikely]]`.

It's definitely going to be unsatisfyingly fuzzy logic, though, similar to how inlining heuristics and `inline` are today.

My understanding is that

    if (x) { [[likely]] i = 1; }
    if (x) [[likely]] { i = 1; }

have exactly the same surface reading: the same set of codepaths exist in both cases, and the same "x true, i gets 1" codepath is `[[likely]]` in both cases. Of course your heuristic might //choose// to treat them differently, just as you might //choose// to treat

    struct S { int f() { ... } };
    struct S { inline int f() { ... } };

differently for inlining purposes despite their identical surface readings.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85091/new/

https://reviews.llvm.org/D85091