[PATCH] D85091: [Sema, CodeGen] Implement [[likely]] and [[unlikely]] in IfStmt

Mon Aug 17 09:07:50 PDT 2020

aaron.ballman added inline comments.

================
Comment at: clang/test/SemaCXX/attr-likelihood.cpp:101
+}
+#endif
----------------
Quuxplusone wrote:
> aaron.ballman wrote:
> > Mordante wrote:
> > > aaron.ballman wrote:
> > > > Mordante wrote:
> > > > > Quuxplusone wrote:
> > > > > > I'd like to see a case like `if (x) [[likely]] i=1;` just to prove that it works on statements that aren't blocks or empty statements. (My understanding is that this should already work with your current patch.)
> > > > > > 
> > > > > > I'd like to see a case like `if (x) { [[likely]] i=1; }` to prove that it works on arbitrary statements. This //should// have the same effect as `if (x) [[likely]] { i=1; }`. My understanding is that your current patch doesn't get us there //yet//. If it's unclear how we'd get there by proceeding along your current trajectory, then I would question whether we want to commit to this trajectory at all, yet.
> > > > > I agree it would be a good idea to add a test like `if (x) [[likely]] i=1;` but I don't feel adding an additional test in this file proves anything. I'll add a test to `clang/test/CodeGenCXX/attr-likelihood-if-branch-weights.cpp` to prove it not only accepts the code but also honours the attribute. This is especially important due to your correct observation that `if (x) { [[likely]] i=1; }` doesn't have any effect. The code is accepted but the CodeGen won't honour the attribute.
> > > > > 
> > > > > I think we can use the current approach, but I need to adjust `getLikelihood`. Instead of only testing whether the current `Stmt` is an `AttributedStmt` it needs to iterate over all `Stmt`s and test them for the attribute. Obviously it should avoid looking into `Stmt`s that also use the attribute, e.g:
> > > > > ```
> > > > > if(b) {
> > > > >     switch(c) {
> > > > >         [[unlikely]] case 0: ... break; // The attribute "belongs" to the switch not to the if
> > > > >         [[likely]] case 1: ... break; // The attribute "belongs" to the switch not to the if
> > > > >     }
> > > > > }
> > > > > ```
> > > > > 
> > > > > This is especially important due to your correct observation that if (x) { [[likely]] i=1; } doesn't have any effect. 
> > > > 
> > > > This code should diagnose the attribute as being ignored.
> > > What I understood from Arthur and rereading the standard this should work, but the initial version of the patch didn't handle this properly.
> > > What I understood from Arthur and rereading the standard this should work, but the initial version of the patch didn't handle this properly.
> > 
> > My original belief was  that it's acceptable for the attribute to be placed there in terms of syntax, but the recommended practice doesn't apply to this case because there's no alternative paths of execution once you've entered the compound statement. That means:
> > ```
> > if (x) [[likely]] { i = 1; }
> > // is not the same as
> > if (x) { [[likely]] i = 1; }
> > ```
> > That said, I can squint at the words to get your interpretation and your interpretation matches what's in the original paper (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0479r2.html#examples).
> > 
> > Ultimately, I think we should strive for consistency between implementations, and I think we should file an issue with WG21 to clarify the wording once we figure out how implementations want to interpret questions like these.
> I don't know WG21's actual rationale, but I can imagine a programmer wanting to annotate something like this:
> 
>     #define FATAL(msg) [[unlikely]] log_and_abort("Fatal error!", msg);
>     ...
>     auto packet = receive();
>     if (packet.malformed()) {
>         save_to_log(packet);
>         FATAL("malformed packet");
>     }
>     ...
> 
> The "absolute unlikeliness" of the malformed-packet codepath is encapsulated within the `FATAL` macro so that the programmer doesn't have to tediously repeat `[[unlikely]]` at some branch arbitrarily far above the `FATAL` point. Compilers actually do this already, every time they see a `throw` — every `throw` is implicitly "unlikely" and taints its whole codepath. We want to do something similar for `[[unlikely]]`.
> 
> It's definitely going to be unsatisfyingly fuzzy logic, though, similar to how inlining heuristics and `inline` are today.
> 
> My understanding is that
> 
>     if (x) { [[likely]] i = 1; }
>     if (x) [[likely]] { i = 1; }
> 
> have exactly the same surface reading: the same set of codepaths exist in both cases, and the same "x true, i gets 1" codepath is `[[likely]]` in both cases. Of course your heuristic might //choose// to treat them differently, just as you might //choose// to treat
> 
>     struct S { int f() { ... } };
>     struct S { inline int f() { ... } };
> 
> differently for inlining purposes despite their identical surface readings.
> Compilers actually do this already, every time they see a throw — every throw is implicitly "unlikely" and taints its whole codepath. We want to do something similar for [[unlikely]].

Heh, fun story, one of the use cases I was told was critical when discussing this feature in the committee came from the safety-critical space where they want to mark `catch` blocks as being likely in order to force the optimizer to do its best on the failure path, because that's the code path which had the hard requirements for bailing out quickly (so the elevator doesn't kill you, car doesn't crash, etc). They needed to be able to write:
```
try {
  ...
} catch (something& s) [[likely]] {
  ...
} catch (...) [[likely]] {
  ...
}
```

> My understanding is that ... have exactly the same surface reading

I can read the standard to get to that understanding, but am skeptical that programmers will be able to predict what these attributes do anywhere they're written for anything but contrived examples with that interpretation. I think limiting the impact to only being at branch points is at least explicable behavior, otherwise programmers are left reasoning through code like:
```
auto what = []{ FATAL("malformed packet"); };
if (foo) {
  what();
} else {
  if (something) {
    what();
  }
} 
```
(Does this make the `foo` branch unlikely? What impact is there on the `else` branch?) Now replace the lambda with a call to an inline function, or a GNU statement expression, an RAII constructor you can see the definition of, etc and ask the same questions.

My fear is that by going with the least restrictive design, programmers will eventually treat these attributes the same way they treat use of the `register` keyword -- something to be avoided because the best you can hope for is the implementation ignores it.

This isn't to say that I'm *opposed* to that approach, it's more that I'm skeptical of it unless we have some implementation agreement between major compilers about how to set user expectations appropriately.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85091/new/

https://reviews.llvm.org/D85091