[PATCH] D86559: [Sema, CodeGen] Allow [[likely]] and [[unlikely]] on labels

Thu Aug 27 10:21:03 PDT 2020

Mordante planned changes to this revision.
Mordante added a comment.

> That is exactly the behavior I am coming to believe we should follow. You can either write it on a compound statement that is guarded by a flow control decision (`if`/`else`/`while`) or you can write it on a label, otherwise the attribute is parsed and ignored (with a diagnostic). This feels like the right mixture of useful with understandable semantics so far, but perhaps we'll find other examples that change our minds.
>
> The fallthrough behavior question has one more case we may want to think about:
>
>   switch (x) {
>   case 0:
>   case 1:
>   [[likely]] case 2: break;
>   [[unlikely]] default:
>   }
>
> Does this mark cases `0` and `1` as being likely or not? I could see users wanting to use shorthand rather than repeat themselves on all the cases. However, I'm also not certain whether there would be any performance impact if we marked only `case 2` as likely and left cases `0` and `1` with default likelihood. My gut feeling is that this should only mark `case 2`, but others may have different views.

Not according to the standard: "A path of execution includes a label if and only if it contains a jump to that label."
However for an if statement I use 2 weights: 2000 for likely, 1 for unlikely. (These can be configured by a command-line option already used for `__builtin_expect`.)
For a switch I intent to use 3 weights: 2000 for likely, (2000 + 1)/2 for no likelihood, 1 for unlikely. `__builtin_expect` doesn't have a 'neutral' values since in a switch you can only mark one branch as likely. Instead of adding a new option I picked the midpoint.
So the weights in your example would be:

  > switch (x) {
  case 0:  // 1000
  case 1:  // 1000
  [[likely]] case 2: break; // 2000
  [[unlikely]] default: // 1
   }

In this case the user could also write:

  > switch (x) {
  case 0:  // 1000
  case 1:  // 1000
  case 2: break; // 1000
  [[unlikely]] default: // 1
   }

where the values 0, 1, 2 still would be more likely than the default statement. Obviously this will be documented when I implement the switch.
How do you feel about this approach?

>> I fully agree the behaviour mandated by the standard is way too complex and user unfriendly. It would have been nice if there were simpler rules, making it easier to use and to teach. Still I think it would be best to use the complex approach now, since that's what the standard specifies. During that process we can see whether there are more pitfalls. Then we can discuss it with other vendors and see whether we can change the wording of the standard. Do you agree?
>
> The only requirement from the standard is that we parse `[[likely]]` or `[[unlikely]]` on a statement or label, that we don't allow conflicting attributes in the same attribute sequence, and that the attribute has no arguments to it. All the rest is recommended practice that's left up to the implementation as a matter of QoI. So we conform to the standard by following the approach on compound statements and labels + the constraint checking. We may not be following the recommended practice precisely, but we may not *want* to follow the recommended practice because it's bad for usability. So I'd like us to design the feature that we think gives the most value and is the easiest to use that matches the recommended practice as best we can, then start talking to other vendors. If other vendors don't want to follow our design, that's totally fine, but we should ensure our design *allows* users to write code that will behave similarly for other implementations without diagnostics. e.g., we don't want to design something where the user has to use macros to avoid diagnostics in cross-compiler code. After that, we may want to propose some changes to the recommended practice to WG21.
>
> From my current thinking, it seems that there may be agreement that allowing these on arbitrary statements may be really difficult for users to understand the behavior of and that compound statements and labels are what we think is an understandable and useful feature and is also a strict subset of what we COULD support. By that, I think we should limit the feature to just compound statements and labels; this leaves the door open to allow the attributes on arbitrary statements in the future without breaking code. If we allow on arbitrary statements from the start, we can't later restrict the feature because we'd break code. This lets us start getting field experience with that behavior to see how we like it, which may also help when talking to other vendors because we'll have something concrete to talk about (hopefully). WDYT?

I'm somewhat on the fence. In general I prefer to implement what the standard expects. However the more I try to do that, the more I'm convinced what the standard suggests is difficult to achieve and more importantly to difficult explain to C++ users. In order to implement this feature properly I've been looking at using the CFG. That allows me to follow the path of execution properly. However it makes the feature more complicated. AFAIK the CFG is only used in Clang for `AnalysisBasedWarnings::IssueWarnings`. This feels like a hint, this is not the proper solution. Another "interesting" issue with the current approach is, that it'll be hard to determine the likelihood when looking at the code:

  if(a) {
      foo(a);
      [[likely]] ++a;
  }

If `foo` is declared `[[noreturn]]` the `++a` is never executed so the attribute shouldn't affect the likelihood. (This is of course great when creating C++ trivia, but not great for users.)
Another issue is to properly implement it when a branch has as a simple `if` inside it:

  if(a > 5) {
      if(a < 10)
          ++a;
      [[likely]] ++a;
  }

Obviously the attribute is in the `a > 5` branch and will always be executed, but I haven't found a solution to figure that out using the CFG. I need to find a way to determine the `a < 10` will always resume the original branch. I'm convinced it can be done, but it made me wonder whether this is the best approach.

So I think it would be wise follow your suggestion. Implement a simple version and try to convince other vendors to follow suit. This leaves the door open to implement a more complex solution if needed. I think this approach does C++ users and ourselves a favour. I only feel we need a slight relaxation in the requirements. In your approach the attribute is only allowed on a compound statement. I would like to allow it on the first statement, like it was in my initial patch. This allows using the attribute without being forced to use a compound statement. I feel forcing users to use a certain coding style is a bad idea.

  if(a) [[likely]] { // Good allowed on the compound statement
    ...
  }

  if(a)
      [[likely]] return; // Good allowed on the first statement

  if(a) {
      [[likely]] return; // Ignored, not on the first statement
  }

  if(a)                      // Attribute not allowed on a declaration,
      [[likely]] int i = 5;  // but I can't think about a good use-case
                             // for this code.

  if(a) [[likely]] { // Good allowed on the compound statement
      int i = 5;     // Now i seems to have a purpose
    ...
  }

  if(a) [[likely]] {} // A diagnostic is issued and the attributes
  else [[likely]]] {} // are ignored

The diagnostics for ignored attributes are enabled by default so contradiction is visible by default. For the `switch` I'll use the previous proposal.
WDYT?

If C++20 was still in development I would even consider to write a proposal to make the attributes more like `__builtin_expect`:

  [[likely]] if(a) {
    // Considered to be arbitrarily likely
  } else {
    // Considered to be arbitrarily unlikely
  }

  if(a) {
  } [[unlikely]] else { // Not allowed on the else statement
  }

  if(a) {
    // No likelihood suggestion
  } else
    // No likelihood suggestion

    [[unlikely]] if{
    // Considered to be arbitrarily likely
  }

But it's too late to consider this change.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86559/new/

https://reviews.llvm.org/D86559