[PATCH] D85091: [Sema, CodeGen] Implement [[likely]] and [[unlikely]] in IfStmt

Aaron Ballman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 26 07:26:43 PDT 2020


aaron.ballman added a comment.

In D85091#2232588 <https://reviews.llvm.org/D85091#2232588>, @Mordante wrote:

> The current implementation matches GCC's behaviour. This behaviour aligns best with my interpretation of the standard and GCC seems to have the most mature implementation. I would like to get the current implementation in Clang and proceed to work on the missing pieces. In parallel we can align with the other vendors and, if needed, adjust our implementation.

I agree with this direction.



================
Comment at: clang/include/clang/Basic/Attr.td:1288
+def Likely : StmtAttr {
+  let Spellings = [CXX11<"", "likely", 201803>];
+  let Documentation = [LikelihoodDocs];
----------------
Mordante wrote:
> Mordante wrote:
> > aaron.ballman wrote:
> > > Mordante wrote:
> > > > Mordante wrote:
> > > > > aaron.ballman wrote:
> > > > > > Mordante wrote:
> > > > > > > aaron.ballman wrote:
> > > > > > > > Hmm, I'm on the fence about specifying `201803` for these attributes. Given that this is only the start of supporting the attribute, do we want to claim it already matches the standard's behavior? Or do we just want to return `1` to signify that we understand this attribute but we don't yet fully support it in common cases (such as on labels in switch statements, etc)?
> > > > > > > > 
> > > > > > > > As another question, should we consider adding a C2x spelling `[[clang::likely]]` and `[[clang::unlikely]]` to add this functionality to C?
> > > > > > > I was also somewhat on the fence, for now I'll change it to specify `1`.
> > > > > > > 
> > > > > > > I'll have a look at the C2x changes. Just curious do you know whether there's a proposal to add this to C2x?
> > > > > > > I'll have a look at the C2x changes. Just curious do you know whether there's a proposal to add this to C2x?
> > > > > > 
> > > > > > There isn't one currently because there is no implementation experience with the attributes in C. This is a way to get that implementation experience so it's easier to propose the feature to WG14.
> > > > > > I was also somewhat on the fence, for now I'll change it to specify `1`.
> > > > > 
> > > > > There seems to be an issue using the `1` so I used `2` instead. (Didn't want to look closely at the issue.)
> > > > > 
> > > > > > I'll have a look at the C2x changes. Just curious do you know whether there's a proposal to add this to C2x?
> > > > > 
> > > > > There isn't one currently because there is no implementation experience with the attributes in C. This is a way to get that implementation experience so it's easier to propose the feature to WG14.
> > > > 
> > > > Nice to know. It seems the C2x wasn't at straightforward as I hoped, so I didn't implement it yet. I intend to look at it later. I first want the initial part done to see whether this is the right approach.
> > > What issues are you running into? 1 is the default value when you don't specify a value specifically.
> > It give the error "clang/include/clang/Basic/Attr.td:265:7: error: Standard attributes must have valid version information."
> > > > I'll have a look at the C2x changes. Just curious do you know whether there's a proposal to add this to C2x?
> > > 
> > > There isn't one currently because there is no implementation experience with the attributes in C. This is a way to get that implementation experience so it's easier to propose the feature to WG14.
> > 
> > Nice to know. It seems the C2x wasn't at straightforward as I hoped, so I didn't implement it yet. I intend to look at it later. I first want the initial part done to see whether this is the right approach.
> 
> I had another look and I got it working in C.
If you leave the version number off entirely, it defaults to 1.


================
Comment at: clang/include/clang/Basic/AttrDocs.td:1691
+The ``likely`` and ``unlikely`` attributes are used as compiler hints. When
+the next executed statement depends on a condition this attribute can
+annotate all possible statements with either ``likely`` or ``unlikely``.
----------------
I think this paragraph is misleading because the attribute no longer impacts the statement, it impacts the *entire branch the statement is contained within*.


================
Comment at: clang/include/clang/Basic/AttrDocs.td:1696
+
+It's not allowed to annotate a statement with both ``likely`` and
+``unlikely``.
----------------
You should clarify here that you CAN annotate multiple statements in the same block and what we do about it. e.g.,
```
if (foo) {
  [[likely]];
  [[unlikely]];
  bar();
}
```
Note, I doubt this will happen as obviously as this example. More likely what will happen is:
```
#define ERROR_UNLESS(x) [[unlikely]] assert(x)

if (foo) {
  [[likely]];
  ERROR_UNLESS(baz == 12);
  bar();
}
```


================
Comment at: clang/include/clang/Basic/AttrDocs.td:1699
+
+These attributes have no effect when using PGO or optimization level 0.
+
----------------
Spell out PGO and "or optimization" -> "or at optimization"


================
Comment at: clang/include/clang/Basic/AttrDocs.td:1726
+  }
+  }];
+}
----------------
Something else we should document is what our behavior is when the attribute is not immediately inside of an `if` or `else` statement. e.g.,
```
void func() { // Does not behave as though specified with [[gnu::hot]]
  [[likely]];
}

void func() {
  if (x) {
    {
      [[likely]]; // Does this make the if branch likely?
      SomeRAIIObj Obj;
    }
  } else {
  }
}

void func() {
  if (x) {
    int y = ({[[likely]]; /* Does this make the if branch likely? */ 1;});
  } else {
  }
}
```


================
Comment at: clang/lib/CodeGen/CGStmt.cpp:656
+namespace {
+//! Searches of the likelihood attributes of a statement.
+//!
----------------
`//!` -> `//`


================
Comment at: clang/lib/CodeGen/CGStmt.cpp:691
+      Visit(S);
+      if (Result != None)
+        return;
----------------
Is early return the correct logic? Won't that give surprising results in the case the programmer has both attributes in the same compound statement? I think we need to look over all the statements in the current block, increment a counter when we hit `[[likely]]`, decrement the counter when we hit `[[unlikely]]` and return whether the counter is 0, negative (unlikely), or positive (likely).


================
Comment at: clang/lib/CodeGen/CGStmt.cpp:723
+  // The user can use conflicting likelihood attributes within one of the
+  // statements or between the statements. These conflicts are ignored and the
+  // first match is used.
----------------
I do not think conflicts should result in the first match being used (unless we're diagnosing the situation, at which point some of this code should be lifted into Sema and set a bit on an AST node that can be read during codegen time), so this comment may need to be updated.


================
Comment at: clang/lib/CodeGen/CGStmt.cpp:733
+  if (Else) {
+    LH = LH = getLikelihood(Else);
+    if (LH.first)
----------------
`LH = LH`?


================
Comment at: clang/lib/CodeGen/CGStmt.cpp:741
+  }
+  return Optional<std::pair<uint32_t, uint32_t>>{};
+}
----------------
You should return `None` here.


================
Comment at: clang/lib/CodeGen/CGStmt.cpp:791
+  llvm::MDNode *Weights = nullptr;
+  uint64_t Count = getProfileCount(S.getThen());
+  if (!Count && CGM.getCodeGenOpts().OptimizationLevel) {
----------------
Perhaps not for this patch, but I wonder if we'd be doing users a good deed by alerting them when PGO weights do not match the attribute specified. e.g., if an attribute says "this branch is likely" and PGO shows it's unlikely, that seems like something the programmer may wish to know. WDYT?


================
Comment at: clang/lib/Sema/SemaStmtAttr.cpp:345
+        S.Diag(Attr->getLocation(), diag::err_attributes_are_not_compatible)
+            << Attr->getSpelling() << Unlikely->getSpelling()
+            << Attr->getRange();
----------------
I believe you can drop the calls to `getSpelling()` and just pass in the `Attr*` directly. The diagnostic engine takes care of printing the attribute properly.


================
Comment at: clang/test/SemaCXX/attr-likelihood.cpp:101
+}
+#endif
----------------
Mordante wrote:
> Mordante wrote:
> > aaron.ballman wrote:
> > > Quuxplusone wrote:
> > > > aaron.ballman wrote:
> > > > > Mordante wrote:
> > > > > > aaron.ballman wrote:
> > > > > > > Mordante wrote:
> > > > > > > > Quuxplusone wrote:
> > > > > > > > > I'd like to see a case like `if (x) [[likely]] i=1;` just to prove that it works on statements that aren't blocks or empty statements. (My understanding is that this should already work with your current patch.)
> > > > > > > > > 
> > > > > > > > > I'd like to see a case like `if (x) { [[likely]] i=1; }` to prove that it works on arbitrary statements. This //should// have the same effect as `if (x) [[likely]] { i=1; }`. My understanding is that your current patch doesn't get us there //yet//. If it's unclear how we'd get there by proceeding along your current trajectory, then I would question whether we want to commit to this trajectory at all, yet.
> > > > > > > > I agree it would be a good idea to add a test like `if (x) [[likely]] i=1;` but I don't feel adding an additional test in this file proves anything. I'll add a test to `clang/test/CodeGenCXX/attr-likelihood-if-branch-weights.cpp` to prove it not only accepts the code but also honours the attribute. This is especially important due to your correct observation that `if (x) { [[likely]] i=1; }` doesn't have any effect. The code is accepted but the CodeGen won't honour the attribute.
> > > > > > > > 
> > > > > > > > I think we can use the current approach, but I need to adjust `getLikelihood`. Instead of only testing whether the current `Stmt` is an `AttributedStmt` it needs to iterate over all `Stmt`s and test them for the attribute. Obviously it should avoid looking into `Stmt`s that also use the attribute, e.g:
> > > > > > > > ```
> > > > > > > > if(b) {
> > > > > > > >     switch(c) {
> > > > > > > >         [[unlikely]] case 0: ... break; // The attribute "belongs" to the switch not to the if
> > > > > > > >         [[likely]] case 1: ... break; // The attribute "belongs" to the switch not to the if
> > > > > > > >     }
> > > > > > > > }
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > This is especially important due to your correct observation that if (x) { [[likely]] i=1; } doesn't have any effect. 
> > > > > > > 
> > > > > > > This code should diagnose the attribute as being ignored.
> > > > > > What I understood from Arthur and rereading the standard this should work, but the initial version of the patch didn't handle this properly.
> > > > > > What I understood from Arthur and rereading the standard this should work, but the initial version of the patch didn't handle this properly.
> > > > > 
> > > > > My original belief was  that it's acceptable for the attribute to be placed there in terms of syntax, but the recommended practice doesn't apply to this case because there's no alternative paths of execution once you've entered the compound statement. That means:
> > > > > ```
> > > > > if (x) [[likely]] { i = 1; }
> > > > > // is not the same as
> > > > > if (x) { [[likely]] i = 1; }
> > > > > ```
> > > > > That said, I can squint at the words to get your interpretation and your interpretation matches what's in the original paper (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0479r2.html#examples).
> > > > > 
> > > > > Ultimately, I think we should strive for consistency between implementations, and I think we should file an issue with WG21 to clarify the wording once we figure out how implementations want to interpret questions like these.
> > > > I don't know WG21's actual rationale, but I can imagine a programmer wanting to annotate something like this:
> > > > 
> > > >     #define FATAL(msg) [[unlikely]] log_and_abort("Fatal error!", msg);
> > > >     ...
> > > >     auto packet = receive();
> > > >     if (packet.malformed()) {
> > > >         save_to_log(packet);
> > > >         FATAL("malformed packet");
> > > >     }
> > > >     ...
> > > > 
> > > > The "absolute unlikeliness" of the malformed-packet codepath is encapsulated within the `FATAL` macro so that the programmer doesn't have to tediously repeat `[[unlikely]]` at some branch arbitrarily far above the `FATAL` point. Compilers actually do this already, every time they see a `throw` — every `throw` is implicitly "unlikely" and taints its whole codepath. We want to do something similar for `[[unlikely]]`.
> > > > 
> > > > It's definitely going to be unsatisfyingly fuzzy logic, though, similar to how inlining heuristics and `inline` are today.
> > > > 
> > > > My understanding is that
> > > > 
> > > >     if (x) { [[likely]] i = 1; }
> > > >     if (x) [[likely]] { i = 1; }
> > > > 
> > > > have exactly the same surface reading: the same set of codepaths exist in both cases, and the same "x true, i gets 1" codepath is `[[likely]]` in both cases. Of course your heuristic might //choose// to treat them differently, just as you might //choose// to treat
> > > > 
> > > >     struct S { int f() { ... } };
> > > >     struct S { inline int f() { ... } };
> > > > 
> > > > differently for inlining purposes despite their identical surface readings.
> > > > Compilers actually do this already, every time they see a throw — every throw is implicitly "unlikely" and taints its whole codepath. We want to do something similar for [[unlikely]].
> > > 
> > > Heh, fun story, one of the use cases I was told was critical when discussing this feature in the committee came from the safety-critical space where they want to mark `catch` blocks as being likely in order to force the optimizer to do its best on the failure path, because that's the code path which had the hard requirements for bailing out quickly (so the elevator doesn't kill you, car doesn't crash, etc). They needed to be able to write:
> > > ```
> > > try {
> > >   ...
> > > } catch (something& s) [[likely]] {
> > >   ...
> > > } catch (...) [[likely]] {
> > >   ...
> > > }
> > > ```
> > > 
> > > > My understanding is that ... have exactly the same surface reading
> > > 
> > > I can read the standard to get to that understanding, but am skeptical that programmers will be able to predict what these attributes do anywhere they're written for anything but contrived examples with that interpretation. I think limiting the impact to only being at branch points is at least explicable behavior, otherwise programmers are left reasoning through code like:
> > > ```
> > > auto what = []{ FATAL("malformed packet"); };
> > > if (foo) {
> > >   what();
> > > } else {
> > >   if (something) {
> > >     what();
> > >   }
> > > } 
> > > ```
> > > (Does this make the `foo` branch unlikely? What impact is there on the `else` branch?) Now replace the lambda with a call to an inline function, or a GNU statement expression, an RAII constructor you can see the definition of, etc and ask the same questions.
> > > 
> > > My fear is that by going with the least restrictive design, programmers will eventually treat these attributes the same way they treat use of the `register` keyword -- something to be avoided because the best you can hope for is the implementation ignores it.
> > > 
> > > This isn't to say that I'm *opposed* to that approach, it's more that I'm skeptical of it unless we have some implementation agreement between major compilers about how to set user expectations appropriately.
> > > > Compilers actually do this already, every time they see a throw — every throw is implicitly "unlikely" and taints its whole codepath. We want to do something similar for [[unlikely]].
> > > 
> > > Heh, fun story, one of the use cases I was told was critical when discussing this feature in the committee came from the safety-critical space where they want to mark `catch` blocks as being likely in order to force the optimizer to do its best on the failure path, because that's the code path which had the hard requirements for bailing out quickly (so the elevator doesn't kill you, car doesn't crash, etc). They needed to be able to write:
> > > ```
> > > try {
> > >   ...
> > > } catch (something& s) [[likely]] {
> > >   ...
> > > } catch (...) [[likely]] {
> > >   ...
> > > }
> > > ```
> > Interesting use case, not sure how feasible it would be to implement it. I don't know how much it would affect the implementation if the exception handling. I tested with GCC and ICC and this doesn't compile. Looking at the standard it shouldn't compile. `catch` http://eel.is/c++draft/except.pre requires a compound statement http://eel.is/c++draft/stmt.block. (Obviously the attribute can be placed in the compound statement.)
> > 
> > A use-case I found where placing the attribute inside the compound statement is when using `goto`. I think this might be especially interesting when using it in C code.
> > ```
> > int foo()
> > {
> >   int result = 0;
> >   struct bar *ptr;
> >   
> >   ptr = malloc(sizeof(struct bar));
> >   if(!ptr) {
> >       result = -ENOMEM;
> >       goto err;
> >   }
> > 
> >   ...
> >   if(!f(ptr)) {
> >     result = -ENOMDEV;
> >     goto err_free;
> >   }
> > 
> > [[unlikely]] err_free:
> >   free(ptr);  
> > [[unlikely]] err:
> >    return result; 
> > }
> > ```
> > This allows to mark the error paths as `[[unlikely]]` by only placing the attribute on the label and every `goto` will mark the branch unlikely.
> > Ultimately, I think we should strive for consistency between implementations, and I think we should file an issue with WG21 to clarify the wording once we figure out how implementations want to interpret questions like these.
> 
> I agree having consistency would be a good.  I did some testing.
> It seems GCC's implementation is quite mature and does indeed handle the attributes inside the compound statements.
> It seems MSVC's implementation seems a WIP.
> It seems ICC accepts the attribute but it doesn't seem to change its branch weights.
> 
> What would be the best way to discuss this question with the other vendors? I assume I first need to write a paper, but would it then be best to discuss this with the other vendors or directly send it to the committee?
> 
> 
> What would be the best way to discuss this question with the other vendors? I assume I first need to write a paper, but would it then be best to discuss this with the other vendors or directly send it to the committee?

I would discuss directly with vendors first so that vendors can talk about what situations they are able to support and how, and then a paper could come second which details any suggested changes or open questions.

That said, for this particular patch, let's go with allowing it on arbitrary statements as GCC is doing. We can always revisit the decision later if we find an intractable problem.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85091/new/

https://reviews.llvm.org/D85091



More information about the llvm-commits mailing list