[PATCH] D135551: [clang] replace `assert(0)` with `llvm_unreachable` NFC

Tue Oct 11 13:25:58 PDT 2022

dblaikie added a subscriber: rnk.
dblaikie added a comment.

In D135551#3850391 <https://reviews.llvm.org/D135551#3850391>, @aaron.ballman wrote:

> In D135551#3850308 <https://reviews.llvm.org/D135551#3850308>, @dblaikie wrote:
>
>> In D135551#3850266 <https://reviews.llvm.org/D135551#3850266>, @inclyc wrote:
>>
>>> This makes sense! However I think `assert(0)` should not be used in this case, we could expose another `llvm_unreachable`-like api and probably `llvm_report_error` shall be fine. Are there some changed assertions actually "Aspirationally unreachable" in this patch?
>>
>> No, I really don't think  we should go down that path.
>>
>> I believe these are not actually distinct cases - in either case, the program has UB if they violated the invariants/preconditions - whether or not they called through the C API.
>
> The C Index test cases I commented on earlier in the review are a good example of when there's no UB but we still want to alert people to the problem of code they should not be reaching. The assumption that "reached here unexpectedly" == "UB" is invalid. Some things are logic bugs that exhibit no UB.
>
>> unreachable is no more a guarantee/proven thing than an assertion - both are written by humans and a claim "if this is reached-or-false, there is a bug in some code, somewhere". The statement is not stronger in the unreachable case and the style guide supports that perspective and the way we triage/treat bugs is pretty consistent with that - we get bugs all the time when an unreachable is reached and that doesn't seem to surprise most/anyone - we treat it the same as a bug when an assertion fires.
>>
>> The discourse discussion, I think, supports this ^ perspective.
>>
>> As there's still disagreement, should this escalate to the RFC process to change the style guide, Aaron?
>
> Yes, I would appreciate that. I don't think we're interpreting our policy the same way. Specifically "Use llvm_unreachable to mark a specific point in code that should never be reached."

In the same way that an assert says "This condition should never be false" - I use "should" in the same sense in both unreachable and assert, and I believe that's the prevailing opinion of LLVM developers/the LLVM style guide.

Perhaps we are also at a deadlock as to who should write the proposal... :/

> - "should" is turning out to be interpreted in two ways:
>
> - "used to indicate obligation, duty, or correctness, typically when criticizing someone's actions. e.g., he should have been careful": I am asserting it is impossible to reach this.

This is the "should" ^ I mean, and what every assert should mean too. This code assumes this property to be true - this is a precondition of the code.

We should not be using asserts where we don't mean this. I'm OK assuming every assert does mean "this is a precondition" and treating them that way in terms of transforming them to unreachable or anything else we might do with them - and if some of them don't mean it, then they're buggy and we can fix them, but assert->unreachable doesn't make the situation any worse.

Any code behind/after an assert is untested and unvalidated - we can't say "if you violate this constraint it'll actually be OK" because we've never tested that/don't know that.

> - "used to indicate what is probable. e.g., $348 million should be enough to buy him out": I am asserting you probably won't get here, but you won't be happy if you do.
>
> In D135551#3850266 <https://reviews.llvm.org/D135551#3850266>, @inclyc wrote:
>
>> This makes sense! However I think `assert(0)` should not be used in this case, we could expose another `llvm_unreachable`-like api and probably `llvm_report_error` shall be fine. Are there some changed assertions actually "Aspirationally unreachable" in this patch?
>
> I'm totally fine not using `assert(0)` and using an `llvm_unreachable`-like API (or even using a macro to dispatch to `llvm_unreachable` under a different name).
>
> There are more aspirationally unreachable issues in this patch, I've commented on the ones I spotted, but I stopped commenting pretty quickly because I think a lot of the cases are made slightly worse by switching to `llvm_unreachable` instead of more targeted changes. I'd be especially curious to hear what @dblaikie thinks of the suggestions I have though -- it might be easier to see the distinction with real world code (or it might not!).

Maybe - happy to talk about a few of the examples, but I'm not feeling super optimistic that we'll come to an understanding here, unfortunately :/

@rnk's comment here ( https://discourse.llvm.org/t/llvm-unreachable-is-widely-misused/60587/3 ) pretty well sums up my understanding/values here and it looks like on that thread, mostly the long term LLVM developers agree with this perspective and are trying to explain it to the (so far as I can tell) relative new/outside developer.

================
Comment at: clang/include/clang/AST/Redeclarable.h:265
         if (PassedFirst) {
-          assert(0 && "Passed first decl twice, invalid redecl chain!");
+          llvm_unreachable("Passed first decl twice, invalid redecl chain!");
           Current = nullptr;
----------------
aaron.ballman wrote:
> This looks like it should probably be: `assert(!PassedFirst && "Passed first decl twice, invalid redecl chain!");` rather than an `if` statement with recovery mechanisms.
Yep, I've commented on similar things in the past - not sure we ever got it into the style guide, but "branch to unreachable" should be avoided in favor of assert-the-branch-condition (assuming it has no side effects, or if it does have side effects - do the thing, store the result in a local variable, void cast it to suppress unused-variable warnings and assert it)

but I still think this is a valid transformation - it's just not the whole transformation, so I'm OK with things like this being done mechanically and then cleaned up further (possibly also mechanically) later.

================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h:601
     assert(isa<ObjCMethodDecl>(FD));
-    assert(false && "Getting the type of ObjCMethod is not supported yet");
+    llvm_unreachable("Getting the type of ObjCMethod is not supported yet");

----------------
aaron.ballman wrote:
> This looks like it possibly should have been an error reported to the user? If not, this is a wonderful example of what I mean by aspirationally unreachable code. We aspire to not getting here -- but we can get here just the same and there is not UB as a result (we return a valid object, UB might happen elsewhere based on invalid assumptions of what is returned).
"we" can get here in what sense? Is there source code that can be passed to the static analyzer that can reach this? Then it shouldn't be an assert, it should be exercised and tested - even if that test says "hey, this doesn't do anything good yet". I'd assume, whether I see the assert or the unreachable that it isn't reachable from clang on the command line - and that any code that makes it reachable/calls it under this condition is incorrect (for now, until support is added) and would be considered a bug to be fixed. Whether it's the assert or the unreachable, I consider it to be the same statement of intent.

================
Comment at: clang/lib/AST/ASTImporter.cpp:9976
   else {
-    assert(0 && "CompleteDecl called on a Decl that can't be completed");
+    llvm_unreachable("CompleteDecl called on a Decl that can't be completed");
   }
----------------
aaron.ballman wrote:
> According to our style guide, this probably should have been `assert(isa<ObjCInterfaceDecl, ObjCProtocolDecl, TagDecl>(D) && "CompleteDecl called on a Decl that can't be completed");` but IMO that's worse than using `assert(0)` because it's less maintainable (any time you add a new else if chain you have to also update the assert). I think `llvm_unreachable` is wrong to use here.
Yep, this is a case where branch-to-unreachable might be nicer than contorting the situation into `assert(isa...`) - actually, probably not even that. I'd likely change this code to unconditionally cast, relying on the cast's assertion.

Why is llvm_unreachable wrong here? If you end up with a non-TagDecl here, you're certainly in UB territory - sooner or later, you're going to treat the non-TagDecl as a TagDecl and it's not going to be pretty.

================
Comment at: clang/lib/AST/ExprConstant.cpp:7561-7564
     if (Source == E) {
-      assert(0 && "OpaqueValueExpr recursively refers to itself");
+      llvm_unreachable("OpaqueValueExpr recursively refers to itself");
       return Error(E);
     }
----------------
aaron.ballman wrote:
> Probably should be `assert(Source != E && "OpaqueValueExpr recursively refers to itself");`
Yep, though, again, I think it's a valid transformation even if it doesn't go as far as it could.

================
Comment at: clang/lib/AST/Interp/ByteCodeExprGen.cpp:136-137

   default:
-    assert(false && "Cast not implemented");
+    llvm_unreachable("Cast not implemented");
   }
----------------
aaron.ballman wrote:
> I think this can just be removed, right? There's another unreachable for falling out of the `switch` that seems to be covering the same situation.
Presumably the switch isn't fully covered (otherwise we'd get a `-Wcovered-switch-default` warning).

Might be that the unreachable after the switch can be removed in this case.

================
Comment at: clang/lib/AST/Interp/ByteCodeExprGen.cpp:598
       } else {
-        assert(false && "Unhandled type in array initializer initlist");
+        llvm_unreachable("Unhandled type in array initializer initlist");
       }
----------------
aaron.ballman wrote:
> The rest of the ones here are somewhat interesting in that the interpreter is an experiment under active development and is known to be incomplete. In all of these cases, I think the switch to unreachable is flat-out wrong -- these asserts serve explicitly to find unimplemented cases when we hit them.
& I don't see why unreachable is any different a statement than assert(false) in these cases... - it's the same statement of intent. "if this is reached you've found a bug" (in this case, a missing feature)

But I'd be sort of OK changing all these to report_fatal_error. But, again, I think the assert(false) -> unreachable is a valid transformation and doesn't make anything worse than it already is, but improves the code by being more consistent and removing this confusion that there might be something different about assert(false) when, I believe, there isn't.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135551/new/

https://reviews.llvm.org/D135551