[PATCH] D135551: [clang] replace `assert(0)` with `llvm_unreachable` NFC

Tue Oct 11 08:54:22 PDT 2022

aaron.ballman added a comment.

In D135551#3847962 <https://reviews.llvm.org/D135551#3847962>, @dblaikie wrote:

> In D135551#3847607 <https://reviews.llvm.org/D135551#3847607>, @aaron.ballman wrote:
>
>> In D135551#3847444 <https://reviews.llvm.org/D135551#3847444>, @dblaikie wrote:
>>
>>> I thought this was settled quite a while ago and enshrined in the style guide: https://llvm.org/docs/CodingStandards.html#assert-liberally
>>>
>>> `assert(0)` should not be used if something is reachable. We shouldn't have a "this violates an invariant, but if you don't have asserts enabled you do get some maybe-acceptable behavior".
>>>
>>> I feel fairly strongly that any cases of "reachable asserts" should be changed to valid error handling or `llvm_report_error` and remaining `assert(0)` should be transformed to `llvm_unreachable`. (also, ideally, don't use branch-to-`unreachable` where possible, instead assert the condition - in cases where the `if` has side effects, sometimes that's the nicest way to write it, but might be clearer, if more verbose to use a local variable for the condition, then assert that the variable is true (and have the requisite "variable might be unused" cast))
>>
>> I would be okay with that, but that's not what this patch was doing -- it was changing `assert(0)` into an `llvm_unreachable` more mechanically, and I don't think that's a valid transformation. The key, to me, is not losing the distinction between "reaching here is a programming mistake that you'd make during active development" vs "we never expect to reach this patch and want to optimize accordingly."
>
> I don't really think those are different things, though. Violating invariants is UB and there's no discussion to be had about how the program (in a non-asserts build) behaves when those invariants are violated - all bets are off, whether it's assert or unreachable.

I think they're the same thing in terms of runtime behavior, but I feel (rather strongly) that they're different in terms of documentation when reading the source. This code pattern exists and keeps coming up year after year, which is sufficient to inform me that the community thinks there is *some* distinction to be made there. Also, the fact that we have report_fatal_error *and* unreachable APIs signals that we already understand there's a distinction between "reaching here will never happen; optimize accordingly" and "reaching here is a surprising mistake".

>> `__builtin_unreachable` changes the debugging landscape far too much for me to want to see folks using it for "reaching here is a programming mistake" situations, *especially* in RelWithDebInfo builds where optimizations are enabled and may result in surprising call stacks and time travel debugging.
>
> Generally LLVM's pretty hard to fathom in a non-asserts build anyway, right? (that's the first thing any of us do is reproduce with an assertions build that may fail miles away from where a crash occurred because an invariant was violated much earlier) - that `cast` won't crash/will continue on happily in a non-asserts build seems like a much larger hole to debuggability of a non-asserts build than any unreachable?

This might be true -- personally, I tend to only use debug builds with MSVC because RelWithDebInfo isn't sufficient for my daily needs. However, I've definitely heard of folks who use RelWithDebInfo for their daily work (RelWithDebInfo + Asserts specifically, IIRC) because of the improved build times and runtime performance; we should be sure we're not disrupting that workflow too much.

>>>> Historically, we've usually used llvm_unreachable for situations where we're saying "this code cannot be reached; if it can, something else has gone seriously wrong." For example, in code like: int foo(SomeEnum E) { switch (E) { case One: return 1; default: return 2; } llvm_unreachable("getting here would be mysterious"); } and we've used assert(0) for situations where we're saying "this code is possible to reach only when there were mistakes elsewhere which broke invariants we were relying on."
>>>
>>> I don't think those are different things though - violating invariants is ~= something going seriously wrong.
>>>
>>> (sorry, I guess I should debate this on the thread instead of here - but I think most of the folks on that thread did agree with the LLVM style guide/the direction here)
>>>
>>> This, I think was also discussed about a decade ago in the LLVM community and resulted in r166821/2962d9599e463265edae599285bbc6351f1cc0ef which specifically "Suggests llvm_unreachable over assert(0)" and is the policy of LLVM - this change is consistent with that policy.
>>
>> I don't have the context for those changes in my email either, but regardless of what we thought ten years ago, we have code in the code base today that assumes a difference in severity between kinds of unreachable statements so we need to be careful when correcting mistakes. I think we're still in agreement that `llvm_unreachable` should be preferred over `assert(0)` in situations where the code is expected to be impossible to reach. I think we're also still in agreement that "correct error reporting" is preferred over `assert(0)`. Where we might still have daylight are the occasional situations where `assert(0)` gives a better experience -- when the code is possible to reach but reaching it signifies a developer (not user) mistake that is plausible to make when doing new development, such as when misusing the C interface (where there isn't always an error that should be reported via the API). I think these uses should generally be rare, but I don't think it's a "never do this" kind of situation.
>
> That still seems like something that should be caught by an asserts build and is UB otherwise. If you're developing against LLVM's APIs in a non-assertions build, there's a lot of other invariants that won't be checked (`cast` being a pretty core example) and will make debugging really difficult.
>
>> `llvm_unreachable` asserts in debug mode, so it has the nice property we need when doing new development. But it doesn't convey the distinction in semantics. Maybe another approach is: `#define developer_bugcheck(msg) llvm_unreachable(msg)` (or something along those lines). We still get the assertion in debug mode, we then get the better optimization in release mode, but we don't lose the semantic information in the code base. It doesn't really help the RelWithDebInfo case though (where call stacks may be unrecognizable as a result of time travel optimizations) but maybe that's a tradeoff worth making?
>
> I don't really see the distinction, though - it's a violated invariant, which is a developer bug, whether it's from an assert on unreachable. They're both expressing invariants - not of different strength. "Valid code should never reach this branch" and "valid code should never produce a false result here".

There are different kinds of developer bugs and it's reasonable for people to want to express them differently, IMO. Concretely with a contrived example:

  enum E { Zero, One, Two };

  int func(E Val) {
    switch (Val) {
    case Zero: return 12;
    case One: return 200;
    case Two: return -1;
    }
    llvm_unreachable("never get here");
  }

This is a case where the code is technically reachable (someone can call `func((E)3)`) but the programmer's intent it "nobody will ever do that" because it's an internal API, not exposed to C, other safeguards like diagnostics ensure it, or whatever. Contrast this with:

  enum E { Zero, One, Two };

  extern "C" int func(unsigned Val) {
    switch (Val) {
    case Zero: return 12;
    case One: return 200;
    case Two: return -1;
    }
    assert(0 && "never get here");
  }

Same general situation, but now the safeguards are no longer in place. The function is externally callable by anyone who wants to use it, the interface has lost the connection to the enumeration, etc. So the code is also technically reachable, but the author wants to signal the difference in plausibility of reaching the erroneous state.

I definitely agree that the end results of reaching the unreachable bits are the same in terms of what happens at runtime. But I'm worried about maintaining the code -- losing the distinction between the two situations makes the person debugging the failure have to figure out whether someone missed adding a safeguard elsewhere vs something more difficult to solve such as a miscompile from the host implementation, stack stomping, etc.

I don't care whether we spell it `assert(0)` or `report_fatal_error()` or something else. I care that we don't have a policy requiring folks to not make a distinction between the two scenarios aside from leaving comments in the code as I think that's a step backwards.

> I'd really like to avoid what I see as cleanup - replacing one construct (assert(false)) with another (llvm_unreachable) of identical (to me) contractual strength - being slowed because of this distinction. Replacing reachable-unreachables, or reachable-false-assertions with llvm_report_error, to me, is orthogonal to this cleanup.

Understood. I'm pushing back on this being a cleanup in all cases.

> (& for the C API - I think assertions/unreachable are the right thing, not llvm_report_error - I'm sure there's many more ways to violate the C API contract that would result in inexplicable/confusing behavior and that will only ever be protected by assertions, than we might cover by a few llvm_report_errors on the interface & that should be the direction we encourage people to go down - develop with assertions enabled)
>
> Basically llvm_report_error ends up/should be a "we couldn't find a better way to do error handling here, but it is really a reachable error we have to do something with" - each one is a bug in the library-ness of LLVM. (or it's used up in a tool implementation (above the library layer) and then it's fine/just a convenient way to error/exit, though probably still isn't so good for diagnostic quality from such tools). If it's not reachable by a correct usage (be it from a command line tool or API) then it should be an assertion, not an llvm_report_error.
>
> But I know this issue comes up from time to time and I've yet to figure out a way to come to shared understandings on it :/ so I don't expect my rather narrow and firm definition to necessarily carry the day on this.

Concrete guidance around reachability that I would be happy with is:

- Prefer a user-facing diagnostic in any circumstance that's under the user's direct control (through command line options, source code, etc); users should not get failed assertions/crashes as a form of error reporting.
- Prefer `assert(condition)` in any circumstance under which there is a condition to be checked to validate the state of the system. The only use of a literal in such an assertion should be a string literal to use as a message. e.g., no `assert(0);`
- Prefer `llvm_report_error()` in any circumstance under which a code path is functionally possible to reach, but only in erroneous executions that signify a mistake on the part of the LLVM developer elsewhere in the program.
- Prefer `llvm_unreachable()` in any circumstance under which a code path is believed to be functionally impossible to reach (even if technically possible to reach). The API is now self-documenting to mean "this code really should be totally unreachable".

Something is functionally possible to reach when there are missing safeguards elsewhere that should have prevented calling the function in that state. Something is functionally impossible to reach when miscompiles, undefined behavior, or malicious users modifying memory with a debugger (etc) are the realistic ways to reach that state. When in doubt as to whether something is functionally possible to reach or not, use your best judgement but prefer `llvm_report_error` on the assumption that most reachability mistakes are LLVM developer errors rather than "extenuating circumstance" kind of situations.

That said, I have no idea if I'm being too pedantic here. I'm basing this off my own experiences with the code base as well as internal discussions with other folks at Intel working on the project and what their expectations are (both as new-to-the-code-base folks and long-time contributors), but this is a pretty small selection of people.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135551/new/

https://reviews.llvm.org/D135551