[clang] 4a0267e - Convert a reachable llvm_unreachable into an assert.

Tue Mar 24 13:08:30 PDT 2020

Thank you for the discussion on IRC about this topic. For reference,
this generated https://reviews.llvm.org/D76721 to clarify the
documentation. I've also reverted the change in this thread in commit
7339fca25facb566e969b6ce01f23ac96499d574, so we're back to
llvm_unreachable in this case.

~Aaron

On Mon, Mar 23, 2020 at 6:36 PM David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
> On Mon, Mar 23, 2020 at 5:24 AM Aaron Ballman <aaron at aaronballman.com> wrote:
>>
>> On Sun, Mar 22, 2020 at 3:31 PM David Blaikie <dblaikie at gmail.com> wrote:
>> >
>> > On Sun, Mar 22, 2020 at 9:34 AM Aaron Ballman <aaron at aaronballman.com> wrote:
>> >>
>> >> On Sun, Mar 22, 2020 at 12:19 PM David Blaikie <dblaikie at gmail.com> wrote:
>> >> >
>> >> > On Sun, Mar 22, 2020 at 6:34 AM Aaron Ballman <aaron at aaronballman.com> wrote:
>> >> >>
>> >> >> On Sat, Mar 21, 2020 at 11:31 PM David Blaikie <dblaikie at gmail.com> wrote:
>> >> >> >
>> >> >> > Why the change? this seems counter to LLVM's style which pretty consistently uses unreachable rather than assert(false), so far as I know?
>> >> >>
>> >> >> We're not super consistent (at least within Clang), but the rules as
>> >> >> I've generally understood them are to use llvm_unreachable only for
>> >> >> truly unreachable code and to use assert(false) when the code is
>> >> >> technically reachable but is a programmer mistake to have gotten
>> >> >> there.
>> >> >
>> >> >
>> >> > I don't see those as two different things personally - llvm_unreachable is used when the programmer believes it to be unreachable (not that it must be proven to be unreachable - we have message text there so it's informative if the assumption turns out not to hold)
>> >>
>> >> The message text doesn't help when the code is reached in release
>> >> mode, which was the issue. Asserts + release you still get some
>> >> helpful text saying "you screwed up". llvm_unreachable in release
>> >> mode, you may get lucky or you may not (in this case, I didn't get
>> >> lucky -- there was no text, just a crash).
>> >
>> >
>> > That doesn't seem to be what it's documented to do:
>> >
>> > /// Marks that the current location is not supposed to be reachable.
>> > /// In !NDEBUG builds, prints the message and location info to stderr.
>> > /// In NDEBUG builds, becomes an optimizer hint that the current location
>> > /// is not supposed to be reachable.  On compilers that don't support
>> > /// such hints, prints a reduced message instead.
>> >
>> > & certainly I think the documentation is what it /should/ be doing.
>> >
>> > /maybe/ https://reviews.llvm.org/rG5f4535b974e973d52797945fbf80f19ffba8c4ad broke that contract on Windows, but I'm not sure how? (an unreachable at the end of that function shouldn't cause the whole function to be unreachable - because abort could have side effects and halt the program before the unreachable is reached)
>>
>> Agreed. It could also be that my machine is in a weird state (I'm
>> currently battling a situation where the command line parser appears
>> to be totally broken on Windows due to misuse of a ManagedStatic
>> somewhere but I've not seen any commits that relate to the issues).
>>
>> >> >> In this particular case, the code is very much reachable
>> >> >
>> >> >
>> >> > In what sense? If it is actually reachable - shouldn't it be tested? (& in which case the assert(false) will get in the way of that testing)
>> >>
>> >> In the sense that normal code paths reach that code easily. Basically,
>> >> that code is checking to see whether a plugin you've written properly
>> >> sets up its options or not. When you're developing a plugin, it's
>> >> quite reasonable to expect you won't get it just right on the first
>> >> try, so you hit the code path but only as a result of you not writing
>> >> the plugin quite right. So under normal conditions (once the plugin is
>> >> working), the code path should not be reached but under development
>> >> the code path gets reached accidentally.
>> >>
>> >> >> and I
>> >> >> spent a lot more time debugging than I should have because I was using
>> >> >> a release + asserts build and the semantics of llvm_unreachable made
>> >> >> unfortunate codegen (switching to an assert makes the issue
>> >> >> immediately obvious).
>> >> >
>> >> >
>> >> > I think it might be reasonable to say that release+asserts to have unreachable behave the same way unreachable behaves at -O0 (which is to say, much like assert(false)). Clearly release+asserts effects code generation, so there's nothing like the "debug info invariance" goal that this would be tainting, etc.
>> >> >
>> >> > Certainly opinions vary here, but here are some commits that show the sort of general preference:
>> >> > http://llvm.org/viewvc/llvm-project?view=revision&revision=259302
>> >> > http://llvm.org/viewvc/llvm-project?view=revision&revision=253005
>> >> > http://llvm.org/viewvc/llvm-project?view=revision&revision=251266
>> >> >
>> >> > And some counter examples:
>> >> > http://llvm.org/viewvc/llvm-project?view=revision&revision=225043
>> >> > Including this thread where Chandler originally (not sure what his take on it is these days) expressed some ideas more along your lines:
>> >> > http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20110919/thread.html#46583
>> >> >
>> >> > But I'm always pretty concerned about the idea that assertions should be used in places where the behavior of the program has any constraints when the assertion is false...
>> >>
>> >> I'm opposed to using unreachability hints on control flow paths you
>> >> expect to reach -- the semantics are just plain wrong, even if you can
>> >> get the same end result of controlled crash + message. In this
>> >> particular case, the code is reachable but erroneous to do so -- and
>> >> that's what assertions are intended to be used for. My preference is
>> >> to use llvm_unreachable because the code is unreachable, not because
>> >> the code should not be reachable only if everything works out right.
>> >>
>> >> It may be that we're not in agreement about the definition of "expects
>> >> to reach" here. To me, this code clearly expects to reach that path:
>> >> it's a search over a finite collection where it's possible the thing
>> >> being searched for is not found. The "we didn't find the thing we were
>> >> expecting to find" assertion is reasonable because this isn't the
>> >> result of end-user error (then we'd fire a diagnostic instead) but is
>> >> the result of a plugin author's error. If the collection and the input
>> >> to the search were both fully under control of the analyzer (so the
>> >> search cannot fail without exceptional circumstances), then I think
>> >> llvm_unreachable could be reasonable.
>> >
>> >
>> > Ah, OK - my approach is generally that programmer errors are programmer errors, whether the mistake is in LLVM code or in code using LLVM and in all cases asserts and unreachable express an intent about the invariants of the code - ie: any violation of them represents a bug where the fix is changing the code (either LLVM code or client code).
>> >
>> > I think in both cases (LLVM internal developers and LLVM external/client developers) we should do what we can to make those failures actionable with an asserts build & I think unreachable is at least /intended/ to provide that functionality when an intended-to-be-unreachable path is mistakenly reached for any reason.
>>
>> Let's ignore the behavioral issues, which we're agreed should behave
>> consistently with assert(false) in a release+asserts build. If there
>> are lingering issues here, I can look into fixing them.
>>
>> What I think we need to clarify in our public documentation is whether
>> reachable code should be marked with llvm_unreachable or not, because
>> it's not clear from the docs or the API itself. My personal position
>> is: do not use llvm_unreachable on code that is possible to reach
>> through typical program control flow. e.g., if there's a valid way to
>> execute the tool such that you hit that code path (so the control flow
>> path is not a bug)
>
>
> I'm not sure I understand the "valid way to execute the tool" - if it's code bug to execute this way, I would define that as "invalid" & if it fails an assertion I'd say that's "invalid".
>
> Much like if user code passed in a null pointer - we could assert it's non-null, but we'd continue on to dereference it, so it's not valid to pass in non-null values as it sounds like it's not valid to getChecker*Option that doesn't meet the CheckerRegistry's validation? That's part of the contract of this function, by the looks of it (by the presence of this assertion)
>
>>
>> then llvm_unreachable is the wrong tool to reach
>> for because the name causes confusion (the name implies "ignore this,
>> the code cannot matter" but the code can still be executed so it has
>> security implications, etc).
>
>
> Not sure I follow - there should be no code in an unreachable branch, because it can/will be optimized away - you can't put security back-stop code in an unreachable block, it won't be preserved.
>
>>
>> If the community consensus is that we do
>> want to use llvm_unreachable in this sort of case, then I think we
>> should rename llvm_unreachable to something more clear, like
>> llvm_bug_if_reached (name can be bikeshed).
>
>
> I'm open to renaming it, though I think that'll be a bigger discussion on llvm-dev and I'm not sure.
>
> Though I'm trying to understand - what did the original llvm_unreachable mean to you that was different from what llvm_bug_if_reached would've communicated to you in that context?
>
>>
>> Either way, I think we
>> should clarify the developer docs to make an explicit statement about
>> this. WDYT?
>
>
> Likely, yeah - I'd be happy to make it more explicit/clear about what these are for. (could cover things like "don't write "if (X) unreachable", instead assert(!X)" which comes up sometimes too)
>
> - Dave
>
>>
>>
>> ~Aaron
>>
>>
>> >
>> > - Dave
>> >
>> >>
>> >>
>> >> ~Aaron
>> >>
>> >> >
>> >> > - Dave
>> >> >
>> >> >>
>> >> >>
>> >> >> >
>> >> >> > On Tue, Mar 10, 2020 at 11:22 AM Aaron Ballman via cfe-commits <cfe-commits at lists.llvm.org> wrote:
>> >> >> >>
>> >> >> >>
>> >> >> >> Author: Aaron Ballman
>> >> >> >> Date: 2020-03-10T14:22:21-04:00
>> >> >> >> New Revision: 4a0267e3ad8c4d47f267d7d960f127e099fb4818
>> >> >> >>
>> >> >> >> URL: https://github.com/llvm/llvm-project/commit/4a0267e3ad8c4d47f267d7d960f127e099fb4818
>> >> >> >> DIFF: https://github.com/llvm/llvm-project/commit/4a0267e3ad8c4d47f267d7d960f127e099fb4818.diff
>> >> >> >>
>> >> >> >> LOG: Convert a reachable llvm_unreachable into an assert.
>> >> >> >>
>> >> >> >> Added:
>> >> >> >>
>> >> >> >>
>> >> >> >> Modified:
>> >> >> >>     clang/lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
>> >> >> >>
>> >> >> >> Removed:
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> ################################################################################
>> >> >> >> diff  --git a/clang/lib/StaticAnalyzer/Core/AnalyzerOptions.cpp b/clang/lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
>> >> >> >> index 01ac2bc83bb6..99e16752b51a 100644
>> >> >> >> --- a/clang/lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
>> >> >> >> +++ b/clang/lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
>> >> >> >> @@ -134,9 +134,9 @@ StringRef AnalyzerOptions::getCheckerStringOption(StringRef CheckerName,
>> >> >> >>      CheckerName = CheckerName.substr(0, Pos);
>> >> >> >>    } while (!CheckerName.empty() && SearchInParents);
>> >> >> >>
>> >> >> >> -  llvm_unreachable("Unknown checker option! Did you call getChecker*Option "
>> >> >> >> -                   "with incorrect parameters? User input must've been "
>> >> >> >> -                   "verified by CheckerRegistry.");
>> >> >> >> +  assert(false && "Unknown checker option! Did you call getChecker*Option "
>> >> >> >> +                  "with incorrect parameters? User input must've been "
>> >> >> >> +                  "verified by CheckerRegistry.");
>> >> >> >>
>> >> >> >>    return "";
>> >> >> >>  }
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> _______________________________________________
>> >> >> >> cfe-commits mailing list
>> >> >> >> cfe-commits at lists.llvm.org
>> >> >> >> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits