[PATCH] D146466: [clang] diagnose function fallthrough

Aaron Ballman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 23 10:02:55 PDT 2023


aaron.ballman added a comment.

In D146466#4213814 <https://reviews.llvm.org/D146466#4213814>, @nickdesaulniers wrote:

> In D146466#4210607 <https://reviews.llvm.org/D146466#4210607>, @aaron.ballman wrote:
>
>> In D146466#4208216 <https://reviews.llvm.org/D146466#4208216>, @dblaikie wrote:
>>
>>> If the issue is with the actual fallthrough - that's been discussed elsewhere recently, and I think it'd be reasonable to turn on TrapOnUnreachable everywhere, basically (with NoTrapAfterNoreturn too, perhaps, to keep the cost low) so that these trivial fallthroughs hit int3 instead of going on to the next function.
>>
>> As I understand things, folks are confused about the runtime behavior of falling through and magically calling some other function. If we can warn about that situation from the FE without false positives, then huttah, I think that's sufficient. But I don't think we know of a way to do that from the FE, so I think ensuring a runtime trap is more helpful to folks than the status quo. (Also, in general, I think we want to avoid emitting diagnostics from the backend -- that has usability issues where clean `-fsyntax-only` builds suddenly get warnings when doing full codegen, or in circumstances where the optimization level changes the diagnostic output, etc.)
>
> The issue is that there are problems that may occur as a result of optimizations; if the front end does not do optimizations, it cannot diagnose these.  Some examples include:
>
> - `-Wframe-larger-than=` (useful for embedded systems without "nearly unlimited" stack, though an imperfect proxy for maximal stack depth, which is what these systems actually want. Helpful for spotting large stack allocations that should come from the heap or static memory).
> - `-Wattribute-warning` (some implementations of `_FORTIFY_SOURCE` such as the Linux kernel's rely on this; specifically that calls to functions with these attributes get optimized out. This provides //significant// protections during //compile time// when //due to optimizations// we have an out of bounds access, when compared to implementations that simply use `_Static_assert`)
> - `-Wfunction-fallthrough` (hypothetical new diagnostic this patch is adding)
>
> Even if Clang had a high level IR and did optimizations, it still could not guarantee the results of LLVM's subsequent optimizations.
>
> Does that mean that `-Wframe-larger-than=` (and the like) should not exist because it's not diagnosable via `-fsyntax-only`?
>
> I //do// think that there are a class of diagnostics that are useful to users that //depend// on optimizations being run.  While a policy around new diagnostics being diagnosable via `-fsyntax-only` is a simplistic litmus test, I do think it allows for our competition to implement diagnostics that may be more useful to users.  This hamstrings our competitiveness, IMO.

Okay, I was not very thorough in explaining my concerns about diagnostics coming out of the optimizer. `-fsyntax-only` behavior is one facet of the issue, but the problem boils down to "mysterious compiler behavior" as far as the user is concerned. Other examples of this same problem are:

- User is compiling everything locally with their debug build (-O0) and everything compiles fine. They commit it and it gets built as a release build (-O2 or -O3), and now they get alerted to problems.
- User is compiling everything locally with their optimized build and everything compiles fine. They commit it and it gets built on another machine with significantly more resources than the developer's machine had, so optimization decisions differ and now they get alerted to problems.
- User is compiling everything locally with their optimized build and everything compiles fine; then they close their web browser, fix a typo in a comment, recompile, and now they get alerted to problems because the available resources changed the optimizer's decisions.
- Any use of `-Werror` with optimization-based diagnostics is effectively playing compiler Russian roulette.

The root of the problem is that we want diagnostics to be reproducible, and diagnostics generated by the optimizer often are not reliably reproducible. GCC has already gone down this path, and it causes confusion in practice:

https://stackoverflow.com/questions/59388740/gcc-shows-different-warnings-depending-on-optimisation-level
https://stackoverflow.com/questions/65049644/gcc-warning-about-unitialized-values-caused-by-turning-optimization-on
https://developers.redhat.com/blog/2019/03/13/understanding-gcc-warnings-part-2#

That's why we've traditionally limited the backend to producing remarks instead of warnings or errors. It's a user-experience tradeoff. That's not to say there's never a valid reason to diagnose from the backend, but I think we need to be very mindful of the user experience when we add such diagnostics. Linux kernel developers are one class of user (and likely don't have much confusion over optimization-based diagnostics), but we've got plenty of users for whom the compiler is effectively a black box, and those are the folks I worry about with optimization-based diagnostics.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146466/new/

https://reviews.llvm.org/D146466


