[cfe-commits] False positive for -Wunreachable-code
Jordan Rose
jordan_rose at apple.com
Tue Oct 30 11:32:27 PDT 2012
On Oct 30, 2012, at 11:19 , David Blaikie <dblaikie at gmail.com> wrote:
> On Tue, Oct 30, 2012 at 11:08 AM, Abramo Bagnara
> <abramo.bagnara at bugseng.com> wrote:
>> Il 30/10/2012 18:49, David Blaikie ha scritto:
>>> On Tue, Oct 30, 2012 at 10:39 AM, Abramo Bagnara
>>> <abramo.bagnara at bugseng.com> wrote:
>>>> Il 30/10/2012 18:25, David Blaikie ha scritto:
>>>>> On Tue, Oct 30, 2012 at 10:19 AM, Jordan Rose <jordan_rose at apple.com> wrote:
>>>>>>
>>>>>> On Oct 30, 2012, at 10:17 , David Blaikie <dblaikie at gmail.com> wrote:
>>>>>>
>>>>>>> On Tue, Oct 30, 2012 at 9:25 AM, Jordan Rose <jordan_rose at apple.com> wrote:
>>>>>>>>
>>>>>>>> On Oct 30, 2012, at 2:34 , Abramo Bagnara <abramo.bagnara at bugseng.com> wrote:
>>>>>>>>
>>>>>>>>> Il 28/10/2012 08:12, Abramo Bagnara ha scritto:
>>>>>>>>>> $ cat p.c
>>>>>>>>>> #include <stdio.h>
>>>>>>>>>>
>>>>>>>>>> enum e { a, b = 4 } x = 3;
>>>>>>>>>>
>>>>>>>>>> void g(int v) {
>>>>>>>>>> printf("%d\n", v);
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> int main(int argc, char **argv) {
>>>>>>>>>> switch (x) {
>>>>>>>>>> case a:
>>>>>>>>>> g(0);
>>>>>>>>>> break;
>>>>>>>>>> case b:
>>>>>>>>>> g(1);
>>>>>>>>>> break;
>>>>>>>>>> default:
>>>>>>>>>> g(2);
>>>>>>>>>> break;
>>>>>>>>>> }
>>>>>>>>>> }
>>>>>>>>>> $ _clang -Wunreachable-code -Wcovered-switch-default -O2 p.c
>>>>>>>>>> p.c:17:3: warning: default label in switch which covers all enumeration
>>>>>>>>>> values
>>>>>>>>>> [-Wcovered-switch-default]
>>>>>>>>>> default:
>>>>>>>>>> ^
>>>>>>>>>> p.c:18:7: warning: will never be executed [-Wunreachable-code]
>>>>>>>>>> g(2);
>>>>>>>>>> ^
>>>>>>>>>> $ ./a.out
>>>>>>>>>> 2
>>>>>>>>>>
>>>>>>>>>> Of course -Wcovered-switch-default warning is a perfectly true positive.
>>>>>>>>>>
>>>>>>>>>> My reading of the standard is that nothing prevent an enum to have a
>>>>>>>>>> value different from listed enum constants if this value is compatible
>>>>>>>>>> with enum range (and code generation seems to agree on that).
>>>>>>>>>
>>>>>>>>> I've attached the patch for review.
>>>>>>>>>
>>>>>>>>> The fixed testcase shows well why to hide warnings about undefined
>>>>>>>>> behaviour in code actually generated is a bad thing.
>>>>>>>>
>>>>>>>> If we do this, we're going to want this under a CFG option at the very least. The static analyzer should continue assuming that an enum input to a switch is always a valid enum constant, in order to keep our false positive rate down.
>>>>>>>
>>>>>>> Yeah, I doubt this'll be any better in the compiler proper, really.
>>>>>>> The heuristic exists to, as you rightly point out, reduce false
>>>>>>> positives & that rationale exists in the compiler as well.
>>>>>>>
>>>>>>> While, yes, it means we lose some true positives as well, that
>>>>>>> probably isn't worth the increase in false positives.
>>>>>>
>>>>>> I can see Abramo's point, however, that in a more defensive coding style the current -Wunreachable warning can easily be considered a false positive. We don't optimize out the default case in an enum-covered switch.
>>>>>
>>>>> Flagging this as unreachable code is a bug & should be fixed - but
>>>>> probably in the way I described. Treating it purely as reachable code
>>>>> & emitting our 'runtime' diagnostics for code in such situations will
>>>>> (I believe - though I haven't run numbers) increase false positive
>>>>> rates substantially.
>>>>>
>>>>> A trivial example that GCC often warns about & Clang deliberately does not:
>>>>>
>>>>> int func(enum X v) {
>>>>> switch (v) {
>>>>> case A: return 1;
>>>>> case B: return 2;
>>>>> ... // fully covered
>>>>> case Z: return 26;
>>>>> }
>>>>> // GCC warns that the function may exit without a return value here,
>>>>> Clang does not
>>>>> // the LLVM/Clang codebase has a lot of llvm_unreachables after
>>>>> fully covered/exiting
>>>>> // switches like this to silence GCC's warning. Each of those is,
>>>>> essentially, a GCC
>>>>> // false positive (in the sense that the code is not buggy).
>>>>> }
>>>>
>>>> :-o
>>>>
>>>> Unless I'm missing something, the code is definitely buggy and leads to
>>>> undefined behaviour in C++ entering with v = Z + 1. Note that entering
>>>> with v = Z + 1, is not per se an undefined behavior: only the missing
>>>> return causes that.
>>>>
>>>> Can we at least agree on that?
>>>
>>> Yes & no. Yes a program exhibits UB if the function is called with v =
>>> Z + 1, no the code is not (necessarily) buggy if the function is never
>>> called with such invalid values.
>>>
>>> If code is written in such a way as to not violate that invariant,
>>> then the warning is a false positive (it has not found a bug in the
>>> code). If people often write code with this invariant then the false
>>> positive rate is "high" and the true positive rate is "not so high",
>>> so we try to avoid warning & producing more noise than good advice.
>>> (it's obviously not a 1:1 ratio, and it's certainly a judgement call)
>>>
>>>> If we'd agree on that we will easily discover that my proposed change
>>>> does not lead to any false positive diagnostics, that GCC is right and
>>>
>>> Your definition of "false positive" differs from mine/ours. Hopefully
>>> my description above helps describe the motivation here.
>>
>> Yes, my definition of false positive is different, see:
>>
>> http://en.wikipedia.org/wiki/Type_I_and_type_II_errors#False_positive_error
>>
>> "is the default unreachable"? ("is there a wolf near the herd?")
>>
>> If the message says "warning: will never be executed
>> [-Wunreachable-code]" ("Wolf, wolf!") then we have a false positive.
>>
>> The idea that a warning is a false positive if it has not found a bug in
>> the code is rather weird... do you know any warning that is able to
>> always find a bug without knowing the programmer intentions?
>
> Generally, no. They make local assumptions so to make a best effort at
> finding bugs. Part of that is also to avoid finding non-bugs because
> doing so creates a burden on developers that may eventually eclipse
> the benefit they gain from the warning.
>
> We're not trying to write warnings to teach people about the semantics
> of their code - they can read books about that. They're meant to be
> actionable for a good reason, not just to indicate that the developer
> read & understood the diagnostic message & then went on their merry
> way.
We try to guess people's intentions all the time. -Wmissing-semi will insert the semicolon and continue compiling as if it had been there. -Wparentheses is a controversial style warning, but it catches real bugs. We don't warn about tautological comparisons in templates because many times the template itself isn't tautological, only the instantiation.
When talking about the analyzer I've sometimes called behavior like this "pedantically true"—cases where something could happen and we have no way to check it within this translation unit, but it most likely won't and programmers shouldn't have to spend time thinking about it. That applies in both directions: we shouldn't give a -Wreturn-type warning if there's a fully-covered enum switch, but we shouldn't call it out as being unreachable either. So I agree with your latest e-mail in the thread.
(The llvm_unreachable does allow for optimization that we wouldn't get otherwise, so we probably should stop calling it a sop to GCC.)
>
>> What about enum designed to be used to represent a mask?
>
> They happen, certainly. I'm not at all claiming that people never
> deliberately put values outside the enum constants into an enum value.
This is legal in C++11 and probably in C11 as well, but for what it's worth, doing a switch on a value whose type is an enum of bit-flags is probably almost always the wrong thing to do.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20121030/b5671168/attachment.html>
More information about the cfe-commits
mailing list