[cfe-commits] False positive for -Wunreachable-code

Wed Oct 31 11:13:48 PDT 2012

On Wed, Oct 31, 2012 at 11:04 AM, Abramo Bagnara
<abramo.bagnara at bugseng.com> wrote:
> Il 31/10/2012 18:44, David Blaikie ha scritto:
>> On Wed, Oct 31, 2012 at 10:41 AM, Abramo Bagnara
>> <abramo.bagnara at bugseng.com> wrote:
>>> Il 31/10/2012 18:11, David Blaikie ha scritto:
>>>> On Wed, Oct 31, 2012 at 12:22 AM, Abramo Bagnara
>>>> <abramo.bagnara at bugseng.com> wrote:
>>>>> Il 30/10/2012 19:19, David Blaikie ha scritto:
>>>>>>>>
>>>>>>>> Your definition of "false positive" differs from mine/ours. Hopefully
>>>>>>>> my description above helps describe the motivation here.
>>>>>>>
>>>>>>> Yes, my definition of false positive is different, see:
>>>>>>>
>>>>>>> http://en.wikipedia.org/wiki/Type_I_and_type_II_errors#False_positive_error
>>>>>>>
>>>>>>> "is the default unreachable"? ("is there a wolf near the herd?")
>>>>>>>
>>>>>>> If the message says "warning: will never be executed
>>>>>>> [-Wunreachable-code]" ("Wolf, wolf!") then we have a false positive.
>>>>>>>
>>>>>>> The idea that a warning is a false positive if it has not found a bug in
>>>>>>> the code is rather weird... do you know any warning that is able to
>>>>>>> always find a bug without knowing the programmer intentions?
>>>>>>
>>>>>> Generally, no. They make local assumptions so to make a best effort at
>>>>>> finding bugs. Part of that is also to avoid finding non-bugs because
>>>>>> doing so creates a burden on developers that may eventually eclipse
>>>>>> the benefit they gain from the warning.
>>>>>>
>>>>>> We're not trying to write warnings to teach people about the semantics
>>>>>> of their code - they can read books about that. They're meant to be
>>>>>> actionable for a good reason, not just to indicate that the developer
>>>>>> read & understood the diagnostic message & then went on their merry
>>>>>> way.
>>>>>>
>>>>>>> That apart, of course if we want a compilation switch that deviates from
>>>>>>> the standard and says that an enum typed expression cannot have any
>>>>>>> value different from enum constants specified we can do that, but really
>>>>>>> we want that?
>>>>>>
>>>>>> Not really, no - the compiler implements the standard. For the purpose
>>>>>> of diagnostics we might be interested in adding an attribute to enum
>>>>>> types or variables that could indicate whether that variable or
>>>>>> variables of that type are intended to contain an/all values in the
>>>>>> representable range (to cause us to do things like diagnose the code
>>>>>> after the switch or in the default of a covered switch as reachable,
>>>>>> etc).
>>>>>
>>>>> In the long term this seems indeed the right way to handle that, but I
>>>>> believe that more useful (and in our work of software verifier tools
>>>>> writer indeed mandatory)
>>>>
>>>> Sorry, I don't quite understand this point. If you're implementing a
>>>> verifier on top of Clang then it'll need cross-TU analysis anyway &
>>>> shouldn't need to make these assumptions (ideally it could prove that
>>>> the enum use is safe/never leaks the possibility of lying outside the
>>>> enumerable values but inside the representable range). This might call
>>>> for some API tweaks to the CFGBuilder so that it can take information
>>>> your cross-TU analysis has proven about certain enums, but I don't see
>>>> how a Clang command line option would relate to development of
>>>> software verifier tools. Could you explain in more detail?
>>>
>>> Of course we don't need a command line option, but an option settable
>>> from clang library (e.g. in LangOptions).
>>>
>>> Once this has been done it become natural and savy to interface it to a
>>> clang command line options (although we are interested in that only as
>>> clang compiler users and not for our tools)
>>>
>>> About cross-TU analysis, it is not strictly needed for the unreachable
>>> code check. Of course it increases precision, but is not mandatory.
>>>
>>> The point is that when we rely on info obtained from clang we need to be
>>> sure they does not contains false negative or positives
>>
>> You can't have both though. Without whole program information it's
>> going to have one or the other.
>>
>>> due to
>>> statistical assumptions or arbitrary heuristics (as in the case of a
>>> perfectly reachable default presented as never executed shown in the
>>> mail that has originated this thread).
>
> You've skipped the important part above: the point is not about false
> negative or false positive presence, they often (or always) will be
> there, but about their presence not due to language semantic but due to
> "arbitrary" restrictions of that semantics.
>
> I use "arbitrary" not as a synonim of "random" nor to discredit the
> technical value of the choice of this restrictions: it might be the best
> possible and I definitely believe it is good choice for the declared
> aim, but it is nevertheless a choice that leads to an analysis that
> declare false results (never executed when actually executed).

Not exactly. To reiterate, we all already agree that this
unreachable-code warning is a false positive. We all already agree
that we want to fix it.

The fix you have in mind might be a reasonable workaround for you, but
not the long term plan we have to improve this diagnostic without
producing unnecessary noise in other diagnostics (all the "runtime"
diagnostics). Yes, you can argue that that extra noise are technically
"true positives" in those diagnostics, but that's not really where we
want to go with it. So we'd give people the choice between one evil or
another, which isn't all that helpful compared to fixing unreachable
code the right way without regressing the quality of other
diagnostics.

>
> My proposal is aimed to have the same good heuristic we have now to
> improved signal/noise ratio, but also the possibility to output precise
> info.
>
>>
>> The heuristics aren't arbitrary. They're derived from experience
>> applying these diagnostics to large code bases.
>>
>>>>> is to add a global option (disabled by default)
>>>>> to avoid statistical assumption about programmer intention.
>>>>>
>>>>> (i.e. the option says (among other things): "as the standard does not
>>>>> limit enum typed values to enum constants listed in enum declaration,
>>>>> then this assumption, although if true for 90% of cases, is not taken
>>>>> for granted)
>>>>>
>>>>> This would not harm in any way the use of clang warnings as a way to
>>>>> identify locations that are bug with a probability > 90%, while in the
>>>>> same time makes clang a valid tool to find code weakness according to
>>>>> language standard semantics and to suggest proper defensive programming
>>>>> technique whenever they are needed.
>
>
> --
> Abramo Bagnara
>
> BUGSENG srl - http://bugseng.com
> mailto:abramo.bagnara at bugseng.com