[cfe-commits] Support <x>-to-bool warnings for more contexts

Wed Apr 18 15:42:29 PDT 2012

On Wed, Apr 18, 2012 at 3:04 PM, Nico Weber <thakis at chromium.org> wrote:
> On Wed, Apr 18, 2012 at 11:36 AM, David Blaikie <dblaikie at gmail.com> wrote:
>> On Fri, Apr 6, 2012 at 4:42 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>> On Mon, Apr 2, 2012 at 9:44 PM, Nico Weber <thakis at chromium.org> wrote:
>>>> Do you have any numbers on bug / false positive ratios before and
>>>> after this change?
>>>
>>> I'm surprised this didn't catch more - but I found only 2 cases where
>>> this diagnostic fired (on the same function call, no less) & they seem
>>> like perfectly reasonable true positives. Something like:
>>>
>>> void func(bool, bool);
>>> func(0.7, 0.3);
>>>
>>> I'm not really sure what the author intended, but I'm fairly certain
>>> they didn't get it (unless their intent was to confuse future
>>> readers).
>>
>> So this was a little more positive than it looks - these were the new
>> warnings for -Wliteral-conversion that were found by this patch. The
>> new warnings for -Wconstant-conversion (these were the vast majority
>> of the new warnings for my change  - though we don't use
>> -Wnull-conversion at the moment, so I haven't measured the increase in
>> that warning, for example) are a bit more difficult.
>>
>> While a lot of cases were legitimate, there are a few major false
>> positive cases:
>
> This sounds to me like "more trouble than it's worth". Did you find
> any interesting bugs with this?

Quite a few, yes. Here's a smattering of examples:

enum X { A, B, COUNT };
std::vector<bool> b(true, COUNT);

x &= !flag; // in xerces, actually

void log_if(int severity, bool condition);
log_if(condition, 3);

bool func() { ... return ERROR_CODE_FOO; } // various kinds of error
codes, often enums

bool b;
int i;
...
b = 10; // user seems to have jumbled up the variables, or their types
i = true;
// similar mistakes to this, except with function calls
("set_new_uid(5)" when the flag was really about whether a new uid is
created, not specifying the uid value itself)
// a lot of these, admittedly, come up in test code where more
constants are used - though I'm not sure how much better that makes me
feel about them

void func(int);
func(FLAG1 || FLAG2); // should be FLAG1 | FLAG2

if (FLAG1 || FLAG2) // should be "(x == FLAG1 || x == FLAG2)"

bool maxThings = INT_MAX; // fairly clear mistake in the declaration
of this type
void func(int);
func(maxThings);

(x & !(sizeof(void*) - 1)) // probably meant '~' not '!', I believe

if (0 == x && FLAG) // similar to previous examples

bool status;
...
status = -4; // yay, random constants!

while (1729) // I've no idea what this person had in mind... but OK,
probably working as they intended

if (some_const % other_const) // false positive

bool func() {
  ...
    return 0;
  ...
    return 1;
  ...
    return 2; // aha! :/
}

Well, that's a rough sample - the enum flag kind of cases seem pretty
common, or just passing literals of the wrong type to functions or
constructors (sometimes not as literals, but as constants defined
elsewhere).

>
> Nico
>
>>
>> * in the /existing/ warning, we have a 'false'-ish positive involving
>> code like this: int i = std::string::npos; ... if (i ==
>> std::string::npos) - npos is actually, say, LONG_MAX, so when stored
>> in an int it truncates to -1, but it compares == to -1 just fine.
>> Perhaps we could subcategorize -Wconstant-conversion to allow these
>> particular cases that happen to map back/forth non-destructively?
>>
>> * The major case of false positives with my improved warning amounts
>> to a use case like this: #define MY_ALLOC(Type, Count)
>> malloc(sizeof(Type) * ((Count) ? Count : 1)) // the actual code is a
>> bit more involved, but it's in Python's PyMem_NEW macro
>>  The problem is that when you pass a compile-time constant count, now
>> we appear to be truncating an integer (stuffing that big count into
>> zero or one of a boolean). It would be nice if we could somehow detect
>> the case where a macro parameter is used inside a constant expression
>> & flag that constant expression as "not so constant". This logic will
>> be necessary for improvements to Clang's unreachable code diagnostic
>> anyway (we need to know when constant expressions might still vary
>> depending on the build settings (or 'call' sites in the case of
>> macros))
>>  * equally, improvements to allow for sizeof expressions to trigger
>> similar "not quite constant" flags would be good. While "if
>> (sizeof(X))" is silly & we can happily warn on that, "if (sizeof(X) -
>> 3)" might be less clear cut (or sizeof in some other part of a
>> constant expression) - though I haven't seen (m)any false positives
>> like this.
>>
>> * Template parameters - this leads to code a lot like macros:
>> template<int N> void func() { ... if (N) { ... } }; I've currently
>> worked around this by having "IgnoreParenImpCasts" not ignore
>> SubstNonTypeTemplateParmExprs - this is a bit of a dirty hack (both
>> because this code was presumably written this way for a reason -
>> though removing it doesn't regress any test cases - and because I
>> don't think it falls down as soon as N is a subexpression such as "if
>> (N - 3)")
>>
>> Any thoughts on whether or not these are reasonable goals and how best
>> to achieve them would be most welcome,
>>
>> - David
>>
>>>
>>> - David
>>>
>>>> On Mon, Apr 2, 2012 at 6:03 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>>>> SemaChecking.cpp:3989 currently returns early from checking implicit
>>>>> conversions after it tests some specific X-to-boolean cases (including
>>>>> string and funciton literals) but before checking various other cases
>>>>> later on like NULL-to-X and wide integer literal to narrow integer.
>>>>>
>>>>> This change removes the early return, fixes the diagnostic (to
>>>>> correctly emit the fact that non-zero literals produce a "true"
>>>>> boolean value rather than simply truncating the larger integer
>>>>> literal), and updates the tests. In some cases the test cases were
>>>>> fixed or updated (//expected-warning), in others I simply suppressed
>>>>> the diagnostic because there adding the expected-warnings would've
>>>>> added a lot of noise to the test cases*.
>>>>>
>>>>> * This last case is a little bit questionable: in one specific case we
>>>>> produce a really good diagnostic about constant integer literals used
>>>>> in boolean contexts:
>>>>>
>>>>> int f1();
>>>>> bool f2() {
>>>>>  return f1() && 42;
>>>>> }
>>>>>
>>>>> we produce:
>>>>>
>>>>> conv.cpp:3:15: warning: use of logical '&&' with constant operand
>>>>> [-Wconstant-logical-operand]
>>>>>  return f1() && 42;
>>>>>              ^  ~~
>>>>> conv.cpp:3:15: note: use '&' for a bitwise operation
>>>>>  return f1() && 42;
>>>>>              ^~
>>>>>              &
>>>>> conv.cpp:3:15: note: remove constant to silence this warning
>>>>>  return f1() && 42;
>>>>>             ~^~~~~
>>>>>
>>>>> But then with my patch we get an extra diagnostic after the above warning/notes:
>>>>>
>>>>> conv.cpp:3:18: warning: implicit conversion from 'int' to 'bool'
>>>>> changes value from 42 to true [-Wconstant-conversion]
>>>>>  return f1() && 42;
>>>>>              ~~ ^~
>>>>>
>>>>> which isn't great - since we already gave a much more specific
>>>>> diagnosis of the problem in the first warning. If there's some nice
>>>>> way that we could suppress the second one whenever the first one is
>>>>> provided (unless the first one is only a warning and the second is
>>>>> -Werror'd?) I'd be happy to implement that.
>>>>>
>>>>>
>>>>>
>>>>> Another thing I noticed as I was exploring this. We have a warning for
>>>>> float-literal-to-int such as:
>>>>>
>>>>> conv.cpp:2:9: warning: implicit conversion turns literal
>>>>> floating-point number into integer: 'double' to 'int'
>>>>> [-Wliteral-conversion]
>>>>> int i = 3.1415;
>>>>>    ~   ^~~~~~
>>>>>
>>>>> But this warning is off-by-default. Why is that? It's already
>>>>> relatively conservative (allowing things like : "int i = 3.0" because
>>>>> 3.0 converts to an int without loss of precision) - though it's not a
>>>>> DiagnoseRuntimeBehavior, which it could be changed to (to be
>>>>> consistent with similar things for integers like "unsigned char c =
>>>>> 256").
>>>>>
>>>>> Or is it really that common to deliberately use floating point
>>>>> literals to initialize integer values?
>>>>>
>>>>> _______________________________________________
>>>>> cfe-commits mailing list
>>>>> cfe-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>>>>