[cfe-commits] [patch] Warn in string literal + integral

Nico Weber thakis at chromium.org
Thu Mar 1 21:26:06 PST 2012


>> > On Tue, Feb 21, 2012 at 7:17 PM, Nico Weber <thakis at chromium.org> wrote:
>> >>
>> >> the attached patch adds -Wstring-plus-int (I'm open to more creative
>> >> names :-) ), which warns on code like |"error code: " + err|. The
>> >> warning found 4 bugs in 2 files in chromium. The bugs it found look
>> >> like this:
>> >>
>> >> File 1:
>> >>    if (!read_node.InitByIdLookup(it->id)) {
>> >>      error_handler()->OnUnrecoverableError(FROM_HERE, "Failed to look
>> >> up "
>> >>          " data for received change with id " + it->id);
>> >>      return;
>> >>    }
>> >>
>> >> File 2:
>> >>    ResetStream(stream_id, spdy::INVALID_ASSOCIATED_STREAM,
>> >>      "Received OnSyn with inactive associated stream " +
>> >>      associated_stream_id);
>> >> (Fix:
>> >> http://codereview.chromium.org/9372076/diff/1/net/spdy/spdy_session.cc
>> >> )
>> >>
>> >> (A coworker found the bug in the one file, which prompted me writing
>> >> the warning. The warning then found the bug in the other file.)
>> >>
>> >>
>> >> When building all of chromium and its many dependencies, the warning
>> >> did also find 3 false positives, but they all look alike: Ffmepg
>> >> contains three different functions that all look like this:
>> >
>> >
>> > 4 bugs and 3 false positives doesn't sound great to me.
>>
>> True. All of the the 3 false positives are really the same snippet
>> though. (On the other hand, 3 of the 4 bugs are very similar as well.)
>> The bugs this did find are crash bugs, and they were in rarely run
>> error-logging code, so it does find interesting bugs. Maybe you guys
>> can run it on your codebase after it's in, and we can move it out of
>> -Wmost if you consider it too noisy in practice?
>
>
> Sounds reasonable.
>
>>
>> > Have you considered
>> > relaxing the warning in cases where the integral summand is a constant
>> > expression and is in-bounds? Would this pattern have matched any of your
>> > real bugs?
>>
>> That's a good idea (for my bugs, the int was always a declrefexpr,
>> never a literal). The ffmpeg example roughly looks like this:
>>
>>  #define A "foo"
>>  #define B "bar"
>>  consume(A B + sizeof(A) - 1);
>>
>> The RHS is just "sizeof(A)" without the "- 1", but since A B has a
>> length of 6, this still makes this warning go away. I added this to
>> the patch. With this change, it's 4 bugs / 0 false positives.
>>
>> Note that this suppresses the warning for most enums which start at 0.
>> Or did you mean to do this only for non-enum constant expressions?
>
>
> I could imagine code legitimately wanting to do something like this:
>
> template</*...*/>
> struct S {
>   enum { Offset = /* ... */ };
>   static const char *str = "some string" + Offset;
> };
>
> That said, I'm happy for us to warn on such code. Whatever you prefer seems
> fine to me; we can refine this later, if a need arises.

Enums are handled like ints: They warn if the offset is out of bounds,
but don't if it's they are in bounds.

>> > Onto the patch...
>> >
>> > Index: include/clang/Basic/DiagnosticGroups.td
>> > ===================================================================
>> > --- include/clang/Basic/DiagnosticGroups.td (revision 150418)
>> > +++ include/clang/Basic/DiagnosticGroups.td (working copy)
>> > @@ -320,6 +321,7 @@
>> >      ReturnType,
>> >      SelfAssignment,
>> >      SizeofArrayArgument,
>> > +    StringPlusInt,
>> >      Trigraphs,
>> >      Uninitialized,
>> >      UnknownPragmas,
>> >
>> > Given the current level of false positives, I'm not completely convinced
>> > this should go into -Wmost. I imagine we'll know more once we've run
>> > this on
>> > more code.
>> >
>> > Index: include/clang/Basic/DiagnosticSemaKinds.td
>> > ===================================================================
>> > --- include/clang/Basic/DiagnosticSemaKinds.td (revision 150418)
>> > +++ include/clang/Basic/DiagnosticSemaKinds.td (working copy)
>> > @@ -3414,6 +3414,12 @@
>> >    "explicitly assigning a variable of type %0 to itself">,
>> >    InGroup<SelfAssignment>, DefaultIgnore;
>> >
>> > +def warn_string_plus_int : Warning<
>> > +  "Adding an integral to a string does not append to the string">,
>> >
>> > Diagnostics outside the static analyzer should not be capitalized.
>>
>> Done.
>>
>> > +  InGroup<StringPlusInt>;
>> > +def note_string_plus_int_silence : Note<
>> > +  "use &(str[index]) if pointer arithmetic is intended">;
>> >
>> > The parentheses here are redundant. I think I'd prefer for the
>> > suggestion to
>> > be provided via fixits (and to reword this as something like "use array
>> > indexing to silence this warning").
>>
>> Clang tends to suggest very explicit parentheses (see
>> -Wparentheses-logical-op for example).
>
>
> Usually only in cases where there is a potential bug. If we saw &"foo"[2],
> we wouldn't be likely to suspect that the user might have really meant
> (&"foo")[2], so my (weak) preference is to suggest the simpler fix.

Removed the parens.

> (Incidentally, -Warray-bounds should be taught to complain about
> (&"foo")[2]!)

Patches accepted! :-D

>> I did change it to a fixit and
>> reworded, it now looks like so:
>>
>> test/SemaCXX/string-plus-int.cpp:12:17: warning: adding an integral to
>> a string does not append to the string [-Wstring-plus-int]
>>  consume("foo" + 5);
>>          ~~~~~~^~~
>> test/SemaCXX/string-plus-int.cpp:12:17: note: use array indexing to
>> silence this warning
>>  consume("foo" + 5);
>>                ^
>>          &(    )[ ]
>
>
> This looks nice ... though if we're going to have parens in the fixit, they
> should contain the [] :-)
>
>>
>> > Index: lib/Sema/SemaExpr.cpp
>> > ===================================================================
>> > --- lib/Sema/SemaExpr.cpp (revision 150418)
>> > +++ lib/Sema/SemaExpr.cpp (working copy)
>> > @@ -7798,6 +7798,20 @@
>> >      ParensRange);
>> >  }
>> >
>> > +/// DiagnoseStringPlusInt - Emit a warning when adding an integer to a
>> > string
>> > +/// literal.
>> > +static void DiagnoseStringPlusInt(Sema &Self, SourceLocation OpLoc,
>> > +                                  Expr *LHSExpr, Expr *RHSExpr) {
>> > +  bool IsStringPlusInt = dyn_cast<StringLiteral>(LHSExpr) &&
>> > +      RHSExpr->getType()->isIntegralType(Self.getASTContext());
>> >
>> > We should probably also check for (unscoped) enumeration types, and for
>> > class types with implicit conversion to int. Doing so will require
>> > moving
>> > this check to checkArithmeticOpPointerOperand (after we've checked for
>> > an
>> > overloaded operator+).
>>
>> Added a check for (unscoped) enumeration types. class types with
>> implicit conversion to int feel like overkill to me.
>
>
> OK. You should still move the check to after we check for an overloaded
> operator+, since operator+ can be overloaded for enums. (Once that's done,
> you may find it's easier to warn on such implicitly-convertible types than
> to not do so.)

Done, added an enum-overloaded operator+ to the test.

>> > +  if (!IsStringPlusInt)
>> > +    return;
>> > +
>> > +  SourceRange DiagRange(LHSExpr->getLocStart(), RHSExpr->getLocEnd());
>> > +  Self.Diag(OpLoc, diag::warn_string_plus_int) << DiagRange;
>> > +  Self.Diag(OpLoc, diag::note_string_plus_int_silence);
>> > +}
>> >
>> > We should also catch the case where the integer is on the LHS (and skip
>> > the
>> > fixit in that case).
>>
>> I tried that. It found neither additional bugs nor additional false
>> positives, so it's probably not worth it?
>
>
> Fine, there's no point adding this if it doesn't actually catch any bugs. I
> have only empirical evidence that it might -- I once saw[*] the moral
> equivalent of:
>
>   int n = /*...*/;
>   std::string s = n + " widgets found";
>
> [*] a long time ago, in a codebase far away...

Ok, I added the warning for the int on the left as well (without fixit
in that case, like you suggested upthread).

Nico
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clang-str-plus-int-v3.patch
Type: application/octet-stream
Size: 10026 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120301/7c1c483e/attachment.obj>


More information about the cfe-commits mailing list