[cfe-commits] [patch] Warn in string literal + integral

Nico Weber thakis at chromium.org
Fri Mar 2 09:53:15 PST 2012


On Thu, Mar 1, 2012 at 11:12 PM, Richard Smith <richard at metafoo.co.uk> wrote:
> On Thu, Mar 1, 2012 at 9:26 PM, Nico Weber <thakis at chromium.org> wrote:
>>
>> >> > On Tue, Feb 21, 2012 at 7:17 PM, Nico Weber <thakis at chromium.org>
>> >> > wrote:
>> >> >>
>> >> >> the attached patch adds -Wstring-plus-int (I'm open to more creative
>> >> >> names :-) ), which warns on code like |"error code: " + err|. The
>> >> >> warning found 4 bugs in 2 files in chromium. The bugs it found look
>> >> >> like this:
>> >> >>
>> >> >> File 1:
>> >> >>    if (!read_node.InitByIdLookup(it->id)) {
>> >> >>      error_handler()->OnUnrecoverableError(FROM_HERE, "Failed to
>> >> >> look
>> >> >> up "
>> >> >>          " data for received change with id " + it->id);
>> >> >>      return;
>> >> >>    }
>> >> >>
>> >> >> File 2:
>> >> >>    ResetStream(stream_id, spdy::INVALID_ASSOCIATED_STREAM,
>> >> >>      "Received OnSyn with inactive associated stream " +
>> >> >>      associated_stream_id);
>> >> >> (Fix:
>> >> >>
>> >> >> http://codereview.chromium.org/9372076/diff/1/net/spdy/spdy_session.cc
>> >> >> )
>> >> >>
>> >> >> (A coworker found the bug in the one file, which prompted me writing
>> >> >> the warning. The warning then found the bug in the other file.)
>> >> >>
>> >> >>
>> >> >> When building all of chromium and its many dependencies, the warning
>> >> >> did also find 3 false positives, but they all look alike: Ffmepg
>> >> >> contains three different functions that all look like this:
>> >> >
>> >> >
>> >> > 4 bugs and 3 false positives doesn't sound great to me.
>> >>
>> >> True. All of the the 3 false positives are really the same snippet
>> >> though. (On the other hand, 3 of the 4 bugs are very similar as well.)
>> >> The bugs this did find are crash bugs, and they were in rarely run
>> >> error-logging code, so it does find interesting bugs. Maybe you guys
>> >> can run it on your codebase after it's in, and we can move it out of
>> >> -Wmost if you consider it too noisy in practice?
>> >
>> >
>> > Sounds reasonable.
>> >
>> >>
>> >> > Have you considered
>> >> > relaxing the warning in cases where the integral summand is a
>> >> > constant
>> >> > expression and is in-bounds? Would this pattern have matched any of
>> >> > your
>> >> > real bugs?
>> >>
>> >> That's a good idea (for my bugs, the int was always a declrefexpr,
>> >> never a literal). The ffmpeg example roughly looks like this:
>> >>
>> >>  #define A "foo"
>> >>  #define B "bar"
>> >>  consume(A B + sizeof(A) - 1);
>> >>
>> >> The RHS is just "sizeof(A)" without the "- 1", but since A B has a
>> >> length of 6, this still makes this warning go away. I added this to
>> >> the patch. With this change, it's 4 bugs / 0 false positives.
>> >>
>> >> Note that this suppresses the warning for most enums which start at 0.
>> >> Or did you mean to do this only for non-enum constant expressions?
>> >
>> >
>> > I could imagine code legitimately wanting to do something like this:
>> >
>> > template</*...*/>
>> > struct S {
>> >   enum { Offset = /* ... */ };
>> >   static const char *str = "some string" + Offset;
>> > };
>> >
>> > That said, I'm happy for us to warn on such code. Whatever you prefer
>> > seems
>> > fine to me; we can refine this later, if a need arises.
>>
>> Enums are handled like ints: They warn if the offset is out of bounds,
>> but don't if it's they are in bounds.
>>
>> >> > Onto the patch...
>> >> >
>> >> > Index: include/clang/Basic/DiagnosticGroups.td
>> >> > ===================================================================
>> >> > --- include/clang/Basic/DiagnosticGroups.td (revision 150418)
>> >> > +++ include/clang/Basic/DiagnosticGroups.td (working copy)
>> >> > @@ -320,6 +321,7 @@
>> >> >      ReturnType,
>> >> >      SelfAssignment,
>> >> >      SizeofArrayArgument,
>> >> > +    StringPlusInt,
>> >> >      Trigraphs,
>> >> >      Uninitialized,
>> >> >      UnknownPragmas,
>> >> >
>> >> > Given the current level of false positives, I'm not completely
>> >> > convinced
>> >> > this should go into -Wmost. I imagine we'll know more once we've run
>> >> > this on
>> >> > more code.
>> >> >
>> >> > Index: include/clang/Basic/DiagnosticSemaKinds.td
>> >> > ===================================================================
>> >> > --- include/clang/Basic/DiagnosticSemaKinds.td (revision 150418)
>> >> > +++ include/clang/Basic/DiagnosticSemaKinds.td (working copy)
>> >> > @@ -3414,6 +3414,12 @@
>> >> >    "explicitly assigning a variable of type %0 to itself">,
>> >> >    InGroup<SelfAssignment>, DefaultIgnore;
>> >> >
>> >> > +def warn_string_plus_int : Warning<
>> >> > +  "Adding an integral to a string does not append to the string">,
>> >> >
>> >> > Diagnostics outside the static analyzer should not be capitalized.
>> >>
>> >> Done.
>> >>
>> >> > +  InGroup<StringPlusInt>;
>> >> > +def note_string_plus_int_silence : Note<
>> >> > +  "use &(str[index]) if pointer arithmetic is intended">;
>> >> >
>> >> > The parentheses here are redundant. I think I'd prefer for the
>> >> > suggestion to
>> >> > be provided via fixits (and to reword this as something like "use
>> >> > array
>> >> > indexing to silence this warning").
>> >>
>> >> Clang tends to suggest very explicit parentheses (see
>> >> -Wparentheses-logical-op for example).
>> >
>> >
>> > Usually only in cases where there is a potential bug. If we saw
>> > &"foo"[2],
>> > we wouldn't be likely to suspect that the user might have really meant
>> > (&"foo")[2], so my (weak) preference is to suggest the simpler fix.
>>
>> Removed the parens.
>>
>> > (Incidentally, -Warray-bounds should be taught to complain about
>> > (&"foo")[2]!)
>>
>> Patches accepted! :-D
>>
>> >> I did change it to a fixit and
>> >> reworded, it now looks like so:
>> >>
>> >> test/SemaCXX/string-plus-int.cpp:12:17: warning: adding an integral to
>> >> a string does not append to the string [-Wstring-plus-int]
>> >>  consume("foo" + 5);
>> >>          ~~~~~~^~~
>> >> test/SemaCXX/string-plus-int.cpp:12:17: note: use array indexing to
>> >> silence this warning
>> >>  consume("foo" + 5);
>> >>                ^
>> >>          &(    )[ ]
>> >
>> >
>> > This looks nice ... though if we're going to have parens in the fixit,
>> > they
>> > should contain the [] :-)
>> >
>> >>
>> >> > Index: lib/Sema/SemaExpr.cpp
>> >> > ===================================================================
>> >> > --- lib/Sema/SemaExpr.cpp (revision 150418)
>> >> > +++ lib/Sema/SemaExpr.cpp (working copy)
>> >> > @@ -7798,6 +7798,20 @@
>> >> >      ParensRange);
>> >> >  }
>> >> >
>> >> > +/// DiagnoseStringPlusInt - Emit a warning when adding an integer to
>> >> > a
>> >> > string
>> >> > +/// literal.
>> >> > +static void DiagnoseStringPlusInt(Sema &Self, SourceLocation OpLoc,
>> >> > +                                  Expr *LHSExpr, Expr *RHSExpr) {
>> >> > +  bool IsStringPlusInt = dyn_cast<StringLiteral>(LHSExpr) &&
>> >> > +      RHSExpr->getType()->isIntegralType(Self.getASTContext());
>> >> >
>> >> > We should probably also check for (unscoped) enumeration types, and
>> >> > for
>> >> > class types with implicit conversion to int. Doing so will require
>> >> > moving
>> >> > this check to checkArithmeticOpPointerOperand (after we've checked
>> >> > for
>> >> > an
>> >> > overloaded operator+).
>> >>
>> >> Added a check for (unscoped) enumeration types. class types with
>> >> implicit conversion to int feel like overkill to me.
>> >
>> >
>> > OK. You should still move the check to after we check for an overloaded
>> > operator+, since operator+ can be overloaded for enums. (Once that's
>> > done,
>> > you may find it's easier to warn on such implicitly-convertible types
>> > than
>> > to not do so.)
>>
>> Done, added an enum-overloaded operator+ to the test.
>>
>> >> > +  if (!IsStringPlusInt)
>> >> > +    return;
>> >> > +
>> >> > +  SourceRange DiagRange(LHSExpr->getLocStart(),
>> >> > RHSExpr->getLocEnd());
>> >> > +  Self.Diag(OpLoc, diag::warn_string_plus_int) << DiagRange;
>> >> > +  Self.Diag(OpLoc, diag::note_string_plus_int_silence);
>> >> > +}
>> >> >
>> >> > We should also catch the case where the integer is on the LHS (and
>> >> > skip
>> >> > the
>> >> > fixit in that case).
>> >>
>> >> I tried that. It found neither additional bugs nor additional false
>> >> positives, so it's probably not worth it?
>> >
>> >
>> > Fine, there's no point adding this if it doesn't actually catch any
>> > bugs. I
>> > have only empirical evidence that it might -- I once saw[*] the moral
>> > equivalent of:
>> >
>> >   int n = /*...*/;
>> >   std::string s = n + " widgets found";
>> >
>> > [*] a long time ago, in a codebase far away...
>>
>> Ok, I added the warning for the int on the left as well (without fixit
>> in that case, like you suggested upthread).
>
>
> Just a few more things:
>
> +def warn_string_plus_int : Warning<
> +  "adding an integral to a string does not append to the string">,
> +  InGroup<StringPlusInt>;
>
> Rather than "an integral", can we say something like:
>
>   adding %0 to a string literal does not append to the string
>
> (where %0 is the type)?

Done.

> +  if (IndexExpr->EvaluateAsInt(index, Self.getASTContext())) {
> +    if (index.isNonNegative() &&
> +        index < llvm::APSInt(llvm::APInt(index.getBitWidth(),
> +                                         StrExpr->getByteLength()),
>
> This should be <=, not <, I think: we want to allow "foo"+4 (a pointer past
> the null) but warn on "foo"+5 (which has undefined behavior). And
> getByteLength() should be getLength(), to properly handle wide string
> literals. We'll get duplicate warnings for the same issue if
> -Warray-bounds-pointer-arithmetic is enabled and we carry on here, but such
> code is rare enough that I don't think that's a pressing concern.

Done (-ish: getLength() didn't include the trailing \0, so I'm doing
<= getLength() + 1), added a test.

> Other than that, a test for the fix-it would be great (see test/FixIt for
> examples).

Wouldn't that require the fixit to be on the warning instead of the
note? (If so, maybe there's no need for the test yet?)

Nico
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clang-str-plus-int-v4.patch
Type: application/octet-stream
Size: 10706 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120302/727a1391/attachment.obj>


More information about the cfe-commits mailing list