r337470 - [Sema] Add a new warning, -Wmemset-transposed-args

Richard Smith via cfe-commits cfe-commits at lists.llvm.org
Thu Jul 26 12:52:27 PDT 2018


Other than the (fixed) ffmpeg false positive, I see this firing in three
places.

One of them is in the libc tests for memset and bzero, where the 0 argument
is intentional.
One of them is here:
https://github.com/apache/xerces-c/blob/trunk/src/xercesc/util/XMLUTF16Transcoder.cpp#L114
(note that this code is correct).
And one of them was a real bug where the second and third arguments of
memset were transposed.

That's an extremely low false positive rate, much lower than what we'd
expect for an enabled-by-default warning. I find this extremely surprising;
I would have expected the ratio of true-- to false-positives to be much
much higher. But ultimately data wins out over expectations.

How does our experience compare to other people's experiences? Are our
findings abnormal? (They may well be; our corpus is very C++-heavy.) If
this is indeed typical, we should reconsider whether to have these warnings
enabled by default, as they do not meet our bar for false positives.

On Mon, 23 Jul 2018 at 09:28, Erik Pilkington via cfe-commits <
cfe-commits at lists.llvm.org> wrote:

> Sure, that seems pretty reasonable. r337706. Thanks for the suggestion!
>
> On 7/21/18 1:40 PM, Arthur O'Dwyer wrote:
>
> In this case Clang is complaining about
>
>     memset(complicated-expr, 0, 0);
>
> which means that transposing the last two arguments would not "fix"
> anything. Clang ought to not complain in this case.
> It doesn't have anything to do with coming from a macro; it's just that
> *both* arguments are zero.
> (I would also expect that `memset(complicated-expr, (unsigned char)fill,
> (size_t)0)` shouldn't warn, either. But we had some disagreement that casts
> to `(unsigned char)` resp. `(size_t)` should be the right way to shut up
> the warning in the first place...)
> Basically Clang should only be suggesting a swap (and thus complaining at
> all) in the case that swapping the arguments would actually improve the
> situation.
>
> –Arthur
>
> On Sat, Jul 21, 2018 at 1:33 PM, Nico Weber via cfe-commits <
> cfe-commits at lists.llvm.org> wrote:
>
>> This has a false positive on ffmpeg:
>>
>> ../../third_party/ffmpeg/libavcodec/options.c:273:64: error: 'size'
>> argument to memset is '0'; did you mean to transpose the last two
>> arguments? [-Werror,-Wmemset-transposed-args]
>> alloc_and_copy_or_fail(intra_matrix, 64 * sizeof(int16_t), 0);
>>
>> With:
>>
>> #define alloc_and_copy_or_fail(obj, size, pad) \
>>     if (src->obj && size > 0) { \
>>         dest->obj = av_malloc(size + pad); \
>>         if (!dest->obj) \
>>             goto fail; \
>>         memcpy(dest->obj, src->obj, size); \
>>         if (pad) \
>>             memset(((uint8_t *) dest->obj) + size, 0, pad); \
>>     }
>>
>> (
>> https://cs.chromium.org/chromium/src/third_party/ffmpeg/libavcodec/options.c?q=alloc_and_copy_or_fail&sq=package:chromium&g=0&l=261
>> ; https://bugs.chromium.org/p/chromium/issues/detail?id=866202)
>>
>> Maybe the warning shouldn't fire if the memset is from a macro?
>>
>> On Thu, Jul 19, 2018 at 12:51 PM Erik Pilkington via cfe-commits <
>> cfe-commits at lists.llvm.org> wrote:
>>
>>> Author: epilk
>>> Date: Thu Jul 19 09:46:15 2018
>>> New Revision: 337470
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=337470&view=rev
>>> Log:
>>> [Sema] Add a new warning, -Wmemset-transposed-args
>>>
>>> This diagnoses calls to memset that have the second and third arguments
>>> transposed, for example:
>>>
>>>   memset(buf, sizeof(buf), 0);
>>>
>>> This is done by checking if the third argument is a literal 0, or if the
>>> second
>>> is a sizeof expression (and the third isn't). The first check is also
>>> done for
>>> calls to bzero.
>>>
>>> Differential revision: https://reviews.llvm.org/D49112
>>>
>>> Added:
>>>     cfe/trunk/test/Sema/transpose-memset.c
>>> Modified:
>>>     cfe/trunk/include/clang/Basic/DiagnosticGroups.td
>>>     cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
>>>     cfe/trunk/lib/Sema/SemaChecking.cpp
>>>
>>> Modified: cfe/trunk/include/clang/Basic/DiagnosticGroups.td
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticGroups.td?rev=337470&r1=337469&r2=337470&view=diff
>>>
>>> ==============================================================================
>>> --- cfe/trunk/include/clang/Basic/DiagnosticGroups.td (original)
>>> +++ cfe/trunk/include/clang/Basic/DiagnosticGroups.td Thu Jul 19
>>> 09:46:15 2018
>>> @@ -443,6 +443,13 @@ def : DiagGroup<"synth">;
>>>  def SizeofArrayArgument : DiagGroup<"sizeof-array-argument">;
>>>  def SizeofArrayDecay : DiagGroup<"sizeof-array-decay">;
>>>  def SizeofPointerMemaccess : DiagGroup<"sizeof-pointer-memaccess">;
>>> +def MemsetTransposedArgs : DiagGroup<"memset-transposed-args">;
>>> +def DynamicClassMemaccess : DiagGroup<"dynamic-class-memaccess">;
>>> +def NonTrivialMemaccess : DiagGroup<"nontrivial-memaccess">;
>>> +def SuspiciousBzero : DiagGroup<"suspicious-bzero">;
>>> +def SuspiciousMemaccess : DiagGroup<"suspicious-memaccess",
>>> +  [SizeofPointerMemaccess, DynamicClassMemaccess,
>>> +   NonTrivialMemaccess, MemsetTransposedArgs, SuspiciousBzero]>;
>>>  def StaticInInline : DiagGroup<"static-in-inline">;
>>>  def StaticLocalInInline : DiagGroup<"static-local-in-inline">;
>>>  def GNUStaticFloatInit : DiagGroup<"gnu-static-float-init">;
>>>
>>> Modified: cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td?rev=337470&r1=337469&r2=337470&view=diff
>>>
>>> ==============================================================================
>>> --- cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td (original)
>>> +++ cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td Thu Jul 19
>>> 09:46:15 2018
>>> @@ -619,14 +619,14 @@ def warn_cstruct_memaccess : Warning<
>>>    "%select{destination for|source of|first operand of|second operand
>>> of}0 this "
>>>    "%1 call is a pointer to record %2 that is not trivial to "
>>>    "%select{primitive-default-initialize|primitive-copy}3">,
>>> -  InGroup<DiagGroup<"nontrivial-memaccess">>;
>>> +  InGroup<NonTrivialMemaccess>;
>>>  def note_nontrivial_field : Note<
>>>    "field is non-trivial to %select{copy|default-initialize}0">;
>>>  def warn_dyn_class_memaccess : Warning<
>>>    "%select{destination for|source of|first operand of|second operand
>>> of}0 this "
>>>    "%1 call is a pointer to %select{|class containing a }2dynamic class
>>> %3; "
>>>    "vtable pointer will be %select{overwritten|copied|moved|compared}4">,
>>> -  InGroup<DiagGroup<"dynamic-class-memaccess">>;
>>> +  InGroup<DynamicClassMemaccess>;
>>>  def note_bad_memaccess_silence : Note<
>>>    "explicitly cast the pointer to silence this warning">;
>>>  def warn_sizeof_pointer_expr_memaccess : Warning<
>>> @@ -655,7 +655,19 @@ def note_memsize_comparison_paren : Note
>>>    "did you mean to compare the result of %0 instead?">;
>>>  def note_memsize_comparison_cast_silence : Note<
>>>    "explicitly cast the argument to size_t to silence this warning">;
>>> -
>>> +def warn_suspicious_sizeof_memset : Warning<
>>> +  "%select{'size' argument to memset is '0'|"
>>> +  "setting buffer to a 'sizeof' expression}0"
>>> +  "; did you mean to transpose the last two arguments?">,
>>> +  InGroup<MemsetTransposedArgs>;
>>> +def note_suspicious_sizeof_memset_silence : Note<
>>> +  "%select{parenthesize the third argument|"
>>> +  "cast the second argument to 'int'}0 to silence">;
>>> +def warn_suspicious_bzero_size : Warning<"'size' argument to bzero is
>>> '0'">,
>>> +  InGroup<SuspiciousBzero>;
>>> +def note_suspicious_bzero_size_silence : Note<
>>> +  "parenthesize the second argument to silence">;
>>> +
>>>  def warn_strncat_large_size : Warning<
>>>    "the value of the size argument in 'strncat' is too large, might lead
>>> to a "
>>>    "buffer overflow">, InGroup<StrncatSize>;
>>>
>>> Modified: cfe/trunk/lib/Sema/SemaChecking.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaChecking.cpp?rev=337470&r1=337469&r2=337470&view=diff
>>>
>>> ==============================================================================
>>> --- cfe/trunk/lib/Sema/SemaChecking.cpp (original)
>>> +++ cfe/trunk/lib/Sema/SemaChecking.cpp Thu Jul 19 09:46:15 2018
>>> @@ -8629,24 +8629,26 @@ static const CXXRecordDecl *getContained
>>>    return nullptr;
>>>  }
>>>
>>> +static const UnaryExprOrTypeTraitExpr *getAsSizeOfExpr(const Expr *E) {
>>> +  if (const auto *Unary = dyn_cast<UnaryExprOrTypeTraitExpr>(E))
>>> +    if (Unary->getKind() == UETT_SizeOf)
>>> +      return Unary;
>>> +  return nullptr;
>>> +}
>>> +
>>>  /// If E is a sizeof expression, returns its argument expression,
>>>  /// otherwise returns NULL.
>>>  static const Expr *getSizeOfExprArg(const Expr *E) {
>>> -  if (const UnaryExprOrTypeTraitExpr *SizeOf =
>>> -      dyn_cast<UnaryExprOrTypeTraitExpr>(E))
>>> -    if (SizeOf->getKind() == UETT_SizeOf && !SizeOf->isArgumentType())
>>> +  if (const UnaryExprOrTypeTraitExpr *SizeOf = getAsSizeOfExpr(E))
>>> +    if (!SizeOf->isArgumentType())
>>>        return SizeOf->getArgumentExpr()->IgnoreParenImpCasts();
>>> -
>>>    return nullptr;
>>>  }
>>>
>>>  /// If E is a sizeof expression, returns its argument type.
>>>  static QualType getSizeOfArgType(const Expr *E) {
>>> -  if (const UnaryExprOrTypeTraitExpr *SizeOf =
>>> -      dyn_cast<UnaryExprOrTypeTraitExpr>(E))
>>> -    if (SizeOf->getKind() == UETT_SizeOf)
>>> -      return SizeOf->getTypeOfArgument();
>>> -
>>> +  if (const UnaryExprOrTypeTraitExpr *SizeOf = getAsSizeOfExpr(E))
>>> +    return SizeOf->getTypeOfArgument();
>>>    return QualType();
>>>  }
>>>
>>> @@ -8742,6 +8744,86 @@ struct SearchNonTrivialToCopyField
>>>
>>>  }
>>>
>>> +/// Detect if \c SizeofExpr is likely to calculate the sizeof an object.
>>> +static bool doesExprLikelyComputeSize(const Expr *SizeofExpr) {
>>> +  SizeofExpr = SizeofExpr->IgnoreParenImpCasts();
>>> +
>>> +  if (const auto *BO = dyn_cast<BinaryOperator>(SizeofExpr)) {
>>> +    if (BO->getOpcode() != BO_Mul && BO->getOpcode() != BO_Add)
>>> +      return false;
>>> +
>>> +    return doesExprLikelyComputeSize(BO->getLHS()) ||
>>> +           doesExprLikelyComputeSize(BO->getRHS());
>>> +  }
>>> +
>>> +  return getAsSizeOfExpr(SizeofExpr) != nullptr;
>>> +}
>>> +
>>> +/// Check if the ArgLoc originated from a macro passed to the call at
>>> CallLoc.
>>> +///
>>> +/// \code
>>> +///   #define MACRO 0
>>> +///   foo(MACRO);
>>> +///   foo(0);
>>> +/// \endcode
>>> +///
>>> +/// This should return true for the first call to foo, but not for the
>>> second
>>> +/// (regardless of whether foo is a macro or function).
>>> +static bool isArgumentExpandedFromMacro(SourceManager &SM,
>>> +                                        SourceLocation CallLoc,
>>> +                                        SourceLocation ArgLoc) {
>>> +  if (!CallLoc.isMacroID())
>>> +    return SM.getFileID(CallLoc) != SM.getFileID(ArgLoc);
>>> +
>>> +  return SM.getFileID(SM.getImmediateMacroCallerLoc(CallLoc)) !=
>>> +         SM.getFileID(SM.getImmediateMacroCallerLoc(ArgLoc));
>>> +}
>>> +
>>> +/// Diagnose cases like 'memset(buf, sizeof(buf), 0)', which should
>>> have the
>>> +/// last two arguments transposed.
>>> +static void CheckMemaccessSize(Sema &S, unsigned BId, const CallExpr
>>> *Call) {
>>> +  if (BId != Builtin::BImemset && BId != Builtin::BIbzero)
>>> +    return;
>>> +
>>> +  const Expr *SizeArg =
>>> +    Call->getArg(BId == Builtin::BImemset ? 2 : 1)->IgnoreImpCasts();
>>> +
>>> +  // If we're memsetting or bzeroing 0 bytes, then this is likely an
>>> error.
>>> +  SourceLocation CallLoc = Call->getRParenLoc();
>>> +  SourceManager &SM = S.getSourceManager();
>>> +  if (isa<IntegerLiteral>(SizeArg) &&
>>> +      cast<IntegerLiteral>(SizeArg)->getValue() == 0 &&
>>> +      !isArgumentExpandedFromMacro(SM, CallLoc, SizeArg->getExprLoc()))
>>> {
>>> +
>>> +    SourceLocation DiagLoc = SizeArg->getExprLoc();
>>> +
>>> +    // Some platforms #define bzero to __builtin_memset. See if this is
>>> the
>>> +    // case, and if so, emit a better diagnostic.
>>> +    if (BId == Builtin::BIbzero ||
>>> +        (CallLoc.isMacroID() && Lexer::getImmediateMacroName(
>>> +                                    CallLoc, SM, S.getLangOpts()) ==
>>> "bzero")) {
>>> +      S.Diag(DiagLoc, diag::warn_suspicious_bzero_size);
>>> +      S.Diag(DiagLoc, diag::note_suspicious_bzero_size_silence);
>>> +    } else {
>>> +      S.Diag(DiagLoc, diag::warn_suspicious_sizeof_memset) << 0;
>>> +      S.Diag(DiagLoc, diag::note_suspicious_sizeof_memset_silence) << 0;
>>> +    }
>>> +    return;
>>> +  }
>>> +
>>> +  // If the second argument to a memset is a sizeof expression and the
>>> third
>>> +  // isn't, this is also likely an error. This should catch
>>> +  // 'memset(buf, sizeof(buf), 0xff)'.
>>> +  if (BId == Builtin::BImemset &&
>>> +      doesExprLikelyComputeSize(Call->getArg(1)) &&
>>> +      !doesExprLikelyComputeSize(Call->getArg(2))) {
>>> +    SourceLocation DiagLoc = Call->getArg(1)->getExprLoc();
>>> +    S.Diag(DiagLoc, diag::warn_suspicious_sizeof_memset) << 1;
>>> +    S.Diag(DiagLoc, diag::note_suspicious_sizeof_memset_silence) << 1;
>>> +    return;
>>> +  }
>>> +}
>>> +
>>>  /// Check for dangerous or invalid arguments to memset().
>>>  ///
>>>  /// This issues warnings on known problematic, dangerous or unspecified
>>> @@ -8771,6 +8853,9 @@ void Sema::CheckMemaccessArguments(const
>>>                                       Call->getLocStart(),
>>> Call->getRParenLoc()))
>>>      return;
>>>
>>> +  // Catch cases like 'memset(buf, sizeof(buf), 0)'.
>>> +  CheckMemaccessSize(*this, BId, Call);
>>> +
>>>    // We have special checking when the length is a sizeof expression.
>>>    QualType SizeOfArgTy = getSizeOfArgType(LenExpr);
>>>    const Expr *SizeOfArg = getSizeOfExprArg(LenExpr);
>>>
>>> Added: cfe/trunk/test/Sema/transpose-memset.c
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Sema/transpose-memset.c?rev=337470&view=auto
>>>
>>> ==============================================================================
>>> --- cfe/trunk/test/Sema/transpose-memset.c (added)
>>> +++ cfe/trunk/test/Sema/transpose-memset.c Thu Jul 19 09:46:15 2018
>>> @@ -0,0 +1,60 @@
>>> +// RUN: %clang_cc1       -Wmemset-transposed-args -verify %s
>>> +// RUN: %clang_cc1 -xc++ -Wmemset-transposed-args -verify %s
>>> +
>>> +#define memset(...) __builtin_memset(__VA_ARGS__)
>>> +#define bzero(x,y) __builtin_memset(x, 0, y)
>>> +#define real_bzero(x,y) __builtin_bzero(x,y)
>>> +
>>> +int array[10];
>>> +int *ptr;
>>> +
>>> +int main() {
>>> +  memset(array, sizeof(array), 0); // expected-warning{{'size' argument
>>> to memset is '0'; did you mean to transpose the last two arguments?}}
>>> expected-note{{parenthesize the third argument to silence}}
>>> +  memset(array, sizeof(array), 0xff); // expected-warning{{setting
>>> buffer to a 'sizeof' expression; did you mean to transpose the last two
>>> arguments?}} expected-note{{cast the second argument to 'int' to silence}}
>>> +  memset(ptr, sizeof(ptr), 0); // expected-warning{{'size' argument to
>>> memset is '0'; did you mean to transpose the last two arguments?}}
>>> expected-note{{parenthesize the third argument to silence}}
>>> +  memset(ptr, sizeof(*ptr) * 10, 1); // expected-warning{{setting
>>> buffer to a 'sizeof' expression; did you mean to transpose the last two
>>> arguments?}} expected-note{{cast the second argument to 'int' to silence}}
>>> +  memset(ptr, 10 * sizeof(int *), 1); // expected-warning{{setting
>>> buffer to a 'sizeof' expression; did you mean to transpose the last two
>>> arguments?}} expected-note{{cast the second argument to 'int' to silence}}
>>> +  memset(ptr, 10 * sizeof(int *) + 10, 0xff); //
>>> expected-warning{{setting buffer to a 'sizeof' expression; did you mean to
>>> transpose the last two arguments?}} expected-note{{cast the second argument
>>> to 'int' to silence}}
>>> +  memset(ptr, sizeof(char) * sizeof(int *), 0xff); //
>>> expected-warning{{setting buffer to a 'sizeof' expression; did you mean to
>>> transpose the last two arguments?}} expected-note{{cast the second argument
>>> to 'int' to silence}}
>>> +  memset(array, sizeof(array), sizeof(array)); // Uh... fine I guess.
>>> +  memset(array, 0, sizeof(array));
>>> +  memset(ptr, 0, sizeof(int *) * 10);
>>> +  memset(array, (int)sizeof(array), (0)); // no warning
>>> +  memset(array, (int)sizeof(array), 32); // no warning
>>> +  memset(array, 32, (0)); // no warning
>>> +
>>> +  bzero(ptr, 0); // expected-warning{{'size' argument to bzero is '0'}}
>>> expected-note{{parenthesize the second argument to silence}}
>>> +  real_bzero(ptr, 0); // expected-warning{{'size' argument to bzero is
>>> '0'}} expected-note{{parenthesize the second argument to silence}}
>>> +}
>>> +
>>> +void macros() {
>>> +#define ZERO 0
>>> +  int array[10];
>>> +  memset(array, 0xff, ZERO); // no warning
>>> +  // Still emit a diagnostic for memsetting a sizeof expression:
>>> +  memset(array, sizeof(array), ZERO); // expected-warning{{'sizeof'}}
>>> expected-note{{cast}}
>>> +  bzero(array, ZERO); // no warning
>>> +  real_bzero(array, ZERO); // no warning
>>> +#define NESTED_DONT_DIAG                        \
>>> +  memset(array, 0xff, ZERO);                    \
>>> +  real_bzero(array, ZERO);
>>> +
>>> +  NESTED_DONT_DIAG;
>>> +
>>> +#define NESTED_DO_DIAG                          \
>>> +  memset(array, 0xff, 0);                       \
>>> +  real_bzero(array, 0)
>>> +
>>> +  NESTED_DO_DIAG; // expected-warning{{'size' argument to memset}}
>>> expected-warning{{'size' argument to bzero}} expected-note2{{parenthesize}}
>>> +
>>> +#define FN_MACRO(p)                             \
>>> +  memset(array, 0xff, p)
>>> +
>>> +  FN_MACRO(ZERO);
>>> +  FN_MACRO(0); // FIXME: should we diagnose this?
>>> +
>>> +  __builtin_memset(array, 0, ZERO); // no warning
>>> +  __builtin_bzero(array, ZERO);
>>> +  __builtin_memset(array, 0, 0); // expected-warning{{'size' argument
>>> to memset}} // expected-note{{parenthesize}}
>>> +  __builtin_bzero(array, 0); // expected-warning{{'size' argument to
>>> bzero}} // expected-note{{parenthesize}}
>>> +}
>>>
>>>
>>> _______________________________________________
>>> cfe-commits mailing list
>>> cfe-commits at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>>>
>>
>> _______________________________________________
>> cfe-commits mailing list
>> cfe-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>>
>>
>
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20180726/7e78f6cf/attachment-0001.html>


More information about the cfe-commits mailing list