[cfe-dev] clang printf correspondence between conversion specifier and arguments

Cristian Draghici cristian.draghici at gmail.com
Thu Jan 21 23:58:49 PST 2010


Hi

I am attaching a new version of the patch that takes into account length
modifiers for ints.
The previous version emits wrong warnings when running on ints preceded by
length modifiers.

I'm not sure this is the best way to handle them - I'm consuming modifiers
at conversion specifier time by walking backwards into the format string
until I hit the '%' conversion opening char.

I also left out modifiers j,t,z,q (intmax_t, ptrdiff_t, etc) - I'm not sure
how to handle these.

This new version breaks an existing test from Clang::Sema/format-strings.c
because it traps the use of "ld" on an int.

======
  sprintf(buf,"%%%%%ld%d%d", 1, 2, 3); // no-warning
  snprintf(buf, 2, "%%%%%ld%;%d", 1, 2, 3); // expected-warning {{sion
'%;'}}
======

Thanks,
Cristi



On Thu, Jan 21, 2010 at 12:17 PM, Cristian Draghici <
cristian.draghici at gmail.com> wrote:

> Hi
>
> I am attaching a patch for "%i" and "%d".
>
> I've not carried on with the rest of the conversion specifiers because they
> should probably be handled by a single block of code (vs having one common
> block per type of argument - i.e. one for int, unsigned int, etc).
>
> i.e.
>
> builtinTypeKind conversionSpecifierType;
> case 'i':
> case 'd':
>     conversionSpecifierType = BuiltinType::Int;
> case 'o':
> case 'u':
> case 'X':
> case 'x':
>    conversionSpecifierType = BuiltinType::UInt;
> [..]
> case 'p':
>      /* do diagnostics here based on the conversionSpecifierType and the
> current argument */
>
>
> Thanks,
> Cristi
> PS Is there a way to get the string representation of the built-in type?
> (i.e. 'int' out of 'BuiltinType::Int')
>
>
>
>
> On Thu, Jan 21, 2010 at 8:24 AM, Ted Kremenek <kremenek at apple.com> wrote:
>
>> Hi Cristi,
>>
>> This is definitely on the right track.
>>
>> To add a warning, you only need to modify the TableGen files (.td), as the
>> .inc files are generated from the .td files during the build.  Specifically,
>> you only need to modify DiagnosticSemaKinds.td and add one warning to the
>> Format group.  Your warning declaration would look something like:
>>
>>  def my_warning : Warning<
>>  "conversion specifies type '%0' but the argument has type '%1'">,
>> InGroup<Format>;
>>
>> You would then use it similarly to the
>> warn_printf_asterisk_precision_wrong_type warning (in SemaChecking.cpp),
>> except you specify two QualType arguments instead of one.
>>
>> Please also include a couple test cases that show when the warning is both
>> reported and *not* reported.  The test cases should also include the use of
>> typedefs (which I believe your logic already handles).
>>
>> - Ted
>>
>> On Jan 19, 2010, at 11:26 PM, Cristian Draghici wrote:
>>
>> > Hi
>> >
>> > I've started looking at printf correspondence between conversion
>> specifier and arguments.
>> >
>> > 1/ Am I on the right track with the patch below?
>> >
>> > 2/ Assuming an affirmative answer on (1), I'm not sure how to define a
>> new warning (I used warn_printf_asterisk_width_wrong_type which is wrong in
>> this case).
>> > I'm guessing defs are in include/clang/Basic/DiagnosticSemaKinds.inc and
>> .td and referred in DiagnosticGroups.inc and .td
>> > A new diagnostic would be needed, along the lines of "conversion
>> specifies type 'X', but argument has type 'Y'"
>> >
>> >
>> > Thanks for your time,
>> > Cristi
>> >
>> >
>> > cristi:clang diciu$ svn diff  ./lib/Sema/SemaChecking.cpp
>> > Index: lib/Sema/SemaChecking.cpp
>> > ===================================================================
>> > --- lib/Sema/SemaChecking.cpp (revision 93981)
>> > +++ lib/Sema/SemaChecking.cpp (working copy)
>> > @@ -1147,7 +1147,17 @@
>> >      //
>> >      // FIXME: additional checks will go into the following cases.
>> >      case 'i':
>> > -    case 'd':
>> > +    case 'd': {
>> > +      if(numDataArgs >= format_idx+numConversions+1) {
>> > +        const Expr *E2 = TheCall->getArg(format_idx+numConversions+1);
>> > +        const BuiltinType *BT = E2->getType()->getAs<BuiltinType>();
>> > +        if(BT == NULL || BT->getKind() != BuiltinType::Int) {
>> > +          SourceLocation Loc = getLocationOfStringLiteralByte(FExpr,
>> StrIdx);
>> > +          Diag(Loc, diag::warn_printf_asterisk_width_wrong_type)
>> > +            << E2->getType() << E2->getSourceRange();
>> > +        }
>> > +      }
>> > +    }
>> >      case 'o':
>> >      case 'u':
>> >      case 'x':
>> >
>> >
>> >
>> > _______________________________________________
>> > cfe-dev mailing list
>> > cfe-dev at cs.uiuc.edu
>> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>
>>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20100122/ce9e3e7e/attachment.html>
-------------- next part --------------
Index: test/Sema/format-strings.c
===================================================================
--- test/Sema/format-strings.c	(revision 94069)
+++ test/Sema/format-strings.c	(working copy)
@@ -103,6 +103,16 @@
   printf("%.*d","foo",x); // expected-warning {{field precision should have type 'int', but argument has type 'char *'}}
 }
 
+void check_conversion_specifier_vs_argument(int x) {
+  printf("%i", x); // no warning
+  printf("%i", "foo"); // expected-warning {{conversion specifies type 'int' but the argument has type 'char *'}}
+  printf("%i %i", x, "foo"); // expected-warning {{conversion specifies type 'int' but the argument has type 'char *'}}
+
+  typedef int myint;
+  myint i1=12;
+  printf("%i", i1); // no warning
+}
+
 void __attribute__((format(printf,1,3))) myprintf(const char*, int blah, ...);
 
 void test_myprintf() {
Index: include/clang/Basic/DiagnosticSemaKinds.td
===================================================================
--- include/clang/Basic/DiagnosticSemaKinds.td	(revision 94069)
+++ include/clang/Basic/DiagnosticSemaKinds.td	(working copy)
@@ -2435,6 +2435,9 @@
 def warn_printf_asterisk_precision_wrong_type : Warning<
   "field precision should have type 'int', but argument has type %0">,
   InGroup<Format>;
+def warn_printf_conversion_argument_type_mismatch : Warning<
+  "conversion specifies type '%0' but the argument has type %1">,
+  InGroup<Format>;
 
 // CHECK: returning address/reference of stack memory
 def warn_ret_stack_addr : Warning<
Index: lib/Sema/SemaChecking.cpp
===================================================================
--- lib/Sema/SemaChecking.cpp	(revision 94069)
+++ lib/Sema/SemaChecking.cpp	(working copy)
@@ -1155,7 +1155,42 @@
     //
     // FIXME: additional checks will go into the following cases.
     case 'i':
-    case 'd':
+    case 'd': {
+      char modifiers[2] = {'\0', '\0'};
+      if(Str[StrIdx-1] != '%' && (Str[StrIdx-1] == 'h' || Str[StrIdx-1] == 'l')) {
+        modifiers[0] = Str[StrIdx-1];
+        if(Str[StrIdx-2] != '%' && (Str[StrIdx-2] == 'h' || Str[StrIdx-2] == 'l'))
+          modifiers[1] = Str[StrIdx-2];
+      }
+
+      if(!HasVAListArg && numDataArgs + format_idx > numConversions) {
+        const Expr *E2 = TheCall->getArg(numConversions+format_idx+1);
+        const BuiltinType *BT = E2->getType()->getAs<BuiltinType>();
+        SourceLocation Loc = getLocationOfStringLiteralByte(FExpr, StrIdx);
+
+        if((modifiers[0] == 'h' && modifiers[1] == 'h') && 
+            (BT == NULL || BT->getKind() != BuiltinType::SChar))
+              Diag(Loc, diag::warn_printf_conversion_argument_type_mismatch)
+                << "signed char" << E2->getType() << E2->getSourceRange();
+        else if((modifiers[0] == 'h' && modifiers[1] == '\0') &&
+            (BT == NULL || BT->getKind() != BuiltinType::Short))
+              Diag(Loc, diag::warn_printf_conversion_argument_type_mismatch)
+                << "short" << E2->getType() << E2->getSourceRange();
+        else if((modifiers[0] == 'l' && modifiers[1] == 'l') &&
+            (BT == NULL || BT->getKind() != BuiltinType::LongLong))
+              Diag(Loc, diag::warn_printf_conversion_argument_type_mismatch)
+                << "long long" << E2->getType() << E2->getSourceRange();
+        else if((modifiers[0] == 'l' && modifiers[1] == '\0') &&
+            (BT == NULL || BT->getKind() != BuiltinType::Long))
+              Diag(Loc, diag::warn_printf_conversion_argument_type_mismatch)
+                << "long" << E2->getType() << E2->getSourceRange();
+	else if((modifiers[0] == '\0') && 
+            (BT == NULL || BT->getKind() != BuiltinType::Int)) 
+            Diag(Loc, diag::warn_printf_conversion_argument_type_mismatch)
+              << "int" << E2->getType() << E2->getSourceRange();
+        
+      }
+    }
     case 'o':
     case 'u':
     case 'x':


More information about the cfe-dev mailing list