[PATCH] D132568: [clang][Sema] check default argument promotions for printf

Aaron Ballman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Aug 25 07:29:32 PDT 2022


aaron.ballman added subscribers: dexonsmith, rjmccall.
aaron.ballman added a comment.

In D132568#3747598 <https://reviews.llvm.org/D132568#3747598>, @nickdesaulniers wrote:

> In D132568#3746551 <https://reviews.llvm.org/D132568#3746551>, @aaron.ballman wrote:
>
>> Thank you for the patch, but it'd be really helpful to me as a reviewer if you and @nickdesaulniers could coordinate so there's only one patch trying to address #57102 instead of two competing patches (I'm happy to review whichever patch comes out). From a quick look over the test case changes, this patch looks like it's more in line with how I'd expect to resolve that issue compared to D132266 <https://reviews.llvm.org/D132266>.
>
> The addition to clang/test/Sema/format-strings.c looks like it will resolve Linus' complaints in https://github.com/llvm/llvm-project/issues/57102; I'm happy to abandon D132266 <https://reviews.llvm.org/D132266> in preference of this.

Thanks, Nick!

I asked some questions about `wchar_t` behavior in the thread, but I sort of wonder if we should handle those concerns separately (if changes are needed) as it looks more like missing functionality than promotion-related behavior. We have other changes we'll need in this area anyway for C23 because of https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2630.pdf, https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2680.pdf, and https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2763.pdf so there's no need to cram it all into one patch.



================
Comment at: clang/include/clang/AST/FormatString.h:263-264
     NoMatch = 0,
     /// The conversion specifier and the argument type are compatible. For
     /// instance, "%d" and _Bool.
     Match = 1,
----------------



================
Comment at: clang/lib/AST/FormatString.cpp:367
+        case BuiltinType::Char_U:
+        case BuiltinType::Bool:
+          return Match;
----------------
Isn't this a match promotion as well? e.g. `printf("%hhd", (_Bool)0);` where the `_Bool` is promoted to `int` but then cast back down to a char type (which seems rather unlikely to be a type confusion issue).


================
Comment at: clang/lib/AST/FormatString.cpp:378
+          case BuiltinType::Short:
+          case BuiltinType::UShort:
+            return NoMatchPromotionTypeConfusion;
----------------
In C, `wchar_t` is an integer type and that underlying type might be `int` or it might be promotable to `int`. Should we handle it here as a type confused promotion match?


================
Comment at: clang/lib/AST/FormatString.cpp:401
+      if (const auto *BT = argTy->getAs<BuiltinType>()) {
+        if (!Ptr) {
+          switch (BT->getKind()) {
----------------
It's a bit strange that we have two switches over the same `BT->getKind()` and the only difference is `!Ptr`; would it be easier to read if we combined the two switches into one and had logic in the individual cases for `Ptr` vs not `Ptr`?


================
Comment at: clang/lib/AST/FormatString.cpp:408
+            if (T == C.SignedCharTy || T == C.UnsignedCharTy ||
+                T == C.ShortTy || T == C.UnsignedShortTy)
+              return MatchPromotion;
----------------
`wchar_t`?


================
Comment at: clang/lib/AST/FormatString.cpp:413-414
+          case BuiltinType::UShort:
+            if (T == C.SignedCharTy || T == C.UnsignedCharTy || T == C.IntTy ||
+                T == C.UnsignedIntTy)
+              return NoMatchPromotionTypeConfusion;
----------------
I agree that `printf("%hhd", (short)0);` is potentially type confused, but why `printf("%d", (short)0);`?

FWIW, I think that `printf("%lc", (short)0));` is potentially type confused as well.


================
Comment at: clang/lib/AST/FormatString.cpp:422
+          case BuiltinType::Bool:
+            // Don't warn printf("%hd", [char])
+            // https://reviews.llvm.org/D66186
----------------
The comment is talking about passing a `char` (promoted to `int`) then printed as a `short` but the check is looking for the type to be printed as an `int`; that doesn't seem like a type confusion situation so much as a match due to promotion. Should the test below be looking for a short type instead of an int type?


================
Comment at: clang/lib/AST/FormatString.cpp:429
+          case BuiltinType::WChar_S:
+            return NoMatchPromotionTypeConfusion;
+          }
----------------
Why is this automatically type confusion regardless of the type specified by the format string?


================
Comment at: clang/lib/Sema/SemaChecking.cpp:10114
   } else if (const CharacterLiteral *CL = dyn_cast<CharacterLiteral>(E)) {
     // Special case for 'a', which has type 'int' in C.
     // Note, however, that we do /not/ want to treat multibyte constants like
----------------
Do we need a similar special case for `L'a'` in C, which has type `wchar_t` (which is a typedef to an integer type)?


================
Comment at: clang/test/FixIt/format.m:40
 
-void test_object_correction (id x) {  
+void test_object_correction (id x) {
   NSLog(@"%d", x); // expected-warning{{format specifies type 'int' but the argument has type 'id'}}
----------------
It looks like some whitespace only changes snuck in.


================
Comment at: clang/test/FixIt/format.m:173-176
-  NSLog(@"%hhd", 'a'); // expected-warning{{format specifies type 'char' but the argument has type 'int'}}
-  // CHECK: fix-it:"{{.*}}":{[[@LINE-1]]:11-[[@LINE-1]]:15}:"%d"
-
-  NSLog(@"%hhu", 'a'); // expected-warning{{format specifies type 'unsigned char' but the argument has type 'int'}}
----------------
Should we leave the test but remove the expected warnings and fix-it comments instead?


================
Comment at: clang/test/FixIt/format.m:195-208
-  NSLog(@"%C", 0x260300);  // expected-warning{{format specifies type 'unichar' (aka 'unsigned short') but the argument has type 'int'}}
-  // CHECK-NOT: fix-it:"{{.*}}":{[[@LINE-1]]:11-[[@LINE-1]]:13}:"%d"
-  // CHECK: fix-it:"{{.*}}":{[[@LINE-2]]:16-[[@LINE-2]]:16}:"(unsigned short)"
 
   typedef unsigned short unichar;
-  
-  NSLog(@"%C", 0x260300);  // expected-warning{{format specifies type 'unichar' (aka 'unsigned short') but the argument has type 'int'}}
----------------
Hmmm, did we intend to change the behavior here? `%C` is an extension, so it's not clear to me if we wanted to modify this behavior or not. CC @rjmccall and @dexonsmith for some ObjC opinions.


================
Comment at: clang/test/FixIt/format.mm:14-25
-  NSLog(@"%C", 0x260300);  // expected-warning{{format specifies type 'unichar' (aka 'unsigned short') but the argument has type 'int'}}
-  // CHECK-NOT: fix-it:"{{.*}}":{[[@LINE-1]]:11-[[@LINE-1]]:13}:"%d"
-  // CHECK: fix-it:"{{.*}}":{[[@LINE-2]]:16-[[@LINE-2]]:16}:"(unsigned short)"
-
   typedef unsigned short unichar;
 
   NSLog(@"%C", wchar_data);  // expected-warning{{format specifies type 'unichar' (aka 'unsigned short') but the argument has type 'wchar_t'}}
----------------
Same questions here as above.


================
Comment at: clang/test/Sema/format-strings-scanf.c:15
                       unsigned char : (signed char)0))
-typedef __SSIZE_TYPE__ ssize_t;                         
+typedef __SSIZE_TYPE__ ssize_t;
 
----------------
Looks like more whitespace-only changes snuck in.


================
Comment at: clang/test/Sema/format-strings-scanf.c:265-266
+  // char & uchar
+  scanf("%hhd", &sc); // no-warning
+  scanf("%hhd", &uc); // no-warning
+  scanf("%hd", &sc); // expected-warning{{format specifies type 'short *' but the argument has type 'signed char *'}}
----------------
WG14 hasn't made an official determination, but one test we should add is:
```
// FIXME: does this match what the C committee allows or should it be pedantically warned on?
char c;
void *vp;
scanf("%hhd", &c); // Pedantic warning?
scanf("%hhd", vp); // Pedantic warning?
```
I asked on the reflectors whether folks wanted me to file an NB comment about this and the answers have been mixed so far. I'll most likely still file an NB comment to get the committee officially on record though.


================
Comment at: clang/test/Sema/format-strings-scanf.c:289-290
+  // ill-formed floats
+  scanf("%hf", // expected-warning{{length modifier 'h' results in undefined behavior or no effect with 'f' conversion specifier}}
+  &sc); // Is this a clang bug ?
+
----------------
Is what a clang bug?

Also, what is with the "or no effect" in that wording? "This could cause major issues or nothing bad at all might happen" is a very odd way to word a diagnostic. :-D (Maybe something to look into improving in a later patch.)


================
Comment at: clang/test/Sema/format-strings-scanf.c:271-272
+  scanf("%llx", &i); // expected-warning{{format specifies type 'unsigned long long *' but the argument has type 'int *'}}
+  scanf("%hf", // expected-warning{{length modifier 'h' results in undefined behavior or no effect with 'f' conversion specifier}}
+  &sc); // Is this a clang bug ?
+  scanf("%s", i); // expected-warning{{format specifies type 'char *' but the argument has type 'int'}}
----------------
nickdesaulniers wrote:
> Hmm...
@nickdesaulniers -- are you thinking this might be a pedantic warning because it's always valid to cast to a pointer to char and write into it? e.g.,
```
short s;
char *cp = (char *)&s;
*cp = 0; // Not UB
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132568/new/

https://reviews.llvm.org/D132568



More information about the cfe-commits mailing list