[PATCH] D103426: Clang: Extend format string checking to wprintf/wscanf

Marcus Johnson via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Jun 9 20:57:39 PDT 2021


MarcusJohnson91 added inline comments.


================
Comment at: clang/lib/AST/Type.cpp:1980-2002
+bool Type::isChar16Type(const LangOptions &LangOpts) const {
   if (const auto *BT = dyn_cast<BuiltinType>(CanonicalType))
-    return BT->getKind() == BuiltinType::Char16;
-  return false;
-}
-
-bool Type::isChar32Type() const {
-  if (const auto *BT = dyn_cast<BuiltinType>(CanonicalType))
-    return BT->getKind() == BuiltinType::Char32;
-  return false;
-}
-
-/// Determine whether this type is any of the built-in character
-/// types.
-bool Type::isAnyCharacterType() const {
-  const auto *BT = dyn_cast<BuiltinType>(CanonicalType);
-  if (!BT) return false;
-  switch (BT->getKind()) {
-  default: return false;
-  case BuiltinType::Char_U:
-  case BuiltinType::UChar:
-  case BuiltinType::WChar_U:
-  case BuiltinType::Char8:
-  case BuiltinType::Char16:
-  case BuiltinType::Char32:
-  case BuiltinType::Char_S:
-  case BuiltinType::SChar:
-  case BuiltinType::WChar_S:
-    return true;
+    if (BT->getKind() == BuiltinType::Char16)
+      return true;
+  if (!LangOpts.CPlusPlus) {
+    return isType("char16_t");
   }
----------------
aaron.ballman wrote:
> If I understand properly, one issue is that `char16_t` is defined by C to be *the same type* as `uint_least16_t`, which itself is a typedef to some other integer type. So we can't make `char16_t` be a distinct type in C, it has to be a typedef to a typedef to an integer type.
> 
> One concern I have about the approach in this patch is that it makes the type system a bit more convoluted. This type is *not* really a character type in C, it's a typedef type that playacts as a character type. I think this is a pretty fundamental change to the type system in the compiler in some ways because this query is no longer about the canonical type but about the semantics of how the type is expected to be used.
> 
> I'd definitely like to hear what @rsmith thinks of this approach.
I see your point, but I'm not sure how else we could get it through the type checking system without doing it this way?

I tried to be careful to only allow the actual character typedefs through by making sure char16_t or char32_t is in the typedef chain, and only in C mode.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103426/new/

https://reviews.llvm.org/D103426



More information about the cfe-commits mailing list