[PATCH] D103426: Clang: Extend format string checking to wprintf/wscanf
Eli Friedman via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Jul 24 11:20:34 PDT 2021
efriedma added a comment.
This should split into three patches:
1. The ConvertUTFWrapper patch.
2. The patch to add format warnings for wprintf/wscanf/etc.
3. The patch to add the %l16/%l32 format modifiers.
================
Comment at: clang/lib/AST/Expr.cpp:1091
+ if (llvm::convertUTF32ToUTF8String(AR, Output)) {
+ CString = new char[Output.size() + 1];
+ memcpy(CString, Output.c_str(), Output.size());
----------------
This leaks memory.
This function should return either a StringRef to memory that's part of AST object, or an std::string.
================
Comment at: clang/lib/AST/OSLog.cpp:206
const StringLiteral *Lit = cast<StringLiteral>(StringArg->IgnoreParenCasts());
- assert(Lit && (Lit->isAscii() || Lit->isUTF8()));
- StringRef Data = Lit->getString();
+ assert(Lit);
+ std::string String(Lit->getStrDataAsChar());
----------------
Why are you changing this code?
================
Comment at: clang/lib/AST/Type.cpp:1962
+bool Type::isType(const std::string TypeName) const {
+ QualType Desugar = this->getLocallyUnqualifiedSingleStepDesugaredType();
----------------
MarcusJohnson91 wrote:
> aaron.ballman wrote:
> > Oh, I see now that this is doing a name comparison against the type -- that's not a good API in general because it's *really* hard to guess at what the type will come out as textually. e.g., `class` and `struct` keywords are interchangeable in C++, C sometimes gets confused with `bool` vs `_Bool`, template arguments sometimes matter, nested name specifiers, etc. Callers of this API will have to guess at these details and the printing of the type may change over time (e.g., C may switch from `_Bool` to `bool` and then code calling `isType("_Bool")` may react poorly to the change).
> >
> > I think we need to avoid this sort of API on `Type`.
> I see your point, I reverted the behavior back to doing the desugaring in just isChar16Type and isChar32Type
I'm not convinced we should be looking at sugar even in isChar16Type/isChar32Type/isAnyCharacterType. That seems like a great way to end up with subtle bugs that only manifest when someone uses the wrong typedef.
Where is the distinction between the value `(uint32_t)1` vs. `(char32_t)1` relevant for C, anyway?
================
Comment at: clang/lib/Sema/SemaChecking.cpp:103
+#include <locale>
+#include <codecvt>
----------------
Stray includes?
================
Comment at: clang/test/Sema/format-strings-non-iso.c:6
+
+int wprintf(const char *restrict, ...);
+int wscanf(const char *restrict, ...);
----------------
This declaration is wrong? Does that not cause a warning somewhere?
================
Comment at: llvm/lib/Support/ConvertUTFWrapper.cpp:148
+ // Error out on an uneven byte count.
+ if (SrcBytes.size() % 2)
+ return false;
----------------
sizeof(UTF32)?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D103426/new/
https://reviews.llvm.org/D103426
More information about the llvm-commits
mailing list