[PATCH] D126651: [clang-diff] Fix getStmtValue when dealing with wide chars

Johannes Altmanninger via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue May 31 14:48:37 PDT 2022


johannes requested changes to this revision.
johannes added a comment.
This revision now requires changes to proceed.

I wonder if clang-diff is useful in its current state, I remember there are many edge cases left over.

Anyway, I've left some minor comments. I'm not sure how to download a patch that Git can apply so I didn't try it yet.



================
Comment at: clang/lib/Tooling/ASTDiff/ASTDiff.cpp:468
+  if (auto *String = dyn_cast<StringLiteral>(S)) {
+    if (String->isWide()) {
+      unsigned int wsize = String->getByteLength() / String->getCharByteWidth();
----------------
I think there are some other cases that need fixing too.
1. isUTF16() (like `u'x'`)
2. isUTF32() (like `U'x'`)

Can you fix them too? Should be quite similar.


================
Comment at: clang/lib/Tooling/ASTDiff/ASTDiff.cpp:471
+      const wchar_t *temp =
+          reinterpret_cast<const wchar_t *>(String->getBytes().data());
+      std::wstring wstr(temp);
----------------
How about using "auto" here to avoid repeating the type?


================
Comment at: clang/lib/Tooling/ASTDiff/ASTDiff.cpp:474
+      std::string str;
+      if (!convertWideToUTF8(wstr.substr(0, wsize), str))
+        return "";
----------------
Instead of calling substr() it seems better to move this to the constructor

    std::wstring wstr(temp, wsize);

(I think the reason why we need wsize is because strings can have embedded null characters)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126651/new/

https://reviews.llvm.org/D126651



More information about the cfe-commits mailing list