[PATCH] D46274: [Support] Harden JSON against invalid UTF-8.

Ben Hamilton via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 2 08:07:22 PDT 2018


benhamilton added a comment.

Ah, glad we did have a solution for this. Wish I'd known, would have saved everyone time. :)



================
Comment at: lib/Support/JSON.cpp:520
+bool isUTF8(llvm::StringRef S, size_t *ErrOffset) {
+  // Fast-path for ASCII, which is valid UTF-8.
+  for (unsigned char C : S)
----------------
Wouldn't it make sense to move this to `isLegalUTF8String()`?
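
For context, the ASCII fast path under discussion can be sketched on its own as below. This is only an illustration of the idea and not the code in the patch: `firstNonASCII`, `isASCII`, and `asciiFastPathExample` are hypothetical names, and `std::string_view` stands in for `llvm::StringRef` so the snippet builds without LLVM headers.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper, not part of LLVM: returns the offset of the first
// byte with the high bit set, or S.size() if every byte is ASCII.
inline std::size_t firstNonASCII(std::string_view S) {
  for (std::size_t I = 0; I < S.size(); ++I)
    if (static_cast<unsigned char>(S[I]) & 0x80)
      return I;
  return S.size();
}

// Hypothetical wrapper: true when the whole string is ASCII, in which case
// a caller could skip the full UTF-8 validator, since ASCII is valid UTF-8.
inline bool isASCII(std::string_view S) {
  return firstNonASCII(S) == S.size();
}

// Usage sketch: "caf\xC3\xA9" is "café" encoded as UTF-8, so the first
// non-ASCII byte sits at offset 3.
inline void asciiFastPathExample() {
  assert(isASCII("plain text"));
  assert(!isASCII("caf\xC3\xA9"));
  assert(firstNonASCII("caf\xC3\xA9") == 3);
}
```

If a check like this lived inside the shared validator, every caller would get the fast path rather than just the JSON code, which seems to be the point of the question above.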



================
Comment at: lib/Support/JSON.cpp:527
+not_ascii:
+  const UTF8 *Data = reinterpret_cast<const UTF8*>(S.data()), *Rest = Data;
+  bool OK = LLVM_LIKELY(isLegalUTF8String(&Rest, Data + S.size()));
----------------
Style: Space between `UTF8` and `*`?



================
Comment at: lib/Support/JSON.cpp:537
+  std::vector<UTF32> Codepoints(S.size()); // 1 codepoint per byte suffices.
+  const UTF8 *In8 = reinterpret_cast<const UTF8*>(S.data());
+  UTF32 *Out32 = Codepoints.data();
----------------
Style: Space between `UTF8` and `*`?



Repository:
  rL LLVM

https://reviews.llvm.org/D46274




