[PATCH] D46274: [Support] Harden JSON against invalid UTF-8.

Sam McCall via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 9 08:53:36 PDT 2018


sammccall added inline comments.


================
Comment at: lib/Support/JSON.cpp:520
+bool isUTF8(llvm::StringRef S, size_t *ErrOffset) {
+  // Fast-path for ASCII, which is valid UTF-8.
+  for (unsigned char C : S)
----------------
benhamilton wrote:
> sammccall wrote:
> > benhamilton wrote:
> > > Wouldn't it make sense to move this to `isLegalUTF8String()`?
> > > 
> > Hmm, maybe. I'm slightly leery of this: these are common Unicode reference functions that are largely unmodified, and people may expect them to stay that way.
> > Also, I think we'd still want to split it into two functions so the ASCII check can be inlined and the UTF-8 wrangling outlined. It seems harmless enough here, but maybe I'm just lazy. WDYT?
> I'm supportive of an inline-able ASCII check if we don't already have one.
Added one to `StringExtras.h`. (`Unicode.h` and `ConvertUTF.h` have a weird style and no dependencies, so it's hard to work out how to make it fit there.)
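
For readers following the thread, here is a rough, standalone sketch of the split discussed above: an inline ASCII fast path with the full UTF-8 validation kept out of line. It is not the code from this patch. It uses `std::string_view` instead of `llvm::StringRef`, and the names `isASCIIByte`, `isASCIIString`, `isValidUTF8Slow`, and `isUTF8Sketch` are hypothetical. Reporting the start of the offending sequence through `ErrOffset` is an assumption of this sketch as well.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: true if the byte is 7-bit ASCII.
inline bool isASCIIByte(unsigned char C) { return C < 0x80; }

// Hypothetical inline fast path: true if every byte is ASCII.
// Small enough that a compiler can inline it at each call site.
inline bool isASCIIString(std::string_view S) {
  for (unsigned char C : S)
    if (!isASCIIByte(C))
      return false;
  return true;
}

// Slow path. In a real library this would live in a .cpp file so the
// multi-byte handling is not duplicated into every caller.
bool isValidUTF8Slow(std::string_view S, std::size_t *ErrOffset);

// Entry point: ASCII is always valid UTF-8, so only fall through to the
// out-of-line validator when a non-ASCII byte is present.
inline bool isUTF8Sketch(std::string_view S,
                         std::size_t *ErrOffset = nullptr) {
  if (isASCIIString(S))
    return true;
  return isValidUTF8Slow(S, ErrOffset);
}

// Strict validation: rejects overlong forms, surrogates (U+D800..U+DFFF),
// code points above U+10FFFF, and truncated sequences.
bool isValidUTF8Slow(std::string_view S, std::size_t *ErrOffset) {
  std::size_t I = 0, N = S.size();
  while (I < N) {
    unsigned char B0 = static_cast<unsigned char>(S[I]);
    if (B0 < 0x80) { // ASCII byte inside a mixed string.
      ++I;
      continue;
    }
    std::size_t Len = 0;             // Sequence length; 0 means bad lead byte.
    unsigned char Lo = 0x80, Hi = 0xBF; // Allowed range for the second byte.
    if (B0 >= 0xC2 && B0 <= 0xDF) {
      Len = 2;
    } else if (B0 >= 0xE0 && B0 <= 0xEF) {
      Len = 3;
      if (B0 == 0xE0) Lo = 0xA0; // Excludes overlong 3-byte forms.
      if (B0 == 0xED) Hi = 0x9F; // Excludes surrogates.
    } else if (B0 >= 0xF0 && B0 <= 0xF4) {
      Len = 4;
      if (B0 == 0xF0) Lo = 0x90; // Excludes overlong 4-byte forms.
      if (B0 == 0xF4) Hi = 0x8F; // Excludes code points above U+10FFFF.
    }
    bool OK = Len != 0 && I + Len <= N;
    for (std::size_t K = 1; OK && K < Len; ++K) {
      unsigned char B = static_cast<unsigned char>(S[I + K]);
      unsigned char L = (K == 1) ? Lo : 0x80;
      unsigned char H = (K == 1) ? Hi : 0xBF;
      OK = B >= L && B <= H;
    }
    if (!OK) {
      if (ErrOffset)
        *ErrOffset = I; // Offset of the first byte of the bad sequence.
      return false;
    }
    I += Len;
  }
  return true;
}
```

With this shape, a caller that mostly sees ASCII pays only for the simple byte loop, and the branchy multi-byte logic stays in one place.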


Repository:
  rL LLVM

https://reviews.llvm.org/D46274




