[PATCH] D129223: [Clang] Fix invalid utf-8 detection

Corentin Jabot via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 6 13:19:23 PDT 2022


cor3ntin created this revision.
Herald added a subscriber: hiraditya.
Herald added a project: All.
cor3ntin requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

The length of valid code points was calculated incorrectly.
This was not caught earlier because there were no tests
for the valid code point scenario.
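
To illustrate the effect of the comparison, here is a minimal, self-contained
sketch. It is not the LLVM implementation: trailingBytes and looksLegal are
hypothetical, simplified stand-ins for trailingBytesForUTF8 and isLegalUTF8,
and the Check enum exists only to switch between the old and the new
comparison from the diff.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-in for the trailingBytesForUTF8 table:
// number of continuation bytes implied by a lead byte.
static int trailingBytes(unsigned char lead) {
  if (lead < 0xC0) return 0; // ASCII or a stray continuation byte
  if (lead < 0xE0) return 1;
  if (lead < 0xF0) return 2;
  return 3;
}

// Hypothetical, simplified stand-in for isLegalUTF8: checks only the lead
// byte range and the 10xxxxxx shape of continuation bytes.
static bool looksLegal(const unsigned char *s, int length) {
  if (length == 1) return s[0] < 0x80;
  if (s[0] < 0xC2 || s[0] > 0xF4) return false;
  for (int i = 1; i < length; ++i)
    if ((s[i] & 0xC0) != 0x80) return false;
  return true;
}

enum class Check { BeforePatch, AfterPatch };

// Mirrors the shape of getUTF8SequenceSize with either comparison.
unsigned sequenceSize(const unsigned char *source,
                      const unsigned char *sourceEnd, Check check) {
  assert(source < sourceEnd && "need at least one byte");
  int length = trailingBytes(*source) + 1;
  std::ptrdiff_t avail = sourceEnd - source;
  // Old: accepted only sequences LONGER than the remaining input.
  // New: accepts sequences strictly shorter than the remaining input.
  bool fits = (check == Check::BeforePatch) ? length > avail : length < avail;
  return (fits && looksLegal(source, length)) ? length : 0;
}
```

In this sketch, for the bytes C3 A9 followed by '!', the old comparison
yields 0 and the patched one yields 2. With the old comparison, a 3-byte
sequence whose range ends after 2 bytes passes the length test, so the
legality check looks at a byte beyond sourceEnd. Under the strict `<`, a
sequence that ends exactly at sourceEnd also yields 0; the patch description
does not say whether that boundary case is intended.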


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D129223

Files:
  clang/test/Lexer/comment-invalid-utf8.c
  llvm/lib/Support/ConvertUTF.cpp


Index: llvm/lib/Support/ConvertUTF.cpp
===================================================================
--- llvm/lib/Support/ConvertUTF.cpp
+++ llvm/lib/Support/ConvertUTF.cpp
@@ -423,7 +423,7 @@
  */
 unsigned getUTF8SequenceSize(const UTF8 *source, const UTF8 *sourceEnd) {
   int length = trailingBytesForUTF8[*source] + 1;
-  return (length > sourceEnd - source && isLegalUTF8(source, length)) ? length
+  return (length < sourceEnd - source && isLegalUTF8(source, length)) ? length
                                                                       : 0;
 }
 



