[PATCH] D53509: Add unsgined char StringRef constructor/Fix llvm-strings crash

James Henderson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 22 09:33:29 PDT 2018


jhenderson created this revision.
jhenderson added reviewers: compnerd, ruiu, mstorsjo, zturner, MTC, vsk.
Herald added subscribers: kristina, dexonsmith.

On Windows at least, llvm-strings was crashing if it encountered bytes that mapped to negative chars, as it was passing these into std::isgraph and std::isblank functions, resulting in undefined behaviour. On debug builds using MSVC, these functions verfiy that the value passed in is representable as an `unsigned char`. Since the `char` is promoted to an `int`, a value greater than 127 would turn into a negative integer value, and fail the check.

To make the fix cleaner, I have added an `unsigned char*` constructor for `StringRef`. This allows result of `bytes_begin`/`bytes_end` to create another `StringRef`.


Repository:
  rL LLVM

https://reviews.llvm.org/D53509

Files:
  include/llvm/ADT/StringRef.h
  test/tools/llvm-strings/negative.test
  tools/llvm-strings/llvm-strings.cpp


Index: tools/llvm-strings/llvm-strings.cpp
===================================================================
--- tools/llvm-strings/llvm-strings.cpp
+++ tools/llvm-strings/llvm-strings.cpp
@@ -77,9 +77,10 @@
     OS << " " << L << '\n';
   };
 
-  const char *B = Contents.begin();
-  const char *P = nullptr, *E = nullptr, *S = nullptr;
-  for (P = Contents.begin(), E = Contents.end(); P < E; ++P) {
+  const unsigned char *B = Contents.bytes_begin();
+  const unsigned char *E = Contents.bytes_end();
+  const unsigned char *S = nullptr;
+  for (const unsigned char *P = B; P < E; ++P) {
     if (std::isgraph(*P) || std::isblank(*P)) {
       if (S == nullptr)
         S = P;
Index: test/tools/llvm-strings/negative.test
===================================================================
--- test/tools/llvm-strings/negative.test
+++ test/tools/llvm-strings/negative.test
@@ -0,0 +1 @@
+# RUN: echo -e \x80 | llvm-strings -
Index: include/llvm/ADT/StringRef.h
===================================================================
--- include/llvm/ADT/StringRef.h
+++ include/llvm/ADT/StringRef.h
@@ -90,6 +90,11 @@
     /*implicit*/ constexpr StringRef(const char *data, size_t length)
         : Data(data), Length(length) {}
 
+    /// Construct a string ref from a pointer to bytes and length.
+    LLVM_ATTRIBUTE_ALWAYS_INLINE
+    /*implicit*/ constexpr StringRef(const unsigned char *data, size_t length)
+        : Data(reinterpret_cast<const char *>(data)), Length(length) {}
+
     /// Construct a string ref from an std::string.
     LLVM_ATTRIBUTE_ALWAYS_INLINE
     /*implicit*/ StringRef(const std::string &Str)


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D53509.170443.patch
Type: text/x-patch
Size: 1631 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20181022/07ad22e8/attachment.bin>


More information about the llvm-commits mailing list