[PATCH] D79810: [FileCheck] Fix isalpha/isalnum calls
Joel E. Denny via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue May 12 15:05:08 PDT 2020
jdenny created this revision.
jdenny added reviewers: jhenderson, thopre, probinson.
Herald added subscribers: atanasyan, hiraditya, arichardson, sdardis.
Herald added a project: LLVM.
D79276 <https://reviews.llvm.org/D79276> caused the following builder to fail:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/23489
Specifically, FileCheck dumped stack in the following tests:
LLVM :: MC/Mips/micromips-jump-pc-region.s
LLVM :: MC/Mips/mips-jump-pc-region.s
Those tests contain characters encoded as 160 but that render (at
least for me in vim) like a single space (32). Those characters
appear between the `#` and `RUN:` on several lines, and D79276 <https://reviews.llvm.org/D79276> causes
FileCheck to process those lines differently: `RUN:` is a comment
directive. As a result, D79276 <https://reviews.llvm.org/D79276> causes FileCheck to start calling is
`isalnum` on those characters.
The problem is that FileCheck calls `isalnum` on type `char` without
casting to `unsigned char` first, so it sign-extends 160 beyond what
`unsigned char` or `EOF` can represent. C says that has undefined
behavior. This problem is general to FileCheck's prefix parsing and
so exists independently of D79276 <https://reviews.llvm.org/D79276>. This patch changes all such calls
to explicitly cast to `unsigned char`, and it adds tests for cases
that are representative with or without D79276 <https://reviews.llvm.org/D79276>.
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D79810
Files:
llvm/lib/Support/FileCheck.cpp
llvm/test/FileCheck/bad-char.txt
Index: llvm/test/FileCheck/bad-char.txt
===================================================================
--- /dev/null
+++ llvm/test/FileCheck/bad-char.txt
@@ -0,0 +1,36 @@
+# This file contains characters that render as spaces (at least for me in vim)
+# but are encoded as 160. Each is indicated with a "^" on the following line.
+# FileCheck used to call functions like isalnum on each without casting to
+# unsigned char first, so it sign-extended beyond what unsigned char or EOF can
+# represent. C says that has undefined behavior, and it has caused stack dumps
+# under Windows.
+
+ BEFORE-PREFIX:
+^
+AFTER-PREFIX :
+ ^
+BEFORE-VAR: [[ VAR:]]
+ ^
+AFTER-VAR: [[VAR :]]
+ ^
+
+RUN: %ProtectFileCheckOutput \
+RUN: not FileCheck -check-prefix=BEFORE-PREFIX %s < /dev/null 2>&1 | \
+RUN: FileCheck -check-prefix=ERR-EMPTY-CHECK %s
+
+RUN: %ProtectFileCheckOutput \
+RUN: not FileCheck -check-prefix=AFTER-PREFIX %s < /dev/null 2>&1 | \
+RUN: FileCheck -check-prefix=ERR-NO-CHECK %s
+
+RUN: %ProtectFileCheckOutput \
+RUN: not FileCheck -check-prefix=BEFORE-VAR %s < /dev/null 2>&1 | \
+RUN: FileCheck -check-prefix=ERR-BAD-VAR %s
+
+RUN: %ProtectFileCheckOutput \
+RUN: not FileCheck -check-prefix=AFTER-VAR %s < /dev/null 2>&1 | \
+RUN: FileCheck -check-prefix=ERR-BAD-STRING-VAR %s
+
+ERR-EMPTY-CHECK: error: found empty check string
+ERR-NO-CHECK: error: no check strings found
+ERR-BAD-VAR: error: invalid variable name
+ERR-BAD-STRING-VAR: error: invalid name in string variable definition
Index: llvm/lib/Support/FileCheck.cpp
===================================================================
--- llvm/lib/Support/FileCheck.cpp
+++ llvm/lib/Support/FileCheck.cpp
@@ -151,7 +151,9 @@
return Regex::escape(*VarVal);
}
-bool Pattern::isValidVarNameStart(char C) { return C == '_' || isalpha(C); }
+bool Pattern::isValidVarNameStart(char C) {
+ return C == '_' || isalpha((unsigned char)C);
+}
Expected<Pattern::VariableProperties>
Pattern::parseVariable(StringRef &Str, const SourceMgr &SM) {
@@ -171,7 +173,7 @@
return ErrorDiagnostic::get(SM, Str, "invalid variable name");
// Variable names are composed of alphanumeric characters and underscores.
- if (Str[I] != '_' && !isalnum(Str[I]))
+ if (Str[I] != '_' && !isalnum((unsigned char)Str[I]))
break;
ParsedOneChar = true;
}
@@ -1079,7 +1081,7 @@
}
static bool IsPartOfWord(char c) {
- return (isalnum(c) || c == '-' || c == '_');
+ return (isalnum((unsigned char)c) || c == '-' || c == '_');
}
Check::FileCheckType &Check::FileCheckType::setCount(int C) {
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D79810.263522.patch
Type: text/x-patch
Size: 2634 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200512/3160e972/attachment-0001.bin>
More information about the llvm-commits
mailing list