[PATCH] D38545: [MC] - llvm-mc hangs on non-english characters.

George Rimar via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 4 08:47:18 PDT 2017


grimar created this revision.

This fixes PR33255.

Currently llvm-mc hangs in an infinite loop when parsing a file that
contains ".section .с", where the section name is a non-English character.

In this patch I also moved the content of `non-english-characters.s` to the
test/MC/AsmParser/Inputs folder, so that `non-english-characters.s` becomes a
single testcase for all invalid inputs containing non-English symbols. That is
convenient because otherwise llvm-mc tries to parse and tokenize the whole
testcase file, including the tool invocations, which makes it harder to isolate the issue.
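
For illustration only (this is not code from the patch): a minimal, self-contained
sketch of why guarding the loop on a pending error lets it terminate. It assumes a
toy lexer that records an error on a non-ASCII byte and does not advance past it;
whether that is exactly how the real lexer misbehaves is an assumption here.
`ToyLexer` and `scanName` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical toy lexer; a stand-in for illustration, not the LLVM lexer.
struct ToyLexer {
  std::string Input;
  std::size_t Pos = 0;
  bool PendingError = false;

  explicit ToyLexer(std::string S) : Input(std::move(S)) {}

  bool atEnd() const { return Pos >= Input.size(); }
  bool hasPendingError() const { return PendingError; }

  // Consumes one ASCII byte. On a non-ASCII byte it records an error and,
  // by assumption in this sketch, does not advance.
  void lex() {
    assert(!atEnd() && "lex() called past the end of input");
    unsigned char C = static_cast<unsigned char>(Input[Pos]);
    if (C >= 0x80) {
      PendingError = true;
      return;
    }
    ++Pos;
  }
};

// Scans a name up to a comma or the end of input and returns the number of
// bytes consumed. With `while (true)` this loop would never exit on a
// non-ASCII byte, because lex() makes no progress there.
std::size_t scanName(ToyLexer &L) {
  std::size_t Start = L.Pos;
  while (!L.hasPendingError()) {
    if (L.atEnd() || L.Input[L.Pos] == ',')
      break;
    L.lex();
  }
  return L.Pos - Start;
}
```

In this sketch the error flag is the only exit condition that does not depend on the
lexer making progress, which is why checking it in the loop header is enough to stop.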


https://reviews.llvm.org/D38545

Files:
  lib/MC/MCParser/ELFAsmParser.cpp
  test/MC/AsmParser/Inputs/non-english-characters-comments.s
  test/MC/AsmParser/Inputs/non-english-characters-section-name.s
  test/MC/AsmParser/non-english-characters.s


Index: test/MC/AsmParser/non-english-characters.s
===================================================================
--- test/MC/AsmParser/non-english-characters.s
+++ test/MC/AsmParser/non-english-characters.s
@@ -1,14 +1,9 @@
-# RUN: llvm-mc -triple i386-linux-gnu -filetype=obj -o %t %s
+# RUN: llvm-mc -triple i386-linux-gnu -filetype=obj -o %t \
+# RUN:   %S/Inputs/non-english-characters-comments.s
 # RUN: llvm-readobj %t | FileCheck %s
 # CHECK: Format: ELF32-i386
 
-# 0bム
-# 0xム
-# .ム4
-# .Xム
-# .1ム
-# .1eム
-# 0x.ム
-# 0x0pム
-.intel_syntax
-# 1ム
+# RUN: not llvm-mc -triple i386-linux-gnu -filetype=obj -o %t \
+# RUN:   %S/Inputs/non-english-characters-section-name.s 2>&1 | \
+# RUN:     FileCheck %s --check-prefix=ERR
+# ERR: invalid character in input
Index: test/MC/AsmParser/Inputs/non-english-characters-section-name.s
===================================================================
--- test/MC/AsmParser/Inputs/non-english-characters-section-name.s
+++ test/MC/AsmParser/Inputs/non-english-characters-section-name.s
@@ -0,0 +1 @@
+.section .ñ
Index: test/MC/AsmParser/Inputs/non-english-characters-comments.s
===================================================================
--- test/MC/AsmParser/Inputs/non-english-characters-comments.s
+++ test/MC/AsmParser/Inputs/non-english-characters-comments.s
@@ -0,0 +1,10 @@
+# 0bム
+# 0xム
+# .ム4
+# .Xム
+# .1ム
+# .1eム
+# 0x.ム
+# 0x0pム
+.intel_syntax
+# 1ム
Index: lib/MC/MCParser/ELFAsmParser.cpp
===================================================================
--- lib/MC/MCParser/ELFAsmParser.cpp
+++ lib/MC/MCParser/ELFAsmParser.cpp
@@ -247,7 +247,7 @@
     return false;
   }
 
-  while (true) {
+  while (!getParser().hasPendingError()) {
     SMLoc PrevLoc = getLexer().getLoc();
     if (getLexer().is(AsmToken::Comma) ||
       getLexer().is(AsmToken::EndOfStatement))




More information about the llvm-commits mailing list