[llvm] r196607 - Integrated assembler incorrectly lexes ARM-style comments
David Peixotto
dpeixott at codeaurora.org
Fri Dec 6 12:35:58 PST 2013
Author: dpeixott
Date: Fri Dec 6 14:35:58 2013
New Revision: 196607
URL: http://llvm.org/viewvc/llvm-project?rev=196607&view=rev
Log:
Integrated assembler incorrectly lexes ARM-style comments
The integrated assembler fails to properly lex arm comments when
they are adjacent to an identifier in the input stream. The reason
is that the arm comment symbol '@' is also used as symbol variant in
other assembly languages so when lexing an identifier it allows the
'@' symbol as part of the identifier.
Example:
$ cat comment.s
foo:
add r0, r0 at got to parse this as a comment
$ llvm-mc -triple armv7 comment.s
comment.s:4:18: error: unexpected token in argument list
add r0, r0 at got to parse this as a comment
^
This should be parsed as correctly as `add r0, r0`.
This commit modifes the assembly lexer to not include the '@' symbol
in identifiers when lexing for targets that use '@' for comments.
Added:
llvm/trunk/test/MC/ARM/comment.s
Modified:
llvm/trunk/lib/MC/MCParser/AsmLexer.cpp
Modified: llvm/trunk/lib/MC/MCParser/AsmLexer.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCParser/AsmLexer.cpp?rev=196607&r1=196606&r2=196607&view=diff
==============================================================================
--- llvm/trunk/lib/MC/MCParser/AsmLexer.cpp (original)
+++ llvm/trunk/lib/MC/MCParser/AsmLexer.cpp Fri Dec 6 14:35:58 2013
@@ -139,20 +139,23 @@ AsmToken AsmLexer::LexHexFloatLiteral(bo
}
/// LexIdentifier: [a-zA-Z_.][a-zA-Z0-9_$.@?]*
-static bool IsIdentifierChar(char c) {
- return isalnum(c) || c == '_' || c == '$' || c == '.' || c == '@' || c == '?';
+static bool IsIdentifierChar(char c, bool AllowAt) {
+ return isalnum(c) || c == '_' || c == '$' || c == '.' ||
+ (c == '@' && AllowAt) || c == '?';
}
AsmToken AsmLexer::LexIdentifier() {
+ bool AllowAtInIdentifier = !StringRef(MAI.getCommentString()).startswith("@");
// Check for floating point literals.
if (CurPtr[-1] == '.' && isdigit(*CurPtr)) {
// Disambiguate a .1243foo identifier from a floating literal.
while (isdigit(*CurPtr))
++CurPtr;
- if (*CurPtr == 'e' || *CurPtr == 'E' || !IsIdentifierChar(*CurPtr))
+ if (*CurPtr == 'e' || *CurPtr == 'E' ||
+ !IsIdentifierChar(*CurPtr, AllowAtInIdentifier))
return LexFloatLiteral();
}
- while (IsIdentifierChar(*CurPtr))
+ while (IsIdentifierChar(*CurPtr, AllowAtInIdentifier))
++CurPtr;
// Handle . as a special case.
Added: llvm/trunk/test/MC/ARM/comment.s
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/ARM/comment.s?rev=196607&view=auto
==============================================================================
--- llvm/trunk/test/MC/ARM/comment.s (added)
+++ llvm/trunk/test/MC/ARM/comment.s Fri Dec 6 14:35:58 2013
@@ -0,0 +1,24 @@
+@ Tests to check that '@' does not get lexed as an identifier for arm
+@ RUN: llvm-mc %s -triple=armv7-linux-gnueabi | FileCheck %s
+@ RUN: llvm-mc %s -triple=armv7-linux-gnueabi 2>&1 | FileCheck %s --check-prefix=ERROR
+
+foo:
+ bl boo at plt should be ignored
+ bl goo at plt
+ .long bar at got to parse this as a comment
+ .long baz at got
+ add r0, r0 at ignore this extra junk
+
+ at CHECK-LABEL: foo:
+ at CHECK: bl boo
+ at CHECK-NOT: @
+ at CHECK: bl goo
+ at CHECK-NOT: @
+ at CHECK: .long bar
+ at CHECK-NOT: @
+ at CHECK: .long baz
+ at CHECK-NOT: @
+ at CHECK: add r0, r0
+ at CHECK-NOT: @
+
+ at ERROR-NOT: error:
More information about the llvm-commits
mailing list