[llvm] r196607 - Integrated assembler incorrectly lexes ARM-style comments

Jim Grosbach grosbach at apple.com
Fri Dec 6 13:33:29 PST 2013


Seems reasonable in general. One comment below.


On Dec 6, 2013, at 12:35 PM, David Peixotto <dpeixott at codeaurora.org> wrote:

> Author: dpeixott
> Date: Fri Dec  6 14:35:58 2013
> New Revision: 196607
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=196607&view=rev
> Log:
> Integrated assembler incorrectly lexes ARM-style comments
> 
> The integrated assembler fails to properly lex arm comments when
> they are adjacent to an identifier in the input stream. The reason
> is that the arm comment symbol '@' is also used as symbol variant in
> other assembly languages so when lexing an identifier it allows the
> '@' symbol as part of the identifier.
> 
> Example:
>  $ cat comment.s
>  foo:
>    add r0, r0 at got to parse this as a comment
> 
>  $ llvm-mc -triple armv7 comment.s
>  comment.s:4:18: error: unexpected token in argument list
>    add r0, r0 at got to parse this as a comment
>                   ^
> This should be parsed as correctly as `add r0, r0`.
> 
> This commit modifes the assembly lexer to not include the '@' symbol
> in identifiers when lexing for targets that use '@' for comments.
> 
> Added:
>    llvm/trunk/test/MC/ARM/comment.s
> Modified:
>    llvm/trunk/lib/MC/MCParser/AsmLexer.cpp
> 
> Modified: llvm/trunk/lib/MC/MCParser/AsmLexer.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCParser/AsmLexer.cpp?rev=196607&r1=196606&r2=196607&view=diff
> ==============================================================================
> --- llvm/trunk/lib/MC/MCParser/AsmLexer.cpp (original)
> +++ llvm/trunk/lib/MC/MCParser/AsmLexer.cpp Fri Dec  6 14:35:58 2013
> @@ -139,20 +139,23 @@ AsmToken AsmLexer::LexHexFloatLiteral(bo
> }
> 
> /// LexIdentifier: [a-zA-Z_.][a-zA-Z0-9_$.@?]*
> -static bool IsIdentifierChar(char c) {
> -  return isalnum(c) || c == '_' || c == '$' || c == '.' || c == '@' || c == '?';
> +static bool IsIdentifierChar(char c, bool AllowAt) {
> +  return isalnum(c) || c == '_' || c == '$' || c == '.' ||
> +         (c == '@' && AllowAt) || c == '?';
> }
> AsmToken AsmLexer::LexIdentifier() {
> +  bool AllowAtInIdentifier = !StringRef(MAI.getCommentString()).startswith("@“);

Perhaps move this bool to a class variable initialized early so we don’t have to query the MAI directly every identifier?

>   // Check for floating point literals.
>   if (CurPtr[-1] == '.' && isdigit(*CurPtr)) {
>     // Disambiguate a .1243foo identifier from a floating literal.
>     while (isdigit(*CurPtr))
>       ++CurPtr;
> -    if (*CurPtr == 'e' || *CurPtr == 'E' || !IsIdentifierChar(*CurPtr))
> +    if (*CurPtr == 'e' || *CurPtr == 'E' ||
> +        !IsIdentifierChar(*CurPtr, AllowAtInIdentifier))
>       return LexFloatLiteral();
>   }
> 
> -  while (IsIdentifierChar(*CurPtr))
> +  while (IsIdentifierChar(*CurPtr, AllowAtInIdentifier))
>     ++CurPtr;
> 
>   // Handle . as a special case.
> 
> Added: llvm/trunk/test/MC/ARM/comment.s
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/ARM/comment.s?rev=196607&view=auto
> ==============================================================================
> --- llvm/trunk/test/MC/ARM/comment.s (added)
> +++ llvm/trunk/test/MC/ARM/comment.s Fri Dec  6 14:35:58 2013
> @@ -0,0 +1,24 @@
> +@ Tests to check that '@' does not get lexed as an identifier for arm
> +@ RUN: llvm-mc %s -triple=armv7-linux-gnueabi  | FileCheck %s
> +@ RUN: llvm-mc %s -triple=armv7-linux-gnueabi 2>&1 | FileCheck %s --check-prefix=ERROR
> +
> +foo:
> +  bl boo at plt should be ignored
> +  bl goo at plt
> +  .long bar at got to parse this as a comment
> +  .long baz at got
> +  add r0, r0 at ignore this extra junk
> +
> + at CHECK-LABEL: foo:
> + at CHECK: bl boo
> + at CHECK-NOT: @
> + at CHECK: bl goo
> + at CHECK-NOT: @
> + at CHECK: .long bar
> + at CHECK-NOT: @
> + at CHECK: .long baz
> + at CHECK-NOT: @
> + at CHECK: add r0, r0
> + at CHECK-NOT: @
> +
> + at ERROR-NOT: error:
> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits





More information about the llvm-commits mailing list