[PATCH] D99889: [AsmParser][ms][X86] Fix possible misbehaviour in parsing of special tokens at start of string.
Anirudh Prasad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Apr 5 16:42:46 PDT 2021
anirudhp added a comment.
> It looks like this adds support for X86 identifiers starting with $ (I'm actually surprised this wasn't allowed before).
Well. Yes and no. If you take a look at the `parseIdentifier` routine in AsmParser.cpp, there is an edge case which covers when tokens start with @ and $, but there are conditions associated with that edge case. Namely, if you have a string which starts with [@|$] and the next token is an Identifier or an Integer, then you parse as an Integer. So if you have something like $abc, and you make a call to the `parseIdentifier` routine, then "$abc" will be parsed as a "whole" Identifier, even though, the lexer lexes it as {AsmToken::At, AsmToken::Identifier}. But if you have something like "$$abc", then a call to the `parseIdentifier` routine will "fail" (and in the case the lexer lexes it as {AsmToken::At, AsmToken::At, AsmToken::Identifier}.
An argument could be made that the parseIdentifier routine could be modified or overridden wherever appropriate and given custom logic.
Part of this patch is also exploratory. The way I interpreted the comment `allow $ @ ? characters at the start of symbol names. Defaults to false.`, told me that if "$", "@", "?" occurred at the start of the string, it should be lexed as part of an Identifier. However, now I don't know if that comment was meant to be written taking into account the `parseIdentifier` routine. In that case, the X86MCAsmInfo backend, should only set `AllowQuestionTokenAtStartOfString` to true. The other two are not "required" since `parseIdentifier` would "take care" of it. I'm also not too familiar with the ms semantics :)
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D99889/new/
https://reviews.llvm.org/D99889
More information about the llvm-commits
mailing list