[PATCH] [TableGen][AsmMatcherEmitter] Only parse tokens following separators as registers
Ahmed Bougacha
ahmed.bougacha at gmail.com
Wed May 20 17:53:48 PDT 2015
In http://reviews.llvm.org/D9844#174841, @craig.topper wrote:
> Does this still allow the problem with a hypothetical string of ss${cc}cmp?
Indeed it does. We can look in both directions, so the new patch requires a separator both before and after the token:
- Look at all the separators, not just space and comma.
- Add the PR testcase.
There's a special case for '$$', because Mips uses '$$reg', as '$' is the RegisterPrefix, and the best way of escaping it is '$$'. '\\$reg' will tokenize into separate '$' and 'reg', which isn't OK.
Changing that to not consider '\' a separator is a whole 'nother can of worms, as that leads to behavior changes on:
"\\{$Vd, $dst2, $dst3, $dst4\\}"
where we now think "dst4\}" is a single token.
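For readers who want to try the isolation rule outside TableGen, here is a minimal standalone sketch of the check. It assumes std::string in place of StringRef, and the free function name isIsolatedToken is hypothetical (the patch computes the same condition inline in MatchableInfo::addAsmOperand). The example strings in the note below are illustrative, not copied from any .td file.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the patch: returns true when the token
// AsmString[Start, End) has a separator (or the string boundary) on both sides.
// Assumes Start <= End <= AsmString.size().
bool isIsolatedToken(const std::string &AsmString, std::size_t Start,
                     std::size_t End) {
  // Same separator set as the patch.
  const std::string Separators = "[]*! \t,";

  // Preceded by the start of the string, a separator, or an escaped '$'
  // (the previous character and the first token character are both '$').
  bool PrecededOK =
      Start == 0 ||
      Separators.find(AsmString[Start - 1]) != std::string::npos ||
      AsmString.compare(Start - 1, 2, "$$") == 0;

  // Followed by the end of the string or a separator.
  bool FollowedOK =
      End >= AsmString.size() ||
      Separators.find(AsmString[End]) != std::string::npos;

  return PrecededOK && FollowedOK;
}
```

With this sketch, the "ss" in "cmp${cc}ss $dst, $src" is not isolated (it follows '}'), and neither is the "ss" in "ss${cc}cmp" (it is followed by '$'), so neither would be considered as a register name. A leading "ss" followed by a space is isolated, as is "$reg" in "$$reg".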
http://reviews.llvm.org/D9844
Files:
test/MC/X86/intel-syntax.s
utils/TableGen/AsmMatcherEmitter.cpp
Index: utils/TableGen/AsmMatcherEmitter.cpp
===================================================================
--- utils/TableGen/AsmMatcherEmitter.cpp
+++ utils/TableGen/AsmMatcherEmitter.cpp
@@ -310,11 +310,16 @@
/// The suboperand index within SrcOpName, or -1 for the entire operand.
int SubOpIdx;
+ /// Whether the token is "isolated", i.e., it is preceded and followed
+ /// by separators.
+ bool IsIsolatedToken;
+
/// Register record if this token is singleton register.
Record *SingletonReg;
- explicit AsmOperand(StringRef T)
- : Token(T), Class(nullptr), SubOpIdx(-1), SingletonReg(nullptr) {}
+ explicit AsmOperand(bool IsIsolatedToken, StringRef T)
+ : Token(T), Class(nullptr), SubOpIdx(-1),
+ IsIsolatedToken(IsIsolatedToken), SingletonReg(nullptr) {}
};
/// ResOperand - This represents a single operand in the result instruction
@@ -805,7 +810,14 @@
void MatchableInfo::addAsmOperand(size_t Start, size_t End) {
StringRef String = AsmString;
- AsmOperands.push_back(AsmOperand(String.slice(Start, End)));
+ StringRef Separators = "[]*! \t,";
+ // Look for separators before and after to figure out if this token is
+ // isolated. Accept '$$' as that's how we escape '$'.
+ bool IsIsolatedToken =
+ (!Start || Separators.find(String[Start - 1]) != StringRef::npos ||
+ String.substr(Start - 1, 2) == "$$") &&
+ (End >= String.size() || Separators.find(String[End]) != StringRef::npos);
+ AsmOperands.push_back(AsmOperand(IsIsolatedToken, String.slice(Start, End)));
}
/// tokenizeAsmString - Tokenize a simplified assembly string.
@@ -960,6 +972,11 @@
std::string &RegisterPrefix) {
StringRef Tok = AsmOperands[OperandNo].Token;
+ // If this token is not an isolated token, i.e., it isn't separated from
+ // other tokens (e.g. with whitespace), don't interpret it as a register name.
+ if (!AsmOperands[OperandNo].IsIsolatedToken)
+ return;
+
if (RegisterPrefix.empty()) {
std::string LoweredTok = Tok.lower();
if (const CodeGenRegister *Reg = Info.Target.getRegisterByName(LoweredTok))
@@ -1508,7 +1525,7 @@
// Insert remaining suboperands after AsmOpIdx in II->AsmOperands.
StringRef Token = Op->Token; // save this in case Op gets moved
for (unsigned SI = 1, SE = Operands[Idx].MINumOperands; SI != SE; ++SI) {
- MatchableInfo::AsmOperand NewAsmOp(Token);
+ MatchableInfo::AsmOperand NewAsmOp(/*IsIsolatedToken=*/true, Token);
NewAsmOp.SubOpIdx = SI;
II->AsmOperands.insert(II->AsmOperands.begin()+AsmOpIdx+SI, NewAsmOp);
}
Index: test/MC/X86/intel-syntax.s
===================================================================
--- test/MC/X86/intel-syntax.s
+++ test/MC/X86/intel-syntax.s
@@ -662,3 +662,6 @@
// CHECK: fnsave (%eax)
// CHECK: fxrstor (%eax)
// CHECK: frstor (%eax)
+
+// CHECK: cmpnless %xmm1, %xmm0
+cmpnless xmm0, xmm1