[PATCH] [TableGen][AsmMatcherEmitter] Only parse tokens following separators as registers

Ahmed Bougacha ahmed.bougacha at gmail.com
Wed May 20 17:53:48 PDT 2015


In http://reviews.llvm.org/D9844#174841, @craig.topper wrote:

> Does this still allow the problem with a hypothetical string of   ss${cc}cmp?


Indeed it does; we can check for separators in both directions instead. See the new patch:

- Look at all the separators, not just space and comma.
- Add the PR testcase.

There's a special case for '$$', because Mips uses '$$reg': '$' is the RegisterPrefix, and the best way of escaping it is '$$'.  '\\$reg' would tokenize into separate '$' and 'reg' tokens, which isn't OK.
Changing the tokenizer to not consider '\' a separator is a whole 'nother can of worms, as that leads to behavior changes on:

  "\\{$Vd, $dst2, $dst3, $dst4\\}"

where we would then treat "dst4\}" as a single token.
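
For illustration, the isolation check described above can be sketched outside of TableGen. This is only a rough standalone approximation: the helper name isIsolatedToken and the use of std::string_view are assumptions made for this example (the patch itself uses StringRef inside MatchableInfo::addAsmOperand), and the Start/End offsets are supplied by hand rather than produced by the real tokenizer.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical standalone helper mirroring the check added in the patch.
// [Start, End) is the token's range within the asm string.
bool isIsolatedToken(std::string_view Asm, std::size_t Start, std::size_t End) {
  // Same separator set as in the patch.
  constexpr std::string_view Separators = "[]*! \t,";

  // Left side: start of string, a separator, or an escaped '$' ("$$").
  bool LeftOK = Start == 0 ||
                Separators.find(Asm[Start - 1]) != std::string_view::npos ||
                Asm.substr(Start - 1, 2) == "$$";

  // Right side: end of string or a separator.
  bool RightOK = End >= Asm.size() ||
                 Separators.find(Asm[End]) != std::string_view::npos;

  return LeftOK && RightOK;
}
```

With this sketch, the "ss" in "ss${cc}cmp" (range [0, 2)) is not isolated because it is followed by '$', while the "$dst" in "cmpss $dst, $src" (range [6, 10)) is isolated because it sits between ' ' and ','.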


http://reviews.llvm.org/D9844

Files:
  test/MC/X86/intel-syntax.s
  utils/TableGen/AsmMatcherEmitter.cpp

Index: utils/TableGen/AsmMatcherEmitter.cpp
===================================================================
--- utils/TableGen/AsmMatcherEmitter.cpp
+++ utils/TableGen/AsmMatcherEmitter.cpp
@@ -310,11 +310,16 @@
     /// The suboperand index within SrcOpName, or -1 for the entire operand.
     int SubOpIdx;
 
+    /// Whether the token is "isolated", i.e., it is preceded and followed
+    /// by separators.
+    bool IsIsolatedToken;
+
     /// Register record if this token is singleton register.
     Record *SingletonReg;
 
-    explicit AsmOperand(StringRef T)
-        : Token(T), Class(nullptr), SubOpIdx(-1), SingletonReg(nullptr) {}
+    explicit AsmOperand(bool IsIsolatedToken, StringRef T)
+        : Token(T), Class(nullptr), SubOpIdx(-1),
+          IsIsolatedToken(IsIsolatedToken), SingletonReg(nullptr) {}
   };
 
   /// ResOperand - This represents a single operand in the result instruction
@@ -805,7 +810,14 @@
 
 void MatchableInfo::addAsmOperand(size_t Start, size_t End) {
   StringRef String = AsmString;
-  AsmOperands.push_back(AsmOperand(String.slice(Start, End)));
+  StringRef Separators = "[]*! \t,";
+  // Look for separators before and after to figure out if this token is
+  // isolated.  Accept '$$' as that's how we escape '$'.
+  bool IsIsolatedToken =
+      (!Start || Separators.find(String[Start - 1]) != StringRef::npos ||
+       String.substr(Start - 1, 2) == "$$") &&
+      (End >= String.size() || Separators.find(String[End]) != StringRef::npos);
+  AsmOperands.push_back(AsmOperand(IsIsolatedToken, String.slice(Start, End)));
 }
 
 /// tokenizeAsmString - Tokenize a simplified assembly string.
@@ -960,6 +972,11 @@
                                       std::string &RegisterPrefix) {
   StringRef Tok = AsmOperands[OperandNo].Token;
 
+  // If this token is not an isolated token, i.e., it isn't separated from
+  // other tokens (e.g. with whitespace), don't interpret it as a register name.
+  if (!AsmOperands[OperandNo].IsIsolatedToken)
+    return;
+
   if (RegisterPrefix.empty()) {
     std::string LoweredTok = Tok.lower();
     if (const CodeGenRegister *Reg = Info.Target.getRegisterByName(LoweredTok))
@@ -1508,7 +1525,7 @@
       // Insert remaining suboperands after AsmOpIdx in II->AsmOperands.
       StringRef Token = Op->Token; // save this in case Op gets moved
       for (unsigned SI = 1, SE = Operands[Idx].MINumOperands; SI != SE; ++SI) {
-        MatchableInfo::AsmOperand NewAsmOp(Token);
+        MatchableInfo::AsmOperand NewAsmOp(/*IsIsolatedToken=*/true, Token);
         NewAsmOp.SubOpIdx = SI;
         II->AsmOperands.insert(II->AsmOperands.begin()+AsmOpIdx+SI, NewAsmOp);
       }
Index: test/MC/X86/intel-syntax.s
===================================================================
--- test/MC/X86/intel-syntax.s
+++ test/MC/X86/intel-syntax.s
@@ -662,3 +662,6 @@
 // CHECK: fnsave (%eax)
 // CHECK: fxrstor (%eax)
 // CHECK: frstor (%eax)
+
+// CHECK: cmpnless %xmm1, %xmm0
+cmpnless xmm0, xmm1
