[RFC] Make x86 AsmParser more conservative about acceptable registers for memory addressing.

David Woodhouse dwmw2 at infradead.org
Mon Jan 6 03:47:10 PST 2014


There is precedent (see r153166) for accepting nonsense in the AsmParser
and then just bailing out with an assert() in the code emitter. So I
blithely added a few more cases where this can happen, in r198566.

Although I was prepared to ignore it for the time being, this does suck.
So here's an attempt at making the AsmParser more conservative. I took
the naïve approach of adding the extra validation alongside the existing
validation (such as there is) that happens in ParseMemOperand().

However, there are a number of issues with this approach. Firstly, I
have to accept (%dx) as a "memory" operand because that might be used
for an unofficial form of in/out instructions. In ParseMemOperand() we
don't know what the instruction was, so we can't easily accept it *only*
for those cases... and there are conceptual issues with doing it that
way even if we were to make the information available.

Secondly, it looks like the validation doesn't happen for Intel-syntax
parsing; only for AT&T syntax.

Both of these issues could perhaps be solved by doing the
register-validation at a *later* stage, perhaps right after each call to
ParseOperand() at about line 2085/2096/2108 in ParseInstruction(). Or
perhaps in ParseOperand() itself, if we make sufficient information
available to it.

I'd like to get feedback before I spend time actually doing it that way
though... here's the simple version:

diff --git a/lib/Target/X86/AsmParser/X86AsmParser.cpp b/lib/Target/X86/AsmParser/X86AsmParser.cpp
index 86ba185..e64e23d 100644
--- a/lib/Target/X86/AsmParser/X86AsmParser.cpp
+++ b/lib/Target/X86/AsmParser/X86AsmParser.cpp
@@ -1853,10 +1853,11 @@ X86Operand *X86AsmParser::ParseMemOperand(unsigned SegReg, SMLoc MemStart) {
   // If we reached here, then we just ate the ( of the memory operand.  Process
   // the rest of the memory operand.
   unsigned BaseReg = 0, IndexReg = 0, Scale = 1;
-  SMLoc IndexLoc;
+  SMLoc IndexLoc, BaseLoc;
 
   if (getLexer().is(AsmToken::Percent)) {
     SMLoc StartLoc, EndLoc;
+    BaseLoc = Parser.getTok().getLoc();
     if (ParseRegister(BaseReg, StartLoc, EndLoc)) return 0;
     if (BaseReg == X86::EIZ || BaseReg == X86::RIZ) {
       Error(StartLoc, "eiz and riz can only be used as index registers",
@@ -1899,6 +1900,11 @@ X86Operand *X86AsmParser::ParseMemOperand(unsigned SegReg, SMLoc MemStart) {
           }
 
           // Validate the scale amount.
+	  if (X86MCRegisterClasses[X86::GR16RegClassID].contains(BaseReg) &&
+              ScaleVal != 1) {
+            Error(Loc, "scale factor in 16-bit address must be 1");
+            return 0;
+	  }
           if (ScaleVal != 1 && ScaleVal != 2 && ScaleVal != 4 && ScaleVal != 8){
             Error(Loc, "scale factor in address must be 1, 2, 4 or 8");
             return 0;
@@ -1929,6 +1935,21 @@ X86Operand *X86AsmParser::ParseMemOperand(unsigned SegReg, SMLoc MemStart) {
   SMLoc MemEnd = Parser.getTok().getEndLoc();
   Parser.Lex(); // Eat the ')'.
 
+  // Check for use of invalid 16-bit registers. Only BX/BP/SI/DI are allowed,
+  // and then only in non-64-bit modes. Except for DX, which is a special case
+  // because an unofficial form of in/out instructions uses it.
+  if (X86MCRegisterClasses[X86::GR16RegClassID].contains(BaseReg) &&
+      (is64BitMode() || (BaseReg != X86::BX && BaseReg != X86::BP &&
+                         BaseReg != X86::SI && BaseReg != X86::DI)) &&
+      BaseReg != X86::DX) {
+    Error(BaseLoc, "invalid 16-bit base register");
+    return 0;
+  }
+  if (BaseReg == 0 &&
+      X86MCRegisterClasses[X86::GR16RegClassID].contains(IndexReg)) {
+    Error(IndexLoc, "16-bit memory operand may not include only index register");
+    return 0;
+  }
   // If we have both a base register and an index register make sure they are
   // both 64-bit or 32-bit registers.
   // To support VSIB, IndexReg can be 128-bit or 256-bit registers.
@@ -1937,16 +1958,30 @@ X86Operand *X86AsmParser::ParseMemOperand(unsigned SegReg, SMLoc MemStart) {
         (X86MCRegisterClasses[X86::GR16RegClassID].contains(IndexReg) ||
          X86MCRegisterClasses[X86::GR32RegClassID].contains(IndexReg)) &&
         IndexReg != X86::RIZ) {
-      Error(IndexLoc, "index register is 32-bit, but base register is 64-bit");
+      Error(BaseLoc, "base register is 64-bit, but index register is not");
       return 0;
     }
     if (X86MCRegisterClasses[X86::GR32RegClassID].contains(BaseReg) &&
         (X86MCRegisterClasses[X86::GR16RegClassID].contains(IndexReg) ||
          X86MCRegisterClasses[X86::GR64RegClassID].contains(IndexReg)) &&
         IndexReg != X86::EIZ){
-      Error(IndexLoc, "index register is 64-bit, but base register is 32-bit");
+      Error(BaseLoc, "base register is 32-bit, but index register is not");
       return 0;
     }
+    if (X86MCRegisterClasses[X86::GR16RegClassID].contains(BaseReg)) {
+      if (X86MCRegisterClasses[X86::GR32RegClassID].contains(IndexReg) ||
+          X86MCRegisterClasses[X86::GR64RegClassID].contains(IndexReg)) {
+        Error(BaseLoc, "base register is 16-bit, but index register is not");
+        return 0;
+      }
+      if (((BaseReg == X86::BX || BaseReg == X86::BP) &&
+           IndexReg != X86::SI && IndexReg != X86::DI) ||
+          ((BaseReg == X86::SI || BaseReg == X86::DI) &&
+           IndexReg != X86::BX && IndexReg != X86::BP)) {
+        Error(BaseLoc, "invalid 16-bit base/index register combination");
+        return 0;
+      }
+    }
   }
 
   return X86Operand::CreateMem(SegReg, Disp, BaseReg, IndexReg, Scale,
diff --git a/test/MC/X86/x86_errors.s b/test/MC/X86/x86_errors.s
index a974233..51f2e8e 100644
--- a/test/MC/X86/x86_errors.s
+++ b/test/MC/X86/x86_errors.s
@@ -26,8 +26,23 @@ sysexitq
 lea (%rsp, %rbp, $4), %rax
 
 // rdar://10423777
-// 64: error: index register is 32-bit, but base register is 64-bit
+// 64: error: base register is 64-bit, but index register is not
 movq (%rsi,%ecx),%xmm0
 
+// 64: error: invalid 16-bit base register
+movl %eax,(%bp,%si)
+
+// 32: error: scale factor in 16-bit address must be 1
+movl %eax,(%bp,%si,2)
+
+// 32: error: invalid 16-bit base register
+movl %eax,(%cx)
+
+// 32: error: invalid 16-bit base/index register combination
+movl %eax,(%bp,%bx)
+
+// 32: error: 16-bit memory operand may not include only index register
+movl %eax,(,%bx)
+
 // 32: error: invalid operand for instruction
 outb al, 4



-- 
dwmw2

-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/x-pkcs7-signature
Size: 5745 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140106/da24b58e/attachment.bin>


More information about the llvm-commits mailing list