[llvm-commits] new patches for compiling linux with integrated-as

David Woodhouse dwmw2 at infradead.org
Fri Jan 31 04:34:33 PST 2014


On Sun, 2012-11-04 at 18:57 +0200, Pax Team wrote:
> 
> here's the latest batch of patches that are needed for compiling linux (the kernel).
> 
> - .octa support: this is a GNU as keyword used by some crypto code in linux.
>   nothing changed since the last submission last spring, would be nice if someone
>   looked at it finally ;). as i said back then, this is a minimal implementation,
>   full 128 bit expression support would require way too many changes (APInt would
>   have to be used throughout in MCExpr).

I've updated this and added test cases. However, when I look closer I
see that you've only partially extended AsmLexer::LexDigit() to cope
with 128-bit values.

I could live with being limited to 64-bit for decimal constants, and in
fact had fixed up your .octa implementation to cope. Previously, if
given '.octa 12345' with a small decimal constant, it would emit 12345
as *both* the high and low quadword, instead of zero as the high one.
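
To make the expected output concrete, here is a minimal standalone
sketch of the quadword split. It does not use LLVM: U128, widen() and
octaBytes() are hypothetical names made up for illustration, whereas
the patch itself works on APInt and emits through EmitIntValue().

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 128-bit constant; the patch itself uses APInt.
struct U128 {
  uint64_t hi; // upper quadword
  uint64_t lo; // lower quadword
};

// Widen a 64-bit constant. The high quadword is zero, not a copy of the low one.
inline U128 widen(uint64_t v) { return U128{0, v}; }

// Produce the 16 bytes that ".octa" should emit for the given byte order.
inline std::array<uint8_t, 16> octaBytes(U128 v, bool littleEndian) {
  std::array<uint8_t, 16> out{};
  for (int i = 0; i < 8; ++i) {
    uint8_t loByte = static_cast<uint8_t>(v.lo >> (8 * i));
    uint8_t hiByte = static_cast<uint8_t>(v.hi >> (8 * i));
    if (littleEndian) {
      out[i] = loByte;      // low quadword first, least significant byte first
      out[8 + i] = hiByte;
    } else {
      out[7 - i] = hiByte;  // high quadword first, most significant byte first
      out[15 - i] = loByte;
    }
  }
  return out;
}

// Usage: ".octa 12345" on a little-endian target.
inline void demoSmallConstant() {
  std::array<uint8_t, 16> bytes = octaBytes(widen(12345), true);
  assert(bytes[0] == 0x39 && bytes[1] == 0x30); // 12345 == 0x3039
  for (int i = 8; i < 16; ++i)
    assert(bytes[i] == 0); // high quadword is all zero
}
```

The point of widen() is only that a value which fits in 64 bits gets an
explicit zero for the upper half before anything is emitted.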

But with the patch as it stands we would be left in the rather strange
position that
	.octa 0x123456789123456789
would work, while
	.octa 123456789123456789h
would not.

I think you need to fix *all* integer parsing to cope with 128-bit
values, surely?
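
As a rough illustration of what "consistent" would mean here, the
following standalone sketch routes both spellings through one 128-bit
digit loop. It is not the LexDigit() code: U128, parseHexDigits() and
parseHexLiteral() are hypothetical names, and unlike the LLVM
convention these helpers return true on success.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a 128-bit constant; the patch itself uses APInt.
struct U128 {
  uint64_t hi;
  uint64_t lo;
};

// Accumulate hex digits into 128 bits. Returns true on success, false on an
// empty string, a non-hex character, or a value wider than 128 bits.
inline bool parseHexDigits(const std::string &digits, U128 &out) {
  if (digits.empty())
    return false;
  U128 v = {0, 0};
  for (char c : digits) {
    unsigned d;
    if (c >= '0' && c <= '9')
      d = c - '0';
    else if (c >= 'a' && c <= 'f')
      d = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F')
      d = c - 'A' + 10;
    else
      return false;
    if (v.hi >> 60)
      return false; // the next shift would drop set bits
    v.hi = (v.hi << 4) | (v.lo >> 60);
    v.lo = (v.lo << 4) | d;
  }
  out = v;
  return true;
}

// Accept both spellings of a hex literal: "0x1234" and "1234h".
inline bool parseHexLiteral(const std::string &s, U128 &out) {
  if (s.size() > 2 && s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))
    return parseHexDigits(s.substr(2), out);
  if (s.size() > 1 && (s.back() == 'h' || s.back() == 'H'))
    return parseHexDigits(s.substr(0, s.size() - 1), out);
  return false;
}
```

With a shared helper like this, 0x123456789123456789 and
123456789123456789h produce the same value by construction.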

There are also a bunch of places which just use getIntVal() on an
AsmToken::Integer token and assume that the result fits in an int64_t
without checking it. Won't you need to fix those too, even if you've
only allowed the 0x... case to be 128-bit? And perhaps add test cases
for the various 'wtf this is too big' cases that would introduce?
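
For illustration only, a checked accessor might look like the
standalone sketch below. U128 and narrowTo64() are hypothetical names,
not part of the patch. It assumes that the full unsigned 64-bit range
should stay valid, since the existing lexer casts an unsigned long long
result to int64_t.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 128-bit constant; the patch itself uses APInt.
struct U128 {
  uint64_t hi;
  uint64_t lo;
};

// Narrow to int64_t only if no bits above bit 63 are set. Returns true on
// success; on failure the caller should report "too big" instead of
// silently truncating.
inline bool narrowTo64(U128 v, int64_t &out) {
  if (v.hi != 0)
    return false;
  // Reinterpret the low quadword as signed, matching the old (int64_t) cast.
  // This wraps as two's complement (guaranteed from C++20, and the behaviour
  // of mainstream compilers before that).
  out = static_cast<int64_t>(v.lo);
  return true;
}
```

Every caller that currently takes the value unchecked would need to
handle the false return, which is where the extra 'too big' test cases
come from.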

Maybe, in fact, we should look at magically metamorphosing the
AsmToken::Integer tokens into a new type AsmToken::BigNum if the value
is larger than 64 bits? That might make the whole thing a lot simpler
than updating *all* users of AsmToken::Integer to cope with the fact
that it might now be larger.
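
A minimal standalone sketch of that idea, with made-up types rather
than the real AsmToken (TokenKind, Token, makeIntegerToken(),
parseQuad() and parseOcta() are all hypothetical names):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 128-bit constant; the patch itself uses APInt.
struct U128 {
  uint64_t hi;
  uint64_t lo;
};

enum class TokenKind { Integer, BigNum };

struct Token {
  TokenKind kind;
  U128 value;
};

// The lexer decides the kind once: anything wider than 64 bits is a BigNum.
inline Token makeIntegerToken(U128 v) {
  return Token{v.hi == 0 ? TokenKind::Integer : TokenKind::BigNum, v};
}

// A typical existing consumer: it already rejects non-Integer tokens, so a
// BigNum is refused without any new checking code. Returns true on success.
inline bool parseQuad(const Token &t, int64_t &out) {
  if (t.kind != TokenKind::Integer)
    return false;
  out = static_cast<int64_t>(t.value.lo);
  return true;
}

// Only a consumer that wants 128 bits needs to know about BigNum.
inline bool parseOcta(const Token &t, U128 &out) {
  if (t.kind != TokenKind::Integer && t.kind != TokenKind::BigNum)
    return false;
  out = t.value;
  return true;
}
```

The trade-off is that the size check lives in one place, the lexer,
while existing Integer consumers keep their current invariants.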


I can certainly cope with being limited to constants instead of full
expressions, since that accounts for all the usage I know of today. We
can always revisit this and look at using APInt throughout MCExpr if
there is demand for that.

From 595498f5b19beb8df5deec26b02f115bde00fe92 Mon Sep 17 00:00:00 2001
From: David Woodhouse <David.Woodhouse at intel.com>
Date: Fri, 31 Jan 2014 11:20:58 +0000
Subject: [PATCH] MC: Add support for .octa

This is a minimal implementation which accepts only constants rather than
full expressions, but that should be perfectly sufficient for all known
users for now.

Patch from PaX Team <pageexec at freemail.hu>
---
 include/llvm/MC/MCParser/MCAsmLexer.h | 12 ++++++--
 lib/MC/MCParser/AsmLexer.cpp          |  4 +--
 lib/MC/MCParser/AsmParser.cpp         | 56 +++++++++++++++++++++++++++++++++--
 test/MC/AsmParser/directive_values.s  |  8 +++++
 4 files changed, 74 insertions(+), 6 deletions(-)

diff --git a/include/llvm/MC/MCParser/MCAsmLexer.h b/include/llvm/MC/MCParser/MCAsmLexer.h
index 8edf3a4..dd9b683 100644
--- a/include/llvm/MC/MCParser/MCAsmLexer.h
+++ b/include/llvm/MC/MCParser/MCAsmLexer.h
@@ -10,6 +10,7 @@
 #ifndef LLVM_MC_MCPARSER_MCASMLEXER_H
 #define LLVM_MC_MCPARSER_MCASMLEXER_H
 
+#include "llvm/ADT/APInt.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/Support/Compiler.h"
 #include "llvm/Support/DataTypes.h"
@@ -57,12 +58,14 @@ private:
   /// a memory buffer owned by the source manager.
   StringRef Str;
 
-  int64_t IntVal;
+  APInt IntVal;
 
 public:
   AsmToken() {}
-  AsmToken(TokenKind _Kind, StringRef _Str, int64_t _IntVal = 0)
+  AsmToken(TokenKind _Kind, StringRef _Str, APInt _IntVal)
     : Kind(_Kind), Str(_Str), IntVal(_IntVal) {}
+  AsmToken(TokenKind _Kind, StringRef _Str, int64_t _IntVal = 0)
+    : Kind(_Kind), Str(_Str), IntVal(64, _IntVal, true) {}
 
   TokenKind getKind() const { return Kind; }
   bool is(TokenKind K) const { return Kind == K; }
@@ -99,6 +102,11 @@ public:
   // as a single token, then diagnose as an invalid number).
   int64_t getIntVal() const {
     assert(Kind == Integer && "This token isn't an integer!");
+    return IntVal.getZExtValue();
+  }
+
+  APInt getAPIntVal() const {
+    assert(Kind == Integer && "This token isn't an integer!");
     return IntVal;
   }
 };
diff --git a/lib/MC/MCParser/AsmLexer.cpp b/lib/MC/MCParser/AsmLexer.cpp
index ed98f93..b0de5e0 100644
--- a/lib/MC/MCParser/AsmLexer.cpp
+++ b/lib/MC/MCParser/AsmLexer.cpp
@@ -324,7 +324,7 @@ AsmToken AsmLexer::LexDigit() {
     if (CurPtr == NumStart)
       return ReturnError(CurPtr-2, "invalid hexadecimal number");
 
-    unsigned long long Result;
+    APInt Result(128, 0);
     if (StringRef(TokStart, CurPtr - TokStart).getAsInteger(0, Result))
       return ReturnError(TokStart, "invalid hexadecimal number");
 
@@ -337,7 +337,7 @@ AsmToken AsmLexer::LexDigit() {
     SkipIgnoredIntegerSuffix(CurPtr);
 
     return AsmToken(AsmToken::Integer, StringRef(TokStart, CurPtr - TokStart),
-                    (int64_t)Result);
+                    Result);
   }
 
   // Either octal or hexadecimal.
diff --git a/lib/MC/MCParser/AsmParser.cpp b/lib/MC/MCParser/AsmParser.cpp
index 3f813a7..7a2760d 100644
--- a/lib/MC/MCParser/AsmParser.cpp
+++ b/lib/MC/MCParser/AsmParser.cpp
@@ -337,8 +337,8 @@ private:
   enum DirectiveKind {
     DK_NO_DIRECTIVE, // Placeholder
     DK_SET, DK_EQU, DK_EQUIV, DK_ASCII, DK_ASCIZ, DK_STRING, DK_BYTE, DK_SHORT,
-    DK_VALUE, DK_2BYTE, DK_LONG, DK_INT, DK_4BYTE, DK_QUAD, DK_8BYTE, DK_SINGLE,
-    DK_FLOAT, DK_DOUBLE, DK_ALIGN, DK_ALIGN32, DK_BALIGN, DK_BALIGNW,
+    DK_VALUE, DK_2BYTE, DK_LONG, DK_INT, DK_4BYTE, DK_QUAD, DK_8BYTE, DK_OCTA,
+    DK_SINGLE, DK_FLOAT, DK_DOUBLE, DK_ALIGN, DK_ALIGN32, DK_BALIGN, DK_BALIGNW,
     DK_BALIGNL, DK_P2ALIGN, DK_P2ALIGNW, DK_P2ALIGNL, DK_ORG, DK_FILL, DK_ENDR,
     DK_BUNDLE_ALIGN_MODE, DK_BUNDLE_LOCK, DK_BUNDLE_UNLOCK,
     DK_ZERO, DK_EXTERN, DK_GLOBL, DK_GLOBAL,
@@ -367,6 +367,7 @@ private:
   // ".ascii", ".asciz", ".string"
   bool parseDirectiveAscii(StringRef IDVal, bool ZeroTerminated);
   bool parseDirectiveValue(unsigned Size); // ".byte", ".long", ...
+  bool parseDirectiveOctaValue(); // ".octa"
   bool parseDirectiveRealValue(const fltSemantics &); // ".single", ...
   bool parseDirectiveFill(); // ".fill"
   bool parseDirectiveZero(); // ".zero"
@@ -1374,6 +1375,8 @@ bool AsmParser::parseStatement(ParseStatementInfo &Info) {
     case DK_QUAD:
     case DK_8BYTE:
       return parseDirectiveValue(8);
+    case DK_OCTA:
+      return parseDirectiveOctaValue();
     case DK_SINGLE:
     case DK_FLOAT:
       return parseDirectiveRealValue(APFloat::IEEEsingle);
@@ -2308,6 +2311,54 @@ bool AsmParser::parseDirectiveValue(unsigned Size) {
   return false;
 }
 
+/// parseDirectiveOctaValue
+///  ::= .octa [ constant (, constant)* ]
+bool AsmParser::parseDirectiveOctaValue() {
+  if (getLexer().isNot(AsmToken::EndOfStatement)) {
+    checkForValidSection();
+
+    for (;;) {
+      if (Lexer.getKind() == AsmToken::Error)
+        return true;
+      if (Lexer.getKind() != AsmToken::Integer)
+        return TokError("unknown token in expression");
+
+      SMLoc ExprLoc = getLexer().getLoc();
+      APInt IntValue = getTok().getAPIntVal();
+      Lex();
+
+      uint64_t hi, lo;
+      if (IntValue.isIntN(64)) {
+        hi = 0;
+        lo = IntValue.getZExtValue();
+      } else if (IntValue.isIntN(128)) {
+        hi = IntValue.getHiBits(64).getZExtValue();
+        lo = IntValue.getLoBits(64).getZExtValue();
+      } else
+        return Error(ExprLoc, "literal value out of range for directive");
+
+      if (MAI.isLittleEndian()) {
+        getStreamer().EmitIntValue(lo, 8);
+        getStreamer().EmitIntValue(hi, 8);
+      } else {
+        getStreamer().EmitIntValue(hi, 8);
+        getStreamer().EmitIntValue(lo, 8);
+      }
+
+      if (getLexer().is(AsmToken::EndOfStatement))
+        break;
+
+      // FIXME: Improve diagnostic.
+      if (getLexer().isNot(AsmToken::Comma))
+        return TokError("unexpected token in directive");
+      Lex();
+    }
+  }
+
+  Lex();
+  return false;
+}
+
 /// parseDirectiveRealValue
 ///  ::= (.single | .double) [ expression (, expression)* ]
 bool AsmParser::parseDirectiveRealValue(const fltSemantics &Semantics) {
@@ -3791,6 +3842,7 @@ void AsmParser::initializeDirectiveKindMap() {
   DirectiveKindMap[".4byte"] = DK_4BYTE;
   DirectiveKindMap[".quad"] = DK_QUAD;
   DirectiveKindMap[".8byte"] = DK_8BYTE;
+  DirectiveKindMap[".octa"] = DK_OCTA;
   DirectiveKindMap[".single"] = DK_SINGLE;
   DirectiveKindMap[".float"] = DK_FLOAT;
   DirectiveKindMap[".double"] = DK_DOUBLE;
diff --git a/test/MC/AsmParser/directive_values.s b/test/MC/AsmParser/directive_values.s
index ed932b2..686df32 100644
--- a/test/MC/AsmParser/directive_values.s
+++ b/test/MC/AsmParser/directive_values.s
@@ -69,3 +69,11 @@ TEST8:
         .long 0x200000L+1
 # CHECK: .long 2097153
 # CHECK: .long 2097153
+
+TEST9:
+	.octa 0x1234567812345678abcdef, 123456789
+# CHECK: TEST9
+# CHECK: .quad 8652035380128501231
+# CHECK: .quad 1193046
+# CHECK: .quad 123456789
+# CHECK: .quad 0
-- 
1.8.3.1



-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse at intel.com                              Intel Corporation