Secure Virtual Architecture: A Safe Execution Environment for
Commodity Operating Systems

John Criswell, Andrew Lenharth, Dinakar Dhurjati, and Vikram Adve

Abstract:
This paper describes an efficient and robust approach to providing a safe
execution environment for an entire operating system, such as Linux, and
all its applications. The approach, which we call Secure Virtual
Architecture (SVA), defines a virtual, low-level, typed instruction set
suitable for executing all code on a system, including kernel and
application code. SVA code is translated for execution by a virtual
machine transparently, offline or online. SVA aims to enforce fine-grained
(object-level) memory safety, control-flow integrity, type safety for a
subset of objects, and sound analysis. A virtual machine implementing SVA
achieves these goals by exploiting properties of existing memory pools in
the kernel and by preserving the kernel's explicit control over memory,
including custom allocators and explicit deallocation. Furthermore, the
safety properties can be encoded compactly as extensions to the SVA type
system, allowing the (complex) safety-checking compiler to remain outside
the trusted computing base. SVA also defines a set of OS interface
operations that abstract all privileged hardware instructions, allowing
the virtual machine to monitor all privileged operations and control the
physical resources on a given hardware platform. We have ported the Linux
kernel to SVA, treating it as a new architecture, and made only minimal
code changes (less than 300 lines) to the machine-independent parts of the
kernel and device drivers. SVA prevents 4 out of the 5 memory safety
exploits previously reported for the Linux 2.4.22 kernel for which exploit
code is available, and would prevent the fifth simply by compiling an
additional kernel library.

A bit container provides an efficient way to store and perform set
operations on sets of numeric IDs, while automatically eliminating
duplicates. Bit containers require at most one bit for each identifier
you want to store.

The BitVector container provides a fixed-size set of bits for manipulation.
It supports individual bit setting and testing, as well as set operations.
Set operations take time O(size of bitvector), but they are performed one
word at a time rather than one bit at a time, which makes BitVector very
fast for set operations compared to other containers. Use BitVector when
you expect the number of set bits to be high (i.e., a dense set).
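For illustration, here is a minimal sketch of typical BitVector usage. It
assumes the llvm/ADT/BitVector.h header; the scenario and names below are
hypothetical:

  #include "llvm/ADT/BitVector.h"
  using namespace llvm;

  // Track liveness of 1024 hypothetical registers.
  void bitVectorExample() {
    BitVector Live(1024);             // all bits start cleared
    BitVector Dirty(1024);

    Live.set(5);                      // mark register 5 live
    Dirty.set(5);
    Dirty.set(900);

    if (Live[5]) {
      // register 5 is live
    }

    Live |= Dirty;                    // set union, one word at a time
    Live &= Dirty;                    // set intersection
    unsigned NumLive = Live.count();  // number of set bits
    (void)NumLive;
  }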
The SparseBitVector container is much like BitVector, with one major
difference: only the bits that are set are stored. This makes SparseBitVector
much more space-efficient than BitVector when the set is sparse, and it makes
set operations O(number of set bits) instead of O(size of universe). The
downside is that setting and testing random bits in a SparseBitVector is
O(N), which on large SparseBitVectors can be slower than BitVector. In our
implementation, setting or testing bits in sorted order (either forward or
reverse) is O(1) worst case. Testing and setting bits within 128 bits
(depending on size) of the current bit is also O(1). As a general statement,
testing or setting bits in a SparseBitVector is O(distance from the last
set bit).
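A minimal usage sketch, again with hypothetical names, assuming the
llvm/ADT/SparseBitVector.h header:

  #include "llvm/ADT/SparseBitVector.h"
  using namespace llvm;

  void sparseBitVectorExample() {
    SparseBitVector<> PointsTo;
    PointsTo.set(5);              // cheap: near the last accessed bit
    PointsTo.set(1000000);        // sparse: only set bits use memory

    SparseBitVector<> Other;
    Other.set(5);

    PointsTo |= Other;            // union, O(number of set bits)

    // Iteration visits the indices of the set bits in order.
    for (SparseBitVector<>::iterator I = PointsTo.begin(),
                                     E = PointsTo.end(); I != E; ++I) {
      unsigned Idx = *I;          // index of a set bit
      (void)Idx;
    }
  }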
From baldrick at free.fr Mon Sep 24 11:59:54 2007
From: baldrick at free.fr (Duncan Sands)
Date: Mon, 24 Sep 2007 20:59:54 +0200
Subject: [llvm-commits] [llvm] r42266 -
/llvm/trunk/test/CFrontend/2007-09-17-WeakRef.c
In-Reply-To: <200709241714.l8OHEr6s025705@zion.cs.uiuc.edu>
References: <200709241714.l8OHEr6s025705@zion.cs.uiuc.edu>
Message-ID: <200709242059.56590.baldrick@free.fr>
Hi Tanya,
> XFAIL for llvm-gcc4.0
I'm pretty sure this only failed for llvm-gcc-4.2
but passed with 4.0... Did you mean 4.2 here?
Thanks,
Duncan.
From tonic at nondot.org Mon Sep 24 14:12:26 2007
From: tonic at nondot.org (Tanya M. Lattner)
Date: Mon, 24 Sep 2007 14:12:26 -0700 (PDT)
Subject: [llvm-commits] [llvm] r42266 -
/llvm/trunk/test/CFrontend/2007-09-17-WeakRef.c
In-Reply-To: <200709242059.56590.baldrick@free.fr>
References: <200709241714.l8OHEr6s025705@zion.cs.uiuc.edu>
<200709242059.56590.baldrick@free.fr>
Message-ID:
> I'm pretty sure this only failed for llvm-gcc-4.2
> but passed with 4.0... Did you mean 4.2 here?
No. This fails for me on llvm-gcc4.0.
-Tanya
>
> Thanks,
>
> Duncan.
>
From djg at cray.com Mon Sep 24 12:25:06 2007
From: djg at cray.com (Dan Gohman)
Date: Mon, 24 Sep 2007 19:25:06 -0000
Subject: [llvm-commits] [llvm] r42268 -
/llvm/trunk/lib/Target/X86/X86InstrInfo.td
Message-ID: <200709241925.l8OJP6Gp029149@zion.cs.uiuc.edu>
Author: djg
Date: Mon Sep 24 14:25:06 2007
New Revision: 42268
URL: http://llvm.org/viewvc/llvm-project?rev=42268&view=rev
Log:
Fix the syntax for the .loc directive in preparation for using it.
Modified:
llvm/trunk/lib/Target/X86/X86InstrInfo.td
Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.td
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.td?rev=42268&r1=42267&r2=42268&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86InstrInfo.td (original)
+++ llvm/trunk/lib/Target/X86/X86InstrInfo.td Mon Sep 24 14:25:06 2007
@@ -2434,7 +2434,7 @@
def DWARF_LOC : I<0, Pseudo, (outs),
(ins i32imm:$line, i32imm:$col, i32imm:$file),
- "; .loc ${file:debug}, ${line:debug}, ${col:debug}",
+ ".loc\t${file:debug} ${line:debug} ${col:debug}",
[(dwarf_loc (i32 imm:$line), (i32 imm:$col),
(i32 imm:$file))]>;
From dberlin at dberlin.org Mon Sep 24 12:45:50 2007
From: dberlin at dberlin.org (Daniel Berlin)
Date: Mon, 24 Sep 2007 19:45:50 -0000
Subject: [llvm-commits] [llvm] r42269 - in /llvm/trunk:
include/llvm/ADT/DenseMap.h include/llvm/ADT/SparseBitVector.h
lib/Analysis/IPA/Andersens.cpp
Message-ID: <200709241945.l8OJjo1M029802@zion.cs.uiuc.edu>
Author: dannyb
Date: Mon Sep 24 14:45:49 2007
New Revision: 42269
URL: http://llvm.org/viewvc/llvm-project?rev=42269&view=rev
Log:
Implement offline variable substitution in order to reduce memory
and time usage.
Fixup operator == to make this work, and add a resize method to DenseMap
so we can resize our hashtable once we know how big it should be.
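A rough sketch of the intended usage of the new resize hook (illustrative
names, not part of the patch):

  #include "llvm/ADT/DenseMap.h"
  #include <vector>
  using namespace llvm;

  void presizeExample(const std::vector<void*> &Objects) {
    DenseMap<void*, unsigned> NodeOf;
    // Pre-grow the table once the final size is known, so the
    // insertions below never trigger an incremental rehash.
    NodeOf.resize(Objects.size());
    for (unsigned i = 0, e = Objects.size(); i != e; ++i)
      NodeOf[Objects[i]] = i;
  }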
Modified:
llvm/trunk/include/llvm/ADT/DenseMap.h
llvm/trunk/include/llvm/ADT/SparseBitVector.h
llvm/trunk/lib/Analysis/IPA/Andersens.cpp
Modified: llvm/trunk/include/llvm/ADT/DenseMap.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/DenseMap.h?rev=42269&r1=42268&r2=42269&view=diff
==============================================================================
--- llvm/trunk/include/llvm/ADT/DenseMap.h (original)
+++ llvm/trunk/include/llvm/ADT/DenseMap.h Mon Sep 24 14:45:49 2007
@@ -99,6 +99,9 @@
bool empty() const { return NumEntries == 0; }
unsigned size() const { return NumEntries; }
+
+ /// Grow the densemap so that it has at least Size buckets. Does not shrink
+ void resize(size_t Size) { grow(Size); }
void clear() {
// If the capacity of the array is huge, and the # elements used is small,
@@ -228,7 +231,7 @@
// causing infinite loops in lookup.
if (NumEntries*4 >= NumBuckets*3 ||
NumBuckets-(NumEntries+NumTombstones) < NumBuckets/8) {
- this->grow();
+ this->grow(NumBuckets * 2);
LookupBucketFor(Key, TheBucket);
}
++NumEntries;
@@ -310,12 +313,13 @@
new (&Buckets[i].first) KeyT(EmptyKey);
}
- void grow() {
+ void grow(unsigned AtLeast) {
unsigned OldNumBuckets = NumBuckets;
BucketT *OldBuckets = Buckets;
// Double the number of buckets.
- NumBuckets <<= 1;
+ while (NumBuckets <= AtLeast)
+ NumBuckets <<= 1;
NumTombstones = 0;
  Buckets = reinterpret_cast<BucketT*>(new char[sizeof(BucketT)*NumBuckets]);
Modified: llvm/trunk/include/llvm/ADT/SparseBitVector.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/SparseBitVector.h?rev=42269&r1=42268&r2=42269&view=diff
==============================================================================
--- llvm/trunk/include/llvm/ADT/SparseBitVector.h (original)
+++ llvm/trunk/include/llvm/ADT/SparseBitVector.h Mon Sep 24 14:45:49 2007
@@ -75,7 +75,6 @@
}
  friend struct ilist_traits<SparseBitVectorElement<ElementSize> >;
-
public:
explicit SparseBitVectorElement(unsigned Idx) {
ElementIndex = Idx;
@@ -287,6 +286,14 @@
}
BecameZero = allzero;
}
+ // Get a hash value for this element;
+ uint64_t getHashValue() const {
+ uint64_t HashVal = 0;
+ for (unsigned i = 0; i < BITWORDS_PER_ELEMENT; ++i) {
+ HashVal ^= Bits[i];
+ }
+ return HashVal;
+ }
};
template <unsigned ElementSize = 128>
@@ -544,22 +551,20 @@
return false;
}
- bool operator!=(const SparseBitVector &RHS) {
+ bool operator!=(const SparseBitVector &RHS) const {
return !(*this == RHS);
}
- bool operator==(const SparseBitVector &RHS) {
+ bool operator==(const SparseBitVector &RHS) const {
ElementListConstIter Iter1 = Elements.begin();
ElementListConstIter Iter2 = RHS.Elements.begin();
- while (Iter2 != RHS.Elements.end()) {
- if (Iter1->index() != Iter2->index()
- || *Iter1 != *Iter2)
+ for (; Iter1 != Elements.end() && Iter2 != RHS.Elements.end();
+ ++Iter1, ++Iter2) {
+ if (*Iter1 != *Iter2)
return false;
- ++Iter1;
- ++Iter2;
}
- return Iter1 == Elements.end();
+ return Iter1 == Elements.end() && Iter2 == RHS.Elements.end();
}
// Union our bitmap with the RHS and return true if we changed.
@@ -789,6 +794,17 @@
return iterator(this, ~0);
}
+ // Get a hash value for this bitmap.
+ uint64_t getHashValue() const {
+ uint64_t HashVal = 0;
+ for (ElementListConstIter Iter = Elements.begin();
+ Iter != Elements.end();
+ ++Iter) {
+ HashVal ^= Iter->index();
+ HashVal ^= Iter->getHashValue();
+ }
+ return HashVal;
+ }
};
// Convenience functions to allow Or and And without dereferencing in the user
@@ -828,9 +844,10 @@
for (bi = LHS.begin(); bi != LHS.end(); ++bi) {
out << *bi << " ";
}
- out << "\n";
+ out << " ]\n";
}
-
}
+
+
#endif
Modified: llvm/trunk/lib/Analysis/IPA/Andersens.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/IPA/Andersens.cpp?rev=42269&r1=42268&r2=42269&view=diff
==============================================================================
--- llvm/trunk/lib/Analysis/IPA/Andersens.cpp (original)
+++ llvm/trunk/lib/Analysis/IPA/Andersens.cpp Mon Sep 24 14:45:49 2007
@@ -16,7 +16,8 @@
// This algorithm is implemented as three stages:
// 1. Object identification.
// 2. Inclusion constraint identification.
-// 3. Inclusion constraint solving.
+// 3. Offline constraint graph optimization
+// 4. Inclusion constraint solving.
//
// The object identification stage identifies all of the memory objects in the
// program, which includes globals, heap allocated objects, and stack allocated
@@ -29,20 +30,25 @@
// B can point to. Constraints can handle copies, loads, and stores, and
// address taking.
//
+// The Offline constraint graph optimization portion includes offline variable
+// substitution algorithms intended to pointer and location equivalences.
+// Pointer equivalences are those pointers that will have the same points-to
+// sets, and location equivalences are those variables that always appear
+// together in points-to sets.
+//
// The inclusion constraint solving phase iteratively propagates the inclusion
// constraints until a fixed point is reached. This is an O(N^3) algorithm.
//
// Function constraints are handled as if they were structs with X fields.
// Thus, an access to argument X of function Y is an access to node index
// getNode(Y) + X. This representation allows handling of indirect calls
-// without any issues. To wit, an indirect call Y(a,b) is equivalence to
+// without any issues. To wit, an indirect call Y(a,b) is equivalent to
// *(Y + 1) = a, *(Y + 2) = b.
// The return node for a function is always located at getNode(F) +
// CallReturnPos. The arguments start at getNode(F) + CallArgPos.
//
// Future Improvements:
-// Offline variable substitution, offline detection of online
-// cycles. Use of BDD's.
+// Offline detection of online cycles. Use of BDD's.
//===----------------------------------------------------------------------===//
#define DEBUG_TYPE "anders-aa"
@@ -59,6 +65,7 @@
#include "llvm/Support/Debug.h"
#include "llvm/ADT/Statistic.h"
#include "llvm/ADT/SparseBitVector.h"
+#include "llvm/ADT/DenseMap.h"
#include
#include
#include
@@ -66,18 +73,42 @@
#include
using namespace llvm;
-STATISTIC(NumIters , "Number of iterations to reach convergence");
-STATISTIC(NumConstraints , "Number of constraints");
-STATISTIC(NumNodes , "Number of nodes");
-STATISTIC(NumUnified , "Number of variables unified");
+STATISTIC(NumIters , "Number of iterations to reach convergence");
+STATISTIC(NumConstraints, "Number of constraints");
+STATISTIC(NumNodes , "Number of nodes");
+STATISTIC(NumUnified , "Number of variables unified");
namespace {
const unsigned SelfRep = (unsigned)-1;
const unsigned Unvisited = (unsigned)-1;
// Position of the function return node relative to the function node.
- const unsigned CallReturnPos = 2;
+ const unsigned CallReturnPos = 1;
// Position of the function call node relative to the function node.
- const unsigned CallFirstArgPos = 3;
+ const unsigned CallFirstArgPos = 2;
+
+ struct BitmapKeyInfo {
+ static inline SparseBitVector<> *getEmptyKey() {
+ return reinterpret_cast<SparseBitVector<> *>(-1);
+ }
+ static inline SparseBitVector<> *getTombstoneKey() {
+ return reinterpret_cast<SparseBitVector<> *>(-2);
+ }
+ static unsigned getHashValue(const SparseBitVector<> *bitmap) {
+ return bitmap->getHashValue();
+ }
+ static bool isEqual(const SparseBitVector<> *LHS,
+ const SparseBitVector<> *RHS) {
+ if (LHS == RHS)
+ return true;
+ else if (LHS == getEmptyKey() || RHS == getEmptyKey()
+ || LHS == getTombstoneKey() || RHS == getTombstoneKey())
+ return false;
+
+ return *LHS == *RHS;
+ }
+
+ static bool isPod() { return true; }
+ };
class VISIBILITY_HIDDEN Andersens : public ModulePass, public AliasAnalysis,
private InstVisitor<Andersens> {
@@ -89,7 +120,7 @@
/// 'store' for statements like "*A = B", and AddressOf for statements like
/// A = alloca; The Offset is applied as *(A + K) = B for stores,
/// A = *(B + K) for loads, and A = B + K for copies. It is
- /// illegal on addressof constraints (Because it is statically
+ /// illegal on addressof constraints (because it is statically
/// resolvable to A = &C where C = B + K)
struct Constraint {
@@ -105,29 +136,53 @@
}
};
- // Node class - This class is used to represent a node
- // in the constraint graph. Due to various optimizations,
- // not always the case that there is a mapping from a Node to a
- // Value. In particular, we add artificial
- // Node's that represent the set of pointed-to variables
- // shared for each location equivalent Node.
+ // Node class - This class is used to represent a node in the constraint
+ // graph. Due to various optimizations, not always the case that there is a
+ // mapping from a Node to a Value. In particular, we add artificial Node's
+ // that represent the set of pointed-to variables shared for each location
+ // equivalent Node.
struct Node {
- Value *Val;
+ Value *Val;
SparseBitVector<> *Edges;
SparseBitVector<> *PointsTo;
SparseBitVector<> *OldPointsTo;
bool Changed;
std::list<Constraint> Constraints;
- // Nodes in cycles (or in equivalence classes) are united
- // together using a standard union-find representation with path
- // compression. NodeRep gives the index into GraphNodes
- // representative for this one.
- unsigned NodeRep; public:
-
- Node() : Val(0), Edges(0), PointsTo(0), OldPointsTo(0), Changed(false),
- NodeRep(SelfRep) {
- }
+ // Pointer and location equivalence labels
+ unsigned PointerEquivLabel;
+ unsigned LocationEquivLabel;
+ // Predecessor edges, both real and implicit
+ SparseBitVector<> *PredEdges;
+ SparseBitVector<> *ImplicitPredEdges;
+ // Set of nodes that point to us; only used for location equivalence.
+ SparseBitVector<> *PointedToBy;
+ // Number of incoming edges, used during variable substitution to free
+ // the points-to sets early
+ unsigned NumInEdges;
+ // True if our ponits-to set is in the Set2PEClass map
+ bool StoredInHash;
+ // True if our node has no indirect constraints (Complex or otherwise)
+ bool Direct;
+ // True if the node is address taken, *or* it is part of a group of nodes
+ // that must be kept together. This is set to true for functions and
+ // their arg nodes, which must be kept at the same position relative to
+ // their base function node.
+ // kept at the same position relative to their base function node.
+ bool AddressTaken;
+
+ // Nodes in cycles (or in equivalence classes) are united together using a
+ // standard union-find representation with path compression. NodeRep
+ // gives the index into GraphNodes for the representative Node.
+ unsigned NodeRep;
+ public:
+
+ Node(bool direct = true) :
+ Val(0), Edges(0), PointsTo(0), OldPointsTo(0), Changed(false),
+ PointerEquivLabel(0), LocationEquivLabel(0), PredEdges(0),
+ ImplicitPredEdges(0), PointedToBy(0), NumInEdges(0),
+ StoredInHash(false), Direct(direct), AddressTaken(false),
+ NodeRep(SelfRep) { }
Node *setValue(Value *V) {
assert(Val == 0 && "Value already set for this node!");
@@ -163,28 +218,28 @@
/// ValueNodes - This map indicates the Node that a particular Value* is
/// represented by. This contains entries for all pointers.
- std::map<Value*, unsigned> ValueNodes;
+ DenseMap<Value*, unsigned> ValueNodes;
/// ObjectNodes - This map contains entries for each memory object in the
/// program: globals, alloca's and mallocs.
- std::map<Value*, unsigned> ObjectNodes;
+ DenseMap<Value*, unsigned> ObjectNodes;
/// ReturnNodes - This map contains an entry for each function in the
/// program that returns a value.
- std::map<Function*, unsigned> ReturnNodes;
+ DenseMap<Function*, unsigned> ReturnNodes;
/// VarargNodes - This map contains the entry used to represent all pointers
/// passed through the varargs portion of a function call for a particular
/// function. An entry is not present in this map for functions that do not
/// take variable arguments.
- std::map<Function*, unsigned> VarargNodes;
+ DenseMap<Function*, unsigned> VarargNodes;
/// Constraints - This vector contains a list of all of the constraints
/// identified by the program.
std::vector<Constraint> Constraints;
- // Map from graph node to maximum K value that is allowed (For functions,
+ // Map from graph node to maximum K value that is allowed (for functions,
// this is equivalent to the number of arguments + CallFirstArgPos)
std::map<unsigned, unsigned> MaxK;
@@ -193,9 +248,10 @@
enum {
UniversalSet = 0,
NullPtr = 1,
- NullObject = 2
+ NullObject = 2,
+ NumberSpecialNodes
};
- // Stack for Tarjans
+ // Stack for Tarjan's
std::stack<unsigned> SCCStack;
// Topological Index -> Graph node
std::vector<unsigned> Topo2Node;
@@ -209,6 +265,34 @@
unsigned DFSNumber;
unsigned RPONumber;
+ // Offline variable substitution related things
+
+ // Temporary rep storage, used because we can't collapse SCC's in the
+ // predecessor graph by uniting the variables permanently, we can only do so
+ // for the successor graph.
+ std::vector<unsigned> VSSCCRep;
+ // Mapping from node to whether we have visited it during SCC finding yet.
+ std::vector<bool> Node2Visited;
+ // During variable substitution, we create unknowns to represent the unknown
+ // value that is a dereference of a variable. These nodes are known as
+ // "ref" nodes (since they represent the value of dereferences).
+ unsigned FirstRefNode;
+ // During HVN, we create represent address taken nodes as if they were
+ // unknown (since HVN, unlike HU, does not evaluate unions).
+ unsigned FirstAdrNode;
+ // Current pointer equivalence class number
+ unsigned PEClass;
+ // Mapping from points-to sets to equivalence classes
+ typedef DenseMap<SparseBitVector<> *, unsigned, BitmapKeyInfo> BitVectorMap;
+ BitVectorMap Set2PEClass;
+ // Mapping from pointer equivalences to the representative node. -1 if we
+ // have no representative node for this pointer equivalence class yet.
+ std::vector<int> PEClass2Node;
+ // Mapping from pointer equivalences to representative node. This includes
+ // pointer equivalent but not location equivalent variables. -1 if we have
+ // no representative node for this pointer equivalence class yet.
+ std::vector<int> PENLEClass2Node;
+
public:
static char ID;
Andersens() : ModulePass((intptr_t)&ID) {}
@@ -217,7 +301,11 @@
InitializeAliasAnalysis(this);
IdentifyObjects(M);
CollectConstraints(M);
+#undef DEBUG_TYPE
+#define DEBUG_TYPE "anders-aa-constraints"
DEBUG(PrintConstraints());
+#undef DEBUG_TYPE
+#define DEBUG_TYPE "anders-aa"
SolveConstraints();
DEBUG(PrintPointsToGraph());
@@ -275,7 +363,7 @@
if (!isa<GlobalValue>(C))
return getNodeForConstantPointer(C);
- std::map<Value*, unsigned>::iterator I = ValueNodes.find(V);
+ DenseMap<Value*, unsigned>::iterator I = ValueNodes.find(V);
if (I == ValueNodes.end()) {
#ifndef NDEBUG
V->dump();
@@ -288,7 +376,7 @@
/// getObject - Return the node corresponding to the memory object for the
/// specified global or allocation instruction.
unsigned getObject(Value *V) {
- std::map<Value*, unsigned>::iterator I = ObjectNodes.find(V);
+ DenseMap<Value*, unsigned>::iterator I = ObjectNodes.find(V);
assert(I != ObjectNodes.end() &&
"Value does not have an object in the points-to graph!");
return I->second;
@@ -297,7 +385,7 @@
/// getReturnNode - Return the node representing the return value for the
/// specified function.
unsigned getReturnNode(Function *F) {
- std::map<Function*, unsigned>::iterator I = ReturnNodes.find(F);
+ DenseMap<Function*, unsigned>::iterator I = ReturnNodes.find(F);
assert(I != ReturnNodes.end() && "Function does not return a value!");
return I->second;
}
@@ -305,7 +393,7 @@
/// getVarargNode - Return the node representing the variable arguments
/// formal for the specified function.
unsigned getVarargNode(Function *F) {
- std::map<Function*, unsigned>::iterator I = VarargNodes.find(F);
+ DenseMap<Function*, unsigned>::iterator I = VarargNodes.find(F);
assert(I != VarargNodes.end() && "Function does not take var args!");
return I->second;
}
@@ -325,9 +413,18 @@
void CollectConstraints(Module &M);
bool AnalyzeUsesOfFunction(Value *);
void CreateConstraintGraph();
+ void OptimizeConstraints();
+ unsigned FindEquivalentNode(unsigned, unsigned);
+ void ClumpAddressTaken();
+ void RewriteConstraints();
+ void HU();
+ void HVN();
+ void UnitePointerEquivalences();
void SolveConstraints();
void QueryNode(unsigned Node);
-
+ void Condense(unsigned Node);
+ void HUValNum(unsigned Node);
+ void HVNValNum(unsigned Node);
unsigned getNodeForConstantPointer(Constant *C);
unsigned getNodeForConstantPointerTarget(Constant *C);
void AddGlobalInitializerConstraints(unsigned, Constant *C);
@@ -339,6 +436,8 @@
void PrintNode(Node *N);
void PrintConstraints();
+ void PrintConstraint(const Constraint &);
+ void PrintLabels();
void PrintPointsToGraph();
//===------------------------------------------------------------------===//
@@ -506,7 +605,6 @@
// The function itself is a memory object.
unsigned First = NumObjects;
ValueNodes[F] = NumObjects++;
- ObjectNodes[F] = NumObjects++;
if (isa<PointerType>(F->getFunctionType()->getReturnType()))
ReturnNodes[F] = NumObjects++;
if (F->getFunctionType()->isVarArg())
@@ -516,10 +614,11 @@
// Add nodes for all of the incoming pointer arguments.
for (Function::arg_iterator I = F->arg_begin(), E = F->arg_end();
I != E; ++I)
- if (isa<PointerType>(I->getType()))
- ValueNodes[I] = NumObjects++;
+ {
+ if (isa<PointerType>(I->getType()))
+ ValueNodes[I] = NumObjects++;
+ }
MaxK[First] = NumObjects - First;
- MaxK[First + 1] = NumObjects - First - 1;
// Scan the function body, creating a memory object for each heap/stack
// allocation in the body of the function and a node to represent all
@@ -796,11 +895,6 @@
}
for (Module::iterator F = M.begin(), E = M.end(); F != E; ++F) {
- // Make the function address point to the function object.
- unsigned ObjectIndex = getObject(F);
- GraphNodes[ObjectIndex].setValue(F);
- Constraints.push_back(Constraint(Constraint::AddressOf, getNodeValue(*F),
- ObjectIndex));
// Set up the return value node.
if (isa<PointerType>(F->getFunctionType()->getReturnType()))
GraphNodes[getReturnNode(F)].setValue(F);
@@ -1091,8 +1185,736 @@
return Result;
}
-// Create the constraint graph used for solving points-to analysis.
-//
+void dumpToDOUT(SparseBitVector<> *bitmap) {
+ dump(*bitmap, DOUT);
+}
+
+
+/// Clump together address taken variables so that the points-to sets use up
+/// less space and can be operated on faster.
+
+void Andersens::ClumpAddressTaken() {
+#undef DEBUG_TYPE
+#define DEBUG_TYPE "anders-aa-renumber"
+ std::vector<unsigned> Translate;
+ std::vector<Node> NewGraphNodes;
+
+ Translate.resize(GraphNodes.size());
+ unsigned NewPos = 0;
+
+ for (unsigned i = 0; i < Constraints.size(); ++i) {
+ Constraint &C = Constraints[i];
+ if (C.Type == Constraint::AddressOf) {
+ GraphNodes[C.Src].AddressTaken = true;
+ }
+ }
+ for (unsigned i = 0; i < NumberSpecialNodes; ++i) {
+ unsigned Pos = NewPos++;
+ Translate[i] = Pos;
+ NewGraphNodes.push_back(GraphNodes[i]);
+ DOUT << "Renumbering node " << i << " to node " << Pos << "\n";
+ }
+
+ // I believe this ends up being faster than making two vectors and splicing
+ // them.
+ for (unsigned i = NumberSpecialNodes; i < GraphNodes.size(); ++i) {
+ if (GraphNodes[i].AddressTaken) {
+ unsigned Pos = NewPos++;
+ Translate[i] = Pos;
+ NewGraphNodes.push_back(GraphNodes[i]);
+ DOUT << "Renumbering node " << i << " to node " << Pos << "\n";
+ }
+ }
+
+ for (unsigned i = NumberSpecialNodes; i < GraphNodes.size(); ++i) {
+ if (!GraphNodes[i].AddressTaken) {
+ unsigned Pos = NewPos++;
+ Translate[i] = Pos;
+ NewGraphNodes.push_back(GraphNodes[i]);
+ DOUT << "Renumbering node " << i << " to node " << Pos << "\n";
+ }
+ }
+
+ for (DenseMap<Value*, unsigned>::iterator Iter = ValueNodes.begin();
+ Iter != ValueNodes.end();
+ ++Iter)
+ Iter->second = Translate[Iter->second];
+
+ for (DenseMap<Value*, unsigned>::iterator Iter = ObjectNodes.begin();
+ Iter != ObjectNodes.end();
+ ++Iter)
+ Iter->second = Translate[Iter->second];
+
+ for (DenseMap<Function*, unsigned>::iterator Iter = ReturnNodes.begin();
+ Iter != ReturnNodes.end();
+ ++Iter)
+ Iter->second = Translate[Iter->second];
+
+ for (DenseMap<Function*, unsigned>::iterator Iter = VarargNodes.begin();
+ Iter != VarargNodes.end();
+ ++Iter)
+ Iter->second = Translate[Iter->second];
+
+ for (unsigned i = 0; i < Constraints.size(); ++i) {
+ Constraint &C = Constraints[i];
+ C.Src = Translate[C.Src];
+ C.Dest = Translate[C.Dest];
+ }
+
+ GraphNodes.swap(NewGraphNodes);
+#undef DEBUG_TYPE
+#define DEBUG_TYPE "anders-aa"
+}
+
+/// The technique used here is described in "Exploiting Pointer and Location
+/// Equivalence to Optimize Pointer Analysis. In the 14th International Static
+/// Analysis Symposium (SAS), August 2007." It is known as the "HVN" algorithm,
+/// and is equivalent to value numbering the collapsed constraint graph without
+/// evaluating unions. This is used as a pre-pass to HU in order to resolve
+/// first order pointer dereferences and speed up/reduce memory usage of HU.
+/// Running both is equivalent to HRU without the iteration
+/// HVN in more detail:
+/// Imagine the set of constraints was simply straight line code with no loops
+/// (we eliminate cycles, so there are no loops), such as:
+/// E = &D
+/// E = &C
+/// E = F
+/// F = G
+/// G = F
+/// Applying value numbering to this code tells us:
+/// G == F == E
+///
+/// For HVN, this is as far as it goes. We assign new value numbers to every
+/// "address node", and every "reference node".
+/// To get the optimal result for this, we use a DFS + SCC (since all nodes in a
+/// cycle must have the same value number since the = operation is really
+/// inclusion, not overwrite), and value number nodes we receive points-to sets
+/// before we value our own node.
+/// The advantage of HU over HVN is that HU considers the inclusion property, so
+/// that if you have
+/// E = &D
+/// E = &C
+/// E = F
+/// F = G
+/// F = &D
+/// G = F
+/// HU will determine that G == F == E. HVN will not, because it cannot prove
+/// that the points to information ends up being the same because they all
+/// receive &D from E anyway.
+
+void Andersens::HVN() {
+ DOUT << "Beginning HVN\n";
+ // Build a predecessor graph. This is like our constraint graph with the
+ // edges going in the opposite direction, and there are edges for all the
+ // constraints, instead of just copy constraints. We also build implicit
+ // edges for constraints are implied but not explicit. I.E for the constraint
+ // a = &b, we add implicit edges *a = b. This helps us capture more cycles
+ for (unsigned i = 0, e = Constraints.size(); i != e; ++i) {
+ Constraint &C = Constraints[i];
+ if (C.Type == Constraint::AddressOf) {
+ GraphNodes[C.Src].AddressTaken = true;
+ GraphNodes[C.Src].Direct = false;
+
+ // Dest = &src edge
+ unsigned AdrNode = C.Src + FirstAdrNode;
+ if (!GraphNodes[C.Dest].PredEdges)
+ GraphNodes[C.Dest].PredEdges = new SparseBitVector<>;
+ GraphNodes[C.Dest].PredEdges->set(AdrNode);
+
+ // *Dest = src edge
+ unsigned RefNode = C.Dest + FirstRefNode;
+ if (!GraphNodes[RefNode].ImplicitPredEdges)
+ GraphNodes[RefNode].ImplicitPredEdges = new SparseBitVector<>;
+ GraphNodes[RefNode].ImplicitPredEdges->set(C.Src);
+ } else if (C.Type == Constraint::Load) {
+ if (C.Offset == 0) {
+ // dest = *src edge
+ if (!GraphNodes[C.Dest].PredEdges)
+ GraphNodes[C.Dest].PredEdges = new SparseBitVector<>;
+ GraphNodes[C.Dest].PredEdges->set(C.Src + FirstRefNode);
+ } else {
+ GraphNodes[C.Dest].Direct = false;
+ }
+ } else if (C.Type == Constraint::Store) {
+ if (C.Offset == 0) {
+ // *dest = src edge
+ unsigned RefNode = C.Dest + FirstRefNode;
+ if (!GraphNodes[RefNode].PredEdges)
+ GraphNodes[RefNode].PredEdges = new SparseBitVector<>;
+ GraphNodes[RefNode].PredEdges->set(C.Src);
+ }
+ } else {
+ // Dest = Src edge and *Dest = *Src edge
+ if (!GraphNodes[C.Dest].PredEdges)
+ GraphNodes[C.Dest].PredEdges = new SparseBitVector<>;
+ GraphNodes[C.Dest].PredEdges->set(C.Src);
+ unsigned RefNode = C.Dest + FirstRefNode;
+ if (!GraphNodes[RefNode].ImplicitPredEdges)
+ GraphNodes[RefNode].ImplicitPredEdges = new SparseBitVector<>;
+ GraphNodes[RefNode].ImplicitPredEdges->set(C.Src + FirstRefNode);
+ }
+ }
+ PEClass = 1;
+ // Do SCC finding first to condense our predecessor graph
+ DFSNumber = 0;
+ Node2DFS.insert(Node2DFS.begin(), GraphNodes.size(), 0);
+ Node2Deleted.insert(Node2Deleted.begin(), GraphNodes.size(), false);
+ Node2Visited.insert(Node2Visited.begin(), GraphNodes.size(), false);
+
+ for (unsigned i = 0; i < FirstRefNode; ++i) {
+ unsigned Node = VSSCCRep[i];
+ if (!Node2Visited[Node])
+ HVNValNum(Node);
+ }
+ for (BitVectorMap::iterator Iter = Set2PEClass.begin();
+ Iter != Set2PEClass.end();
+ ++Iter)
+ delete Iter->first;
+ Set2PEClass.clear();
+ Node2DFS.clear();
+ Node2Deleted.clear();
+ Node2Visited.clear();
+ DOUT << "Finished HVN\n";
+
+}
+
+/// This is the workhorse of HVN value numbering. We combine SCC finding at the
+/// same time because it's easy.
+void Andersens::HVNValNum(unsigned NodeIndex) {
+ unsigned MyDFS = DFSNumber++;
+ Node *N = &GraphNodes[NodeIndex];
+ Node2Visited[NodeIndex] = true;
+ Node2DFS[NodeIndex] = MyDFS;
+
+ // First process all our explicit edges
+ if (N->PredEdges)
+ for (SparseBitVector<>::iterator Iter = N->PredEdges->begin();
+ Iter != N->PredEdges->end();
+ ++Iter) {
+ unsigned j = VSSCCRep[*Iter];
+ if (!Node2Deleted[j]) {
+ if (!Node2Visited[j])
+ HVNValNum(j);
+ if (Node2DFS[NodeIndex] > Node2DFS[j])
+ Node2DFS[NodeIndex] = Node2DFS[j];
+ }
+ }
+
+ // Now process all the implicit edges
+ if (N->ImplicitPredEdges)
+ for (SparseBitVector<>::iterator Iter = N->ImplicitPredEdges->begin();
+ Iter != N->ImplicitPredEdges->end();
+ ++Iter) {
+ unsigned j = VSSCCRep[*Iter];
+ if (!Node2Deleted[j]) {
+ if (!Node2Visited[j])
+ HVNValNum(j);
+ if (Node2DFS[NodeIndex] > Node2DFS[j])
+ Node2DFS[NodeIndex] = Node2DFS[j];
+ }
+ }
+
+ // See if we found any cycles
+ if (MyDFS == Node2DFS[NodeIndex]) {
+ while (!SCCStack.empty() && Node2DFS[SCCStack.top()] >= MyDFS) {
+ unsigned CycleNodeIndex = SCCStack.top();
+ Node *CycleNode = &GraphNodes[CycleNodeIndex];
+ VSSCCRep[CycleNodeIndex] = NodeIndex;
+ // Unify the nodes
+ N->Direct &= CycleNode->Direct;
+
+ if (CycleNode->PredEdges) {
+ if (!N->PredEdges)
+ N->PredEdges = new SparseBitVector<>;
+ *(N->PredEdges) |= CycleNode->PredEdges;
+ delete CycleNode->PredEdges;
+ CycleNode->PredEdges = NULL;
+ }
+ if (CycleNode->ImplicitPredEdges) {
+ if (!N->ImplicitPredEdges)
+ N->ImplicitPredEdges = new SparseBitVector<>;
+ *(N->ImplicitPredEdges) |= CycleNode->ImplicitPredEdges;
+ delete CycleNode->ImplicitPredEdges;
+ CycleNode->ImplicitPredEdges = NULL;
+ }
+
+ SCCStack.pop();
+ }
+
+ Node2Deleted[NodeIndex] = true;
+
+ if (!N->Direct) {
+ GraphNodes[NodeIndex].PointerEquivLabel = PEClass++;
+ return;
+ }
+
+ // Collect labels of successor nodes
+ bool AllSame = true;
+ unsigned First = ~0;
+ SparseBitVector<> *Labels = new SparseBitVector<>;
+ bool Used = false;
+
+ if (N->PredEdges)
+ for (SparseBitVector<>::iterator Iter = N->PredEdges->begin();
+ Iter != N->PredEdges->end();
+ ++Iter) {
+ unsigned j = VSSCCRep[*Iter];
+ unsigned Label = GraphNodes[j].PointerEquivLabel;
+ // Ignore labels that are equal to us or non-pointers
+ if (j == NodeIndex || Label == 0)
+ continue;
+ if (First == (unsigned)~0)
+ First = Label;
+ else if (First != Label)
+ AllSame = false;
+ Labels->set(Label);
+ }
+
+ // We either have a non-pointer, a copy of an existing node, or a new node.
+ // Assign the appropriate pointer equivalence label.
+ if (Labels->empty()) {
+ GraphNodes[NodeIndex].PointerEquivLabel = 0;
+ } else if (AllSame) {
+ GraphNodes[NodeIndex].PointerEquivLabel = First;
+ } else {
+ GraphNodes[NodeIndex].PointerEquivLabel = Set2PEClass[Labels];
+ if (GraphNodes[NodeIndex].PointerEquivLabel == 0) {
+ unsigned EquivClass = PEClass++;
+ Set2PEClass[Labels] = EquivClass;
+ GraphNodes[NodeIndex].PointerEquivLabel = EquivClass;
+ Used = true;
+ }
+ }
+ if (!Used)
+ delete Labels;
+ } else {
+ SCCStack.push(NodeIndex);
+ }
+}
+
+/// The technique used here is described in "Exploiting Pointer and Location
+/// Equivalence to Optimize Pointer Analysis. In the 14th International Static
+/// Analysis Symposium (SAS), August 2007." It is known as the "HU" algorithm,
+/// and is equivalent to value numbering the collapsed constraint graph
+/// including evaluating unions.
+void Andersens::HU() {
+ DOUT << "Beginning HU\n";
+ // Build a predecessor graph. This is like our constraint graph with the
+ // edges going in the opposite direction, and there are edges for all the
+ // constraints, instead of just copy constraints. We also build implicit
+ // edges for constraints are implied but not explicit. I.E for the constraint
+ // a = &b, we add implicit edges *a = b. This helps us capture more cycles
+ for (unsigned i = 0, e = Constraints.size(); i != e; ++i) {
+ Constraint &C = Constraints[i];
+ if (C.Type == Constraint::AddressOf) {
+ GraphNodes[C.Src].AddressTaken = true;
+ GraphNodes[C.Src].Direct = false;
+
+ GraphNodes[C.Dest].PointsTo->set(C.Src);
+ // *Dest = src edge
+ unsigned RefNode = C.Dest + FirstRefNode;
+ if (!GraphNodes[RefNode].ImplicitPredEdges)
+ GraphNodes[RefNode].ImplicitPredEdges = new SparseBitVector<>;
+ GraphNodes[RefNode].ImplicitPredEdges->set(C.Src);
+ GraphNodes[C.Src].PointedToBy->set(C.Dest);
+ } else if (C.Type == Constraint::Load) {
+ if (C.Offset == 0) {
+ // dest = *src edge
+ if (!GraphNodes[C.Dest].PredEdges)
+ GraphNodes[C.Dest].PredEdges = new SparseBitVector<>;
+ GraphNodes[C.Dest].PredEdges->set(C.Src + FirstRefNode);
+ } else {
+ GraphNodes[C.Dest].Direct = false;
+ }
+ } else if (C.Type == Constraint::Store) {
+ if (C.Offset == 0) {
+ // *dest = src edge
+ unsigned RefNode = C.Dest + FirstRefNode;
+ if (!GraphNodes[RefNode].PredEdges)
+ GraphNodes[RefNode].PredEdges = new SparseBitVector<>;
+ GraphNodes[RefNode].PredEdges->set(C.Src);
+ }
+ } else {
+ // Dest = Src edge and *Dest = *Src edge
+ if (!GraphNodes[C.Dest].PredEdges)
+ GraphNodes[C.Dest].PredEdges = new SparseBitVector<>;
+ GraphNodes[C.Dest].PredEdges->set(C.Src);
+ unsigned RefNode = C.Dest + FirstRefNode;
+ if (!GraphNodes[RefNode].ImplicitPredEdges)
+ GraphNodes[RefNode].ImplicitPredEdges = new SparseBitVector<>;
+ GraphNodes[RefNode].ImplicitPredEdges->set(C.Src + FirstRefNode);
+ }
+ }
+ PEClass = 1;
+ // Do SCC finding first to condense our predecessor graph
+ DFSNumber = 0;
+ Node2DFS.insert(Node2DFS.begin(), GraphNodes.size(), 0);
+ Node2Deleted.insert(Node2Deleted.begin(), GraphNodes.size(), false);
+ Node2Visited.insert(Node2Visited.begin(), GraphNodes.size(), false);
+
+ for (unsigned i = 0; i < FirstRefNode; ++i) {
+ if (FindNode(i) == i) {
+ unsigned Node = VSSCCRep[i];
+ if (!Node2Visited[Node])
+ Condense(Node);
+ }
+ }
+
+ // Reset tables for actual labeling
+ Node2DFS.clear();
+ Node2Visited.clear();
+ Node2Deleted.clear();
+ // Pre-grow our densemap so that we don't get really bad behavior
+ Set2PEClass.resize(GraphNodes.size());
+
+ // Visit the condensed graph and generate pointer equivalence labels.
+ Node2Visited.insert(Node2Visited.begin(), GraphNodes.size(), false);
+ for (unsigned i = 0; i < FirstRefNode; ++i) {
+ if (FindNode(i) == i) {
+ unsigned Node = VSSCCRep[i];
+ if (!Node2Visited[Node])
+ HUValNum(Node);
+ }
+ }
+ // PEClass nodes will be deleted by the deleting of N->PointsTo in our caller.
+ Set2PEClass.clear();
+ DOUT << "Finished HU\n";
+}
+
+
+/// Implementation of the standard Tarjan SCC algorithm as modified by Nuutila.
+void Andersens::Condense(unsigned NodeIndex) {
+ unsigned MyDFS = DFSNumber++;
+ Node *N = &GraphNodes[NodeIndex];
+ Node2Visited[NodeIndex] = true;
+ Node2DFS[NodeIndex] = MyDFS;
+
+ // First process all our explicit edges
+ if (N->PredEdges)
+ for (SparseBitVector<>::iterator Iter = N->PredEdges->begin();
+ Iter != N->PredEdges->end();
+ ++Iter) {
+ unsigned j = VSSCCRep[*Iter];
+ if (!Node2Deleted[j]) {
+ if (!Node2Visited[j])
+ Condense(j);
+ if (Node2DFS[NodeIndex] > Node2DFS[j])
+ Node2DFS[NodeIndex] = Node2DFS[j];
+ }
+ }
+
+ // Now process all the implicit edges
+ if (N->ImplicitPredEdges)
+ for (SparseBitVector<>::iterator Iter = N->ImplicitPredEdges->begin();
+ Iter != N->ImplicitPredEdges->end();
+ ++Iter) {
+ unsigned j = VSSCCRep[*Iter];
+ if (!Node2Deleted[j]) {
+ if (!Node2Visited[j])
+ Condense(j);
+ if (Node2DFS[NodeIndex] > Node2DFS[j])
+ Node2DFS[NodeIndex] = Node2DFS[j];
+ }
+ }
+
+ // See if we found any cycles
+ if (MyDFS == Node2DFS[NodeIndex]) {
+ while (!SCCStack.empty() && Node2DFS[SCCStack.top()] >= MyDFS) {
+ unsigned CycleNodeIndex = SCCStack.top();
+ Node *CycleNode = &GraphNodes[CycleNodeIndex];
+ VSSCCRep[CycleNodeIndex] = NodeIndex;
+ // Unify the nodes
+ N->Direct &= CycleNode->Direct;
+
+ *(N->PointsTo) |= CycleNode->PointsTo;
+ delete CycleNode->PointsTo;
+ CycleNode->PointsTo = NULL;
+ if (CycleNode->PredEdges) {
+ if (!N->PredEdges)
+ N->PredEdges = new SparseBitVector<>;
+ *(N->PredEdges) |= CycleNode->PredEdges;
+ delete CycleNode->PredEdges;
+ CycleNode->PredEdges = NULL;
+ }
+ if (CycleNode->ImplicitPredEdges) {
+ if (!N->ImplicitPredEdges)
+ N->ImplicitPredEdges = new SparseBitVector<>;
+ *(N->ImplicitPredEdges) |= CycleNode->ImplicitPredEdges;
+ delete CycleNode->ImplicitPredEdges;
+ CycleNode->ImplicitPredEdges = NULL;
+ }
+ SCCStack.pop();
+ }
+
+ Node2Deleted[NodeIndex] = true;
+
+ // Set up number of incoming edges for other nodes
+ if (N->PredEdges)
+ for (SparseBitVector<>::iterator Iter = N->PredEdges->begin();
+ Iter != N->PredEdges->end();
+ ++Iter)
+ ++GraphNodes[VSSCCRep[*Iter]].NumInEdges;
+ } else {
+ SCCStack.push(NodeIndex);
+ }
+}
+
+void Andersens::HUValNum(unsigned NodeIndex) {
+ Node *N = &GraphNodes[NodeIndex];
+ Node2Visited[NodeIndex] = true;
+
+ // Eliminate dereferences of non-pointers for those non-pointers we have
+ // already identified. These are ref nodes whose non-ref node:
+ // 1. Has already been visited and determined to point to nothing (and thus, a
+ // dereference of it must point to nothing)
+ // 2. Any direct node with no predecessor edges in our graph and with no
+ // points-to set (since it can't point to anything either, being that it
+ // receives no points-to sets and has none).
+ if (NodeIndex >= FirstRefNode) {
+ unsigned j = VSSCCRep[FindNode(NodeIndex - FirstRefNode)];
+ if ((Node2Visited[j] && !GraphNodes[j].PointerEquivLabel)
+ || (GraphNodes[j].Direct && !GraphNodes[j].PredEdges
+ && GraphNodes[j].PointsTo->empty())){
+ return;
+ }
+ }
+ // Process all our explicit edges
+ if (N->PredEdges)
+ for (SparseBitVector<>::iterator Iter = N->PredEdges->begin();
+ Iter != N->PredEdges->end();
+ ++Iter) {
+ unsigned j = VSSCCRep[*Iter];
+ if (!Node2Visited[j])
+ HUValNum(j);
+
+ // If this edge turned out to be the same as us, or got no pointer
+ // equivalence label (and thus points to nothing), just decrement our
+ // incoming edges and continue.
+ if (j == NodeIndex || GraphNodes[j].PointerEquivLabel == 0) {
+ --GraphNodes[j].NumInEdges;
+ continue;
+ }
+
+ *(N->PointsTo) |= GraphNodes[j].PointsTo;
+
+ // If we didn't end up storing this in the hash, and we're done with all
+ // the edges, we don't need the points-to set anymore.
+ --GraphNodes[j].NumInEdges;
+ if (!GraphNodes[j].NumInEdges && !GraphNodes[j].StoredInHash) {
+ delete GraphNodes[j].PointsTo;
+ GraphNodes[j].PointsTo = NULL;
+ }
+ }
+ // If this isn't a direct node, generate a fresh variable.
+ if (!N->Direct) {
+ N->PointsTo->set(FirstRefNode + NodeIndex);
+ }
+
+ // See If we have something equivalent to us, if not, generate a new
+ // equivalence class.
+ if (N->PointsTo->empty()) {
+ delete N->PointsTo;
+ N->PointsTo = NULL;
+ } else {
+ if (N->Direct) {
+ N->PointerEquivLabel = Set2PEClass[N->PointsTo];
+ if (N->PointerEquivLabel == 0) {
+ unsigned EquivClass = PEClass++;
+ N->StoredInHash = true;
+ Set2PEClass[N->PointsTo] = EquivClass;
+ N->PointerEquivLabel = EquivClass;
+ }
+ } else {
+ N->PointerEquivLabel = PEClass++;
+ }
+ }
+}
+
+/// Rewrite our list of constraints so that pointer equivalent nodes are
+/// replaced by their pointer equivalence class representative.
+void Andersens::RewriteConstraints() {
+ std::vector<Constraint> NewConstraints;
+
+ PEClass2Node.clear();
+ PENLEClass2Node.clear();
+
+ // We may have from 1 to GraphNodes + 1 equivalence classes.
+ PEClass2Node.insert(PEClass2Node.begin(), GraphNodes.size() + 1, -1);
+ PENLEClass2Node.insert(PENLEClass2Node.begin(), GraphNodes.size() + 1, -1);
+
+ // Rewrite constraints, ignoring non-pointer constraints, uniting equivalent
+ // nodes, and rewriting constraints to use the representative nodes.
+ for (unsigned i = 0, e = Constraints.size(); i != e; ++i) {
+ Constraint &C = Constraints[i];
+ unsigned RHSNode = FindNode(C.Src);
+ unsigned LHSNode = FindNode(C.Dest);
+ unsigned RHSLabel = GraphNodes[VSSCCRep[RHSNode]].PointerEquivLabel;
+ unsigned LHSLabel = GraphNodes[VSSCCRep[LHSNode]].PointerEquivLabel;
+
+ // First we try to eliminate constraints for things we can prove don't point
+ // to anything.
+ if (LHSLabel == 0) {
+ DEBUG(PrintNode(&GraphNodes[LHSNode]));
+ DOUT << " is a non-pointer, ignoring constraint.\n";
+ continue;
+ }
+ if (RHSLabel == 0) {
+ DEBUG(PrintNode(&GraphNodes[RHSNode]));
+ DOUT << " is a non-pointer, ignoring constraint.\n";
+ continue;
+ }
+ // This constraint may be useless, and it may become useless as we translate
+ // it.
+ if (C.Src == C.Dest && C.Type == Constraint::Copy)
+ continue;
+
+ C.Src = FindEquivalentNode(RHSNode, RHSLabel);
+ C.Dest = FindEquivalentNode(FindNode(LHSNode), LHSLabel);
+ if (C.Src == C.Dest && C.Type == Constraint::Copy)
+ continue;
+
+ NewConstraints.push_back(C);
+ }
+ Constraints.swap(NewConstraints);
+ PEClass2Node.clear();
+}
+
+/// See if we have a node that is pointer equivalent to the one being asked
+/// about, and if so, unite them and return the equivalent node. Otherwise,
+/// return the original node.
+unsigned Andersens::FindEquivalentNode(unsigned NodeIndex,
+ unsigned NodeLabel) {
+ if (!GraphNodes[NodeIndex].AddressTaken) {
+ if (PEClass2Node[NodeLabel] != -1) {
+ // We found an existing node with the same pointer label, so unify them.
+ return UniteNodes(PEClass2Node[NodeLabel], NodeIndex);
+ } else {
+ PEClass2Node[NodeLabel] = NodeIndex;
+ PENLEClass2Node[NodeLabel] = NodeIndex;
+ }
+ } else if (PENLEClass2Node[NodeLabel] == -1) {
+ PENLEClass2Node[NodeLabel] = NodeIndex;
+ }
+
+ return NodeIndex;
+}
+
+void Andersens::PrintLabels() {
+ for (unsigned i = 0; i < GraphNodes.size(); ++i) {
+ if (i < FirstRefNode) {
+ PrintNode(&GraphNodes[i]);
+ } else if (i < FirstAdrNode) {
+ DOUT << "REF(";
+ PrintNode(&GraphNodes[i-FirstRefNode]);
+ DOUT <<")";
+ } else {
+ DOUT << "ADR(";
+ PrintNode(&GraphNodes[i-FirstAdrNode]);
+ DOUT <<")";
+ }
+
+ DOUT << " has pointer label " << GraphNodes[i].PointerEquivLabel
+ << " and SCC rep " << VSSCCRep[i]
+ << " and is " << (GraphNodes[i].Direct ? "Direct" : "Not direct")
+ << "\n";
+ }
+}
+
+/// Optimize the constraints by performing offline variable substitution and
+/// other optimizations.
+void Andersens::OptimizeConstraints() {
+ DOUT << "Beginning constraint optimization\n";
+
+ // Function related nodes need to stay in the same relative position and can't
+ // be location equivalent.
+ for (std::map<unsigned, unsigned>::iterator Iter = MaxK.begin();
+ Iter != MaxK.end();
+ ++Iter) {
+ for (unsigned i = Iter->first;
+ i != Iter->first + Iter->second;
+ ++i) {
+ GraphNodes[i].AddressTaken = true;
+ GraphNodes[i].Direct = false;
+ }
+ }
+
+ ClumpAddressTaken();
+ FirstRefNode = GraphNodes.size();
+ FirstAdrNode = FirstRefNode + GraphNodes.size();
+ GraphNodes.insert(GraphNodes.end(), 2 * GraphNodes.size(),
+ Node(false));
+ VSSCCRep.resize(GraphNodes.size());
+ for (unsigned i = 0; i < GraphNodes.size(); ++i) {
+ VSSCCRep[i] = i;
+ }
+ HVN();
+ for (unsigned i = 0; i < GraphNodes.size(); ++i) {
+ Node *N = &GraphNodes[i];
+ delete N->PredEdges;
+ N->PredEdges = NULL;
+ delete N->ImplicitPredEdges;
+ N->ImplicitPredEdges = NULL;
+ }
+#undef DEBUG_TYPE
+#define DEBUG_TYPE "anders-aa-labels"
+ DEBUG(PrintLabels());
+#undef DEBUG_TYPE
+#define DEBUG_TYPE "anders-aa"
+ RewriteConstraints();
+ // Delete the adr nodes.
+ GraphNodes.resize(FirstRefNode * 2);
+
+ // Now perform HU
+ for (unsigned i = 0; i < GraphNodes.size(); ++i) {
+ Node *N = &GraphNodes[i];
+ if (FindNode(i) == i) {
+ N->PointsTo = new SparseBitVector<>;
+ N->PointedToBy = new SparseBitVector<>;
+ // Reset our labels
+ }
+ VSSCCRep[i] = i;
+ N->PointerEquivLabel = 0;
+ }
+ HU();
+#undef DEBUG_TYPE
+#define DEBUG_TYPE "anders-aa-labels"
+ DEBUG(PrintLabels());
+#undef DEBUG_TYPE
+#define DEBUG_TYPE "anders-aa"
+ RewriteConstraints();
+ for (unsigned i = 0; i < GraphNodes.size(); ++i) {
+ if (FindNode(i) == i) {
+ Node *N = &GraphNodes[i];
+ delete N->PointsTo;
+ delete N->PredEdges;
+ delete N->ImplicitPredEdges;
+ delete N->PointedToBy;
+ }
+ }
+ GraphNodes.erase(GraphNodes.begin() + FirstRefNode, GraphNodes.end());
+ DOUT << "Finished constraint optimization\n";
+ FirstRefNode = 0;
+ FirstAdrNode = 0;
+}
+
+/// Unite pointer but not location equivalent variables, now that the constraint
+/// graph is built.
+void Andersens::UnitePointerEquivalences() {
+ DOUT << "Uniting remaining pointer equivalences\n";
+ for (unsigned i = 0; i < GraphNodes.size(); ++i) {
+ if (GraphNodes[i].AddressTaken && GraphNodes[i].NodeRep == SelfRep) {
+ unsigned Label = GraphNodes[i].PointerEquivLabel;
+
+ if (Label && PENLEClass2Node[Label] != -1)
+ UniteNodes(i, PENLEClass2Node[Label]);
+ }
+ }
+ DOUT << "Finished remaining pointer equivalences\n";
+ PENLEClass2Node.clear();
+}
+
+/// Create the constraint graph used for solving points-to analysis.
+///
void Andersens::CreateConstraintGraph() {
for (unsigned i = 0, e = Constraints.size(); i != e; ++i) {
Constraint &C = Constraints[i];
@@ -1181,8 +2003,13 @@
bool Changed = true;
unsigned Iteration = 0;
- // We create the bitmaps here to avoid getting jerked around by the compiler
- // creating objects behind our back and wasting lots of memory.
+ OptimizeConstraints();
+#undef DEBUG_TYPE
+#define DEBUG_TYPE "anders-aa-constraints"
+ DEBUG(PrintConstraints());
+#undef DEBUG_TYPE
+#define DEBUG_TYPE "anders-aa"
+
for (unsigned i = 0; i < GraphNodes.size(); ++i) {
Node *N = &GraphNodes[i];
N->PointsTo = new SparseBitVector<>;
@@ -1190,9 +2017,12 @@
N->Edges = new SparseBitVector<>;
}
CreateConstraintGraph();
-
+ UnitePointerEquivalences();
+ assert(SCCStack.empty() && "SCC Stack should be empty by now!");
Topo2Node.insert(Topo2Node.begin(), GraphNodes.size(), Unvisited);
Node2Topo.insert(Node2Topo.begin(), GraphNodes.size(), Unvisited);
+ Node2DFS.clear();
+ Node2Deleted.clear();
Node2DFS.insert(Node2DFS.begin(), GraphNodes.size(), 0);
Node2Deleted.insert(Node2Deleted.begin(), GraphNodes.size(), false);
DFSNumber = 0;
@@ -1214,7 +2044,7 @@
do {
Changed = false;
++NumIters;
- DOUT << "Starting iteration #" << Iteration++;
+ DOUT << "Starting iteration #" << Iteration++ << "\n";
// TODO: In the microoptimization category, we could just make Topo2Node
// a fast map and thus only contain the visited nodes.
for (unsigned i = 0; i < GraphNodes.size(); ++i) {
@@ -1248,7 +2078,7 @@
// TODO: We could delete redundant constraints here.
// Src and Dest will be the vars we are going to process.
// This may look a bit ugly, but what it does is allow us to process
- // both store and load constraints with the same function.
+ // both store and load constraints with the same code.
// Load constraints say that every member of our RHS solution has K
// added to it, and that variable gets an edge to LHS. We also union
// RHS+K's solution into the LHS solution.
@@ -1282,11 +2112,10 @@
CurrMember = *bi;
// Need to increment the member by K since that is where we are
- // supposed to copy to/from
- // Node that in positive weight cycles, which occur in address taking
- // of fields, K can go past
- // MaxK[CurrMember] elements, even though that is all it could
- // point to.
+ // supposed to copy to/from. Note that in positive weight cycles,
+ // which occur in address taking of fields, K can go past
+ // MaxK[CurrMember] elements, even though that is all it could point
+ // to.
if (K > 0 && K > MaxK[CurrMember])
continue;
else
@@ -1393,12 +2222,17 @@
SecondNode->NodeRep = First;
FirstNode->Changed |= SecondNode->Changed;
- FirstNode->PointsTo |= *(SecondNode->PointsTo);
- FirstNode->Edges |= *(SecondNode->Edges);
- FirstNode->Constraints.splice(FirstNode->Constraints.begin(),
- SecondNode->Constraints);
- delete FirstNode->OldPointsTo;
- FirstNode->OldPointsTo = new SparseBitVector<>;
+ if (FirstNode->PointsTo && SecondNode->PointsTo)
+ FirstNode->PointsTo |= *(SecondNode->PointsTo);
+ if (FirstNode->Edges && SecondNode->Edges)
+ FirstNode->Edges |= *(SecondNode->Edges);
+ if (!FirstNode->Constraints.empty() && !SecondNode->Constraints.empty())
+ FirstNode->Constraints.splice(FirstNode->Constraints.begin(),
+ SecondNode->Constraints);
+ if (FirstNode->OldPointsTo) {
+ delete FirstNode->OldPointsTo;
+ FirstNode->OldPointsTo = new SparseBitVector<>;
+ }
// Destroy interesting parts of the merged-from node.
delete SecondNode->OldPointsTo;
@@ -1479,35 +2313,36 @@
if (N == &GraphNodes[getObject(V)])
cerr << "";
}
+void Andersens::PrintConstraint(const Constraint &C) {
+ if (C.Type == Constraint::Store) {
+ cerr << "*";
+ if (C.Offset != 0)
+ cerr << "(";
+ }
+ PrintNode(&GraphNodes[C.Dest]);
+ if (C.Type == Constraint::Store && C.Offset != 0)
+ cerr << " + " << C.Offset << ")";
+ cerr << " = ";
+ if (C.Type == Constraint::Load) {
+ cerr << "*";
+ if (C.Offset != 0)
+ cerr << "(";
+ }
+ else if (C.Type == Constraint::AddressOf)
+ cerr << "&";
+ PrintNode(&GraphNodes[C.Src]);
+ if (C.Offset != 0 && C.Type != Constraint::Store)
+ cerr << " + " << C.Offset;
+ if (C.Type == Constraint::Load && C.Offset != 0)
+ cerr << ")";
+ cerr << "\n";
+}
void Andersens::PrintConstraints() {
cerr << "Constraints:\n";
- for (unsigned i = 0, e = Constraints.size(); i != e; ++i) {
- const Constraint &C = Constraints[i];
- if (C.Type == Constraint::Store) {
- cerr << "*";
- if (C.Offset != 0)
- cerr << "(";
- }
- PrintNode(&GraphNodes[C.Dest]);
- if (C.Type == Constraint::Store && C.Offset != 0)
- cerr << " + " << C.Offset << ")";
- cerr << " = ";
- if (C.Type == Constraint::Load) {
- cerr << "*";
- if (C.Offset != 0)
- cerr << "(";
- }
- else if (C.Type == Constraint::AddressOf)
- cerr << "&";
- PrintNode(&GraphNodes[C.Src]);
- if (C.Offset != 0 && C.Type != Constraint::Store)
- cerr << " + " << C.Offset;
- if (C.Type == Constraint::Load && C.Offset != 0)
- cerr << ")";
- cerr << "\n";
- }
+ for (unsigned i = 0, e = Constraints.size(); i != e; ++i)
+ PrintConstraint(Constraints[i]);
}
void Andersens::PrintPointsToGraph() {
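To make the BitmapKeyInfo pattern above concrete, here is a small,
self-contained sketch of a DenseMap keyed by SparseBitVector pointers, with
custom empty/tombstone keys and by-value equality (illustrative names, not
part of the patch):

  #include "llvm/ADT/DenseMap.h"
  #include "llvm/ADT/SparseBitVector.h"
  using namespace llvm;

  struct SetKeyInfo {
    static inline SparseBitVector<> *getEmptyKey() {
      return reinterpret_cast<SparseBitVector<> *>(-1);
    }
    static inline SparseBitVector<> *getTombstoneKey() {
      return reinterpret_cast<SparseBitVector<> *>(-2);
    }
    static unsigned getHashValue(const SparseBitVector<> *B) {
      return (unsigned)B->getHashValue();   // hash the set contents
    }
    static bool isEqual(const SparseBitVector<> *LHS,
                        const SparseBitVector<> *RHS) {
      if (LHS == RHS)
        return true;
      if (LHS == getEmptyKey() || RHS == getEmptyKey() ||
          LHS == getTombstoneKey() || RHS == getTombstoneKey())
        return false;
      return *LHS == *RHS;                  // compare contents, not addresses
    }
    static bool isPod() { return true; }
  };

  void keyInfoExample() {
    DenseMap<SparseBitVector<> *, unsigned, SetKeyInfo> Class;
    SparseBitVector<> *S = new SparseBitVector<>();
    S->set(42);
    Class[S] = 1;   // sets with equal contents land in the same bucket
    // Keys are owned by the caller; the pass deletes them explicitly.
    Class.clear();
    delete S;
  }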
From dpatel at apple.com Mon Sep 24 13:02:42 2007
From: dpatel at apple.com (Devang Patel)
Date: Mon, 24 Sep 2007 20:02:42 -0000
Subject: [llvm-commits] [llvm] r42270 - in /llvm/trunk:
lib/Transforms/Scalar/LICM.cpp
test/Transforms/LICM/2007-09-24-PromoteNullValue.ll
Message-ID: <200709242002.l8OK2ggR030293@zion.cs.uiuc.edu>
Author: dpatel
Date: Mon Sep 24 15:02:42 2007
New Revision: 42270
URL: http://llvm.org/viewvc/llvm-project?rev=42270&view=rev
Log:
Do not promote null values because it may be unsafe to do so.
Added:
llvm/trunk/test/Transforms/LICM/2007-09-24-PromoteNullValue.ll
Modified:
llvm/trunk/lib/Transforms/Scalar/LICM.cpp
Modified: llvm/trunk/lib/Transforms/Scalar/LICM.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LICM.cpp?rev=42270&r1=42269&r2=42270&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Scalar/LICM.cpp (original)
+++ llvm/trunk/lib/Transforms/Scalar/LICM.cpp Mon Sep 24 15:02:42 2007
@@ -800,6 +800,10 @@
break;
}
+ // Do not promote null values because it may be unsafe to do so.
+ if (isa<ConstantPointerNull>(V))
+ PointerOk = false;
+
if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(V)) {
// If GEP base is NULL then the calculated address used by Store or
// Load instruction is invalid. Do not promote this value because
Added: llvm/trunk/test/Transforms/LICM/2007-09-24-PromoteNullValue.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LICM/2007-09-24-PromoteNullValue.ll?rev=42270&view=auto
==============================================================================
--- llvm/trunk/test/Transforms/LICM/2007-09-24-PromoteNullValue.ll (added)
+++ llvm/trunk/test/Transforms/LICM/2007-09-24-PromoteNullValue.ll Mon Sep 24 15:02:42 2007
@@ -0,0 +1,46 @@
+; Do not promote null value because it may be unsafe to do so.
+; RUN: llvm-as < %s | opt -licm | llvm-dis | not grep promoted
+
+define i32 @f(i32 %foo, i32 %bar, i32 %com) {
+entry:
+ %tmp2 = icmp eq i32 %foo, 0 ; <i1> [#uses=1]
+ br i1 %tmp2, label %cond_next, label %cond_true
+
+cond_true: ; preds = %entry
+ br label %return
+
+cond_next: ; preds = %entry
+ br label %bb
+
+bb: ; preds = %bb15, %cond_next
+ switch i32 %bar, label %bb15 [
+ i32 1, label %bb6
+ ]
+
+bb6: ; preds = %bb
+ %tmp8 = icmp eq i32 %com, 0 ; <i1> [#uses=1]
+ br i1 %tmp8, label %cond_next14, label %cond_true11
+
+cond_true11: ; preds = %bb6
+ br label %return
+
+cond_next14: ; preds = %bb6
+ store i8 0, i8* null
+ br label %bb15
+
+bb15: ; preds = %cond_next14, %bb
+ br label %bb
+
+return: ; preds = %cond_true11, %cond_true
+ %storemerge = phi i32 [ 0, %cond_true ], [ undef, %cond_true11 ] ; <i32> [#uses=1]
+ ret i32 %storemerge
+}
+
+define i32 @kdMain() {
+entry:
+ %tmp1 = call i32 @f( i32 0, i32 1, i32 1 ) ; <i32> [#uses=0]
+ call void @exit( i32 0 )
+ unreachable
+}
+
+declare void @exit(i32)
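For illustration, a rough C++ analogue of the test above (hypothetical code,
not part of the commit):

int f(int foo, int bar, int com) {
  if (foo != 0) return 0;
  for (;;) {                    // corresponds to %bb
    if (bar == 1) {             // the switch's single case
      if (com != 0) return 0;   // %cond_true11 (returns undef in the IR)
      *(char *)0 = 0;           // store through null, only on this path
    }
  }
}

Promoting the store out of the loop would execute it on paths that never
reach it in the original program, so LICM must refuse when the stored-to
pointer is a null constant.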
From baldrick at free.fr Mon Sep 24 13:08:42 2007
From: baldrick at free.fr (Duncan Sands)
Date: Mon, 24 Sep 2007 22:08:42 +0200
Subject: [llvm-commits] [llvm] r42269 - in /llvm/trunk:
include/llvm/ADT/DenseMap.h include/llvm/ADT/SparseBitVector.h
lib/Analysis/IPA/Andersens.cpp
In-Reply-To: <200709241945.l8OJjo1M029802@zion.cs.uiuc.edu>
References: <200709241945.l8OJjo1M029802@zion.cs.uiuc.edu>
Message-ID: <200709242208.46621.baldrick@free.fr>
Hi DannyB,
> @@ -287,6 +286,14 @@
> }
> BecameZero = allzero;
> }
> + // Get a hash value for this element;
> + uint64_t getHashValue() const {
shouldn't there be a blank line before the added lines?
> +// The Offline constraint graph optimization portion includes offline variable
The Offline -> The offline
> +// substitution algorithms intended to pointer and location equivalences.
Does not parse!
> + // graph. Due to various optimizations, not always the case that there is a
not always the case -> it is not always the case
> + // True if our ponits-to set is in the Set2PEClass map
ponits-to -> points-to
> + // True if our node has no indirect constraints (Complex or otherwise)
Complex -> complex
> + // their arg nodes, which must be kept at the same position relative to
> + // their base function node.
> + // kept at the same position relative to their base function node.
Looks like this last line shouldn't be there.
Ciao,
Duncan.
From bruno.cardoso at gmail.com Mon Sep 24 13:15:11 2007
From: bruno.cardoso at gmail.com (Bruno Cardoso Lopes)
Date: Mon, 24 Sep 2007 20:15:11 -0000
Subject: [llvm-commits] [llvm] r42271 - in /llvm/trunk/lib/Target/Mips:
MipsAsmPrinter.cpp MipsISelDAGToDAG.cpp MipsInstrInfo.td
Message-ID: <200709242015.l8OKFBRF030624@zion.cs.uiuc.edu>
Author: bruno
Date: Mon Sep 24 15:15:11 2007
New Revision: 42271
URL: http://llvm.org/viewvc/llvm-project?rev=42271&view=rev
Log:
Added "LoadEffective" pattern to handle stack locations.
Fixed some comments
Modified:
llvm/trunk/lib/Target/Mips/MipsAsmPrinter.cpp
llvm/trunk/lib/Target/Mips/MipsISelDAGToDAG.cpp
llvm/trunk/lib/Target/Mips/MipsInstrInfo.td
Modified: llvm/trunk/lib/Target/Mips/MipsAsmPrinter.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsAsmPrinter.cpp?rev=42271&r1=42270&r2=42271&view=diff
==============================================================================
--- llvm/trunk/lib/Target/Mips/MipsAsmPrinter.cpp (original)
+++ llvm/trunk/lib/Target/Mips/MipsAsmPrinter.cpp Mon Sep 24 15:15:11 2007
@@ -141,7 +141,7 @@
#endif
unsigned int Bitmask = getSavedRegsBitmask(false, MF);
- O << "\t.mask\t";
+ O << "\t.mask \t";
printHex32(Bitmask);
O << "," << Offset << "\n";
}
@@ -366,9 +366,16 @@
void MipsAsmPrinter::
printMemOperand(const MachineInstr *MI, int opNum, const char *Modifier)
{
- // lw/sw $reg, MemOperand
- // will turn into :
- // lw/sw $reg, imm($reg)
+ // When using stack locations for non-load/store instructions,
+ // print them the same way as normal 3-operand instructions.
+ if (Modifier && !strcmp(Modifier, "stackloc")) {
+ printOperand(MI, opNum+1);
+ O << ", ";
+ printOperand(MI, opNum);
+ return;
+ }
+
+ // Load/Store memory operands -- imm($reg)
printOperand(MI, opNum);
O << "(";
printOperand(MI, opNum+1);
Modified: llvm/trunk/lib/Target/Mips/MipsISelDAGToDAG.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsISelDAGToDAG.cpp?rev=42271&r1=42270&r2=42271&view=diff
==============================================================================
--- llvm/trunk/lib/Target/Mips/MipsISelDAGToDAG.cpp (original)
+++ llvm/trunk/lib/Target/Mips/MipsISelDAGToDAG.cpp Mon Sep 24 15:15:11 2007
@@ -196,8 +196,8 @@
}
///
- // Instruction Selection not handled by custom or by the
- // auto-generated tablegen selection should be handled here
+ // Instruction Selection not handled by the auto-generated
+ // tablegen selection should be handled here.
///
switch(Opcode) {
Modified: llvm/trunk/lib/Target/Mips/MipsInstrInfo.td
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsInstrInfo.td?rev=42271&r1=42270&r2=42271&view=diff
==============================================================================
--- llvm/trunk/lib/Target/Mips/MipsInstrInfo.td (original)
+++ llvm/trunk/lib/Target/Mips/MipsInstrInfo.td Mon Sep 24 15:15:11 2007
@@ -332,6 +332,12 @@
!strconcat(instr_asm, " $dst, $src"),
[], IIAlu>;
+class EffectiveAddress<string instr_asm> :
+ FI<0x09,
+ (outs CPURegs:$dst),
+ (ins mem:$addr),
+ instr_asm,
+ [(set CPURegs:$dst, addr:$addr)], IIAlu>;
//===----------------------------------------------------------------------===//
// Pseudo instructions
@@ -468,6 +474,12 @@
"jr $target", [(MipsRet CPURegs:$target)], IIBranch>;
}
+// FrameIndexes are legalized when they are operands of load/store
+// instructions. The same does not happen for stack address copies, so an
+// add op with a mem ComplexPattern is used so that the stack address copy
+// can be matched. It's similar to Sparc's LEA_ADDRi.
+def LEA_ADDiu : EffectiveAddress<"addiu $dst, ${addr:stackloc}">;
+
//===----------------------------------------------------------------------===//
// Arbitrary patterns that map to one or more instructions
//===----------------------------------------------------------------------===//
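In effect, a stack-address copy matched by LEA_ADDiu now prints in ordinary
three-operand form via the "stackloc" modifier, e.g. "addiu $2, $sp, 16"
(register and offset hypothetical), rather than in the load/store
memory-operand form "lw $2, 16($sp)".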
From djg at cray.com Mon Sep 24 13:36:14 2007
From: djg at cray.com (Dan Gohman)
Date: Mon, 24 Sep 2007 15:36:14 -0500
Subject: [llvm-commits] [llvm] r42270 - in /llvm/trunk:
lib/Transforms/Scalar/LICM.cpp
test/Transforms/LICM/2007-09-24-PromoteNullValue.ll
Message-ID: <20070924203614.GV21991@village.us.cray.com>
> Do not promote null values because it may be unsafe to do so.
Interesting. See our earlier discussion on LICM checking for
NULL constants :-}.
Dan
--
Dan Gohman, Cray Inc.
From djg at cray.com Mon Sep 24 13:58:13 2007
From: djg at cray.com (Dan Gohman)
Date: Mon, 24 Sep 2007 20:58:13 -0000
Subject: [llvm-commits] [llvm] r42272 - in /llvm/trunk:
include/llvm/CodeGen/AsmPrinter.h lib/CodeGen/AsmPrinter.cpp
Message-ID: <200709242058.l8OKwE82032030@zion.cs.uiuc.edu>
Author: djg
Date: Mon Sep 24 15:58:13 2007
New Revision: 42272
URL: http://llvm.org/viewvc/llvm-project?rev=42272&view=rev
Log:
Add a routine for emitting .file directives, for setting up
file numbers to use with .loc directives.
Modified:
llvm/trunk/include/llvm/CodeGen/AsmPrinter.h
llvm/trunk/lib/CodeGen/AsmPrinter.cpp
Modified: llvm/trunk/include/llvm/CodeGen/AsmPrinter.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/AsmPrinter.h?rev=42272&r1=42271&r2=42272&view=diff
==============================================================================
--- llvm/trunk/include/llvm/CodeGen/AsmPrinter.h (original)
+++ llvm/trunk/include/llvm/CodeGen/AsmPrinter.h Mon Sep 24 15:58:13 2007
@@ -248,6 +248,9 @@
/// @verbatim (Eg. '\t') @endverbatim
void EmitString(const std::string &String) const;
+ /// EmitFile - Emit a .file directive.
+ void EmitFile(unsigned Number, const std::string &Name) const;
+
//===------------------------------------------------------------------===//
/// EmitAlignment - Emit an alignment directive to the specified power of
Modified: llvm/trunk/lib/CodeGen/AsmPrinter.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter.cpp?rev=42272&r1=42271&r2=42272&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/AsmPrinter.cpp (original)
+++ llvm/trunk/lib/CodeGen/AsmPrinter.cpp Mon Sep 24 15:58:13 2007
@@ -618,6 +618,17 @@
}
+/// EmitFile - Emit a .file directive.
+void AsmPrinter::EmitFile(unsigned Number, const std::string &Name) const {
+ O << "\t.file\t" << Number << " \"";
+ for (unsigned i = 0, N = Name.size(); i < N; ++i) {
+ unsigned char C = Name[i];
+ printStringChar(O, C);
+ }
+ O << "\"";
+}
+
+
//===----------------------------------------------------------------------===//
// EmitAlignment - Emit an alignment directive to the specified power of
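As a usage sketch (not part of the commit itself), a call such as

  Asm->EmitFile(1, "dir/foo.c"); Asm->EOL();

would print a directive of the form

  .file   1 "dir/foo.c"

after which .loc directives can refer to the file by its number, e.g.
".loc 1 42" for line 42 of that file.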
From dalej at apple.com Mon Sep 24 14:06:09 2007
From: dalej at apple.com (Dale Johannesen)
Date: Mon, 24 Sep 2007 21:06:09 -0000
Subject: [llvm-commits] [llvm] r42273 -
/llvm/trunk/lib/VMCore/ConstantFold.cpp
Message-ID: <200709242106.l8OL69dw032227@zion.cs.uiuc.edu>
Author: johannes
Date: Mon Sep 24 16:06:09 2007
New Revision: 42273
URL: http://llvm.org/viewvc/llvm-project?rev=42273&view=rev
Log:
float->int conversion rounds toward 0. Duh.
Fixes PR1698.
Modified:
llvm/trunk/lib/VMCore/ConstantFold.cpp
Modified: llvm/trunk/lib/VMCore/ConstantFold.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/ConstantFold.cpp?rev=42273&r1=42272&r2=42273&view=diff
==============================================================================
--- llvm/trunk/lib/VMCore/ConstantFold.cpp (original)
+++ llvm/trunk/lib/VMCore/ConstantFold.cpp Mon Sep 24 16:06:09 2007
@@ -196,7 +196,7 @@
uint32_t DestBitWidth = cast<IntegerType>(DestTy)->getBitWidth();
APFloat::opStatus status = V.convertToInteger(x, DestBitWidth,
opc==Instruction::FPToSI,
- APFloat::rmNearestTiesToEven);
+ APFloat::rmTowardZero);
if (status!=APFloat::opOK && status!=APFloat::opInexact)
return 0; // give up
APInt Val(DestBitWidth, 2, x);
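A quick sanity check of the new mode (a standalone sketch, not from the
commit): rmTowardZero matches the truncating semantics C and C++ define for
float-to-integer casts, which is what the constant folder must reproduce.

#include <cassert>
int main() {
  assert((int) 2.9 ==  2);   // truncates toward zero; nearest would give 3
  assert((int)-2.9 == -2);   // likewise for negatives, not -3
  return 0;
}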
From djg at cray.com Mon Sep 24 14:09:53 2007
From: djg at cray.com (Dan Gohman)
Date: Mon, 24 Sep 2007 21:09:53 -0000
Subject: [llvm-commits] [llvm] r42274 - in /llvm/trunk:
include/llvm/Target/TargetAsmInfo.h lib/Target/TargetAsmInfo.cpp
Message-ID: <200709242109.l8OL9snq032326@zion.cs.uiuc.edu>
Author: djg
Date: Mon Sep 24 16:09:53 2007
New Revision: 42274
URL: http://llvm.org/viewvc/llvm-project?rev=42274&view=rev
Log:
Merge hasDotLoc and hasDotFile into hasDotLocAndDotFile since .loc and .file
aren't really usable without each other.
Modified:
llvm/trunk/include/llvm/Target/TargetAsmInfo.h
llvm/trunk/lib/Target/TargetAsmInfo.cpp
Modified: llvm/trunk/include/llvm/Target/TargetAsmInfo.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetAsmInfo.h?rev=42274&r1=42273&r2=42274&view=diff
==============================================================================
--- llvm/trunk/include/llvm/Target/TargetAsmInfo.h (original)
+++ llvm/trunk/include/llvm/Target/TargetAsmInfo.h Mon Sep 24 16:09:53 2007
@@ -286,14 +286,11 @@
///
bool HasLEB128; // Defaults to false.
- /// hasDotLoc - True if target asm supports .loc directives.
+ /// hasDotLocAndDotFile - True if target asm supports .loc and .file
+ /// directives for emitting debugging information.
///
- bool HasDotLoc; // Defaults to false.
+ bool HasDotLocAndDotFile; // Defaults to false.
- /// HasDotFile - True if target asm supports .file directives.
- ///
- bool HasDotFile; // Defaults to false.
-
/// SupportsDebugInformation - True if target supports emission of debugging
/// information.
bool SupportsDebugInformation;
@@ -568,11 +565,8 @@
bool hasLEB128() const {
return HasLEB128;
}
- bool hasDotLoc() const {
- return HasDotLoc;
- }
- bool hasDotFile() const {
- return HasDotFile;
+ bool hasDotLocAndDotFile() const {
+ return HasDotLocAndDotFile;
}
bool doesSupportDebugInformation() const {
return SupportsDebugInformation;
Modified: llvm/trunk/lib/Target/TargetAsmInfo.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/TargetAsmInfo.cpp?rev=42274&r1=42273&r2=42274&view=diff
==============================================================================
--- llvm/trunk/lib/Target/TargetAsmInfo.cpp (original)
+++ llvm/trunk/lib/Target/TargetAsmInfo.cpp Mon Sep 24 16:09:53 2007
@@ -81,8 +81,7 @@
AbsoluteDebugSectionOffsets(false),
AbsoluteEHSectionOffsets(false),
HasLEB128(false),
- HasDotLoc(false),
- HasDotFile(false),
+ HasDotLocAndDotFile(false),
SupportsDebugInformation(false),
SupportsExceptionHandling(false),
DwarfRequiresFrameSection(true),
From djg at cray.com Mon Sep 24 14:32:19 2007
From: djg at cray.com (Dan Gohman)
Date: Mon, 24 Sep 2007 21:32:19 -0000
Subject: [llvm-commits] [llvm] r42275 -
/llvm/trunk/lib/CodeGen/DwarfWriter.cpp
Message-ID: <200709242132.l8OLWK1u000420@zion.cs.uiuc.edu>
Author: djg
Date: Mon Sep 24 16:32:18 2007
New Revision: 42275
URL: http://llvm.org/viewvc/llvm-project?rev=42275&view=rev
Log:
Add support for emitting .file directives to set up file numbers for
use with .loc directives.
Modified:
llvm/trunk/lib/CodeGen/DwarfWriter.cpp
Modified: llvm/trunk/lib/CodeGen/DwarfWriter.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/DwarfWriter.cpp?rev=42275&r1=42274&r2=42275&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/DwarfWriter.cpp (original)
+++ llvm/trunk/lib/CodeGen/DwarfWriter.cpp Mon Sep 24 16:32:18 2007
@@ -28,6 +28,7 @@
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/DataTypes.h"
#include "llvm/Support/Mangler.h"
+#include "llvm/System/Path.h"
#include "llvm/Target/TargetAsmInfo.h"
#include "llvm/Target/MRegisterInfo.h"
#include "llvm/Target/TargetData.h"
@@ -1967,6 +1968,19 @@
if (didInitial) return;
didInitial = true;
+ // Print out .file directives to specify files for .loc directives.
+ if (TAI->hasDotLocAndDotFile()) {
+ const UniqueVector<SourceFileInfo> &SourceFiles = MMI->getSourceFiles();
+ const UniqueVector<std::string> &Directories = MMI->getDirectories();
+ for (unsigned i = 1, e = SourceFiles.size(); i <= e; ++i) {
+ sys::Path FullPath(Directories[SourceFiles[i].getDirectoryID()]);
+ bool AppendOk = FullPath.appendComponent(SourceFiles[i].getName());
+ assert(AppendOk && "Could not append filename to directory!");
+ Asm->EmitFile(i, FullPath.toString());
+ Asm->EOL();
+ }
+ }
+
// Dwarf sections base addresses.
if (TAI->doesDwarfRequireFrameSection()) {
Asm->SwitchToDataSection(TAI->getDwarfFrameSection());
From djg at cray.com Mon Sep 24 14:36:21 2007
From: djg at cray.com (Dan Gohman)
Date: Mon, 24 Sep 2007 21:36:21 -0000
Subject: [llvm-commits] [llvm] r42276 -
/llvm/trunk/lib/CodeGen/DwarfWriter.cpp
Message-ID: <200709242136.l8OLaMwe000514@zion.cs.uiuc.edu>
Author: djg
Date: Mon Sep 24 16:36:21 2007
New Revision: 42276
URL: http://llvm.org/viewvc/llvm-project?rev=42276&view=rev
Log:
Move the Asm->EOL() call for EmitDebugARanges outside the #ifdef for
consistency with the other currently empty sections.
Modified:
llvm/trunk/lib/CodeGen/DwarfWriter.cpp
Modified: llvm/trunk/lib/CodeGen/DwarfWriter.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/DwarfWriter.cpp?rev=42276&r1=42275&r2=42276&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/DwarfWriter.cpp (original)
+++ llvm/trunk/lib/CodeGen/DwarfWriter.cpp Mon Sep 24 16:36:21 2007
@@ -2525,9 +2525,9 @@
Asm->EmitInt32(0); Asm->EOL("EOM (1)");
Asm->EmitInt32(0); Asm->EOL("EOM (2)");
+ #endif
Asm->EOL();
- #endif
}
/// EmitDebugRanges - Emit visible names into a debug ranges section.
From djg at cray.com Mon Sep 24 14:43:52 2007
From: djg at cray.com (Dan Gohman)
Date: Mon, 24 Sep 2007 21:43:52 -0000
Subject: [llvm-commits] [llvm] r42277 -
/llvm/trunk/lib/CodeGen/DwarfWriter.cpp
Message-ID: <200709242143.l8OLhqCU000698@zion.cs.uiuc.edu>
Author: djg
Date: Mon Sep 24 16:43:52 2007
New Revision: 42277
URL: http://llvm.org/viewvc/llvm-project?rev=42277&view=rev
Log:
Don't emit .debug_line header data if there aren't any lines to put in it,
such as will happen when .loc directives are used.
Modified:
llvm/trunk/lib/CodeGen/DwarfWriter.cpp
Modified: llvm/trunk/lib/CodeGen/DwarfWriter.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/DwarfWriter.cpp?rev=42277&r1=42276&r2=42277&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/DwarfWriter.cpp (original)
+++ llvm/trunk/lib/CodeGen/DwarfWriter.cpp Mon Sep 24 16:43:52 2007
@@ -2196,6 +2196,11 @@
/// EmitDebugLines - Emit source line information.
///
void EmitDebugLines() {
+ // If there are no lines to emit (such as when we're using .loc directives
+ // to emit .debug_line information) don't emit a .debug_line header.
+ if (SectionSourceLines.empty())
+ return;
+
// Minimum line delta, thus ranging from -10..(255-10).
const int MinLineDelta = -(DW_LNS_fixed_advance_pc + 1);
// Maximum line delta, thus ranging from -10..(255-10).
From djg at cray.com Mon Sep 24 14:54:15 2007
From: djg at cray.com (Dan Gohman)
Date: Mon, 24 Sep 2007 21:54:15 -0000
Subject: [llvm-commits] [llvm] r42278 -
/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
Message-ID: <200709242154.l8OLsF56000956@zion.cs.uiuc.edu>
Author: djg
Date: Mon Sep 24 16:54:14 2007
New Revision: 42278
URL: http://llvm.org/viewvc/llvm-project?rev=42278&view=rev
Log:
Add support on x86 for having Legalize lower ISD::LOCATION to ISD::DEBUG_LOC
instead of ISD::LABEL with a manual .debug_line entry when the assembler
supports .file and .loc directives.
Modified:
llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=42278&r1=42277&r2=42278&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Mon Sep 24 16:54:14 2007
@@ -32,6 +32,7 @@
#include "llvm/CodeGen/SelectionDAG.h"
#include "llvm/CodeGen/SSARegMap.h"
#include "llvm/Support/MathExtras.h"
+#include "llvm/Target/TargetAsmInfo.h"
#include "llvm/Target/TargetOptions.h"
#include "llvm/ADT/StringExtras.h"
#include "llvm/ParameterAttributes.h"
@@ -229,9 +230,13 @@
setOperationAction(ISD::MEMSET , MVT::Other, Custom);
setOperationAction(ISD::MEMCPY , MVT::Other, Custom);
- // We don't have line number support yet.
+ // Use the default ISD::LOCATION expansion, and tell Legalize it's
+ // ok to use DEBUG_LOC if we have an assembler that supports it.
setOperationAction(ISD::LOCATION, MVT::Other, Expand);
- setOperationAction(ISD::DEBUG_LOC, MVT::Other, Expand);
+ if (TM.getTargetAsmInfo()->hasDotLocAndDotFile())
+ setOperationAction(ISD::DEBUG_LOC, MVT::Other, Legal);
+ else
+ setOperationAction(ISD::DEBUG_LOC, MVT::Other, Expand);
// FIXME - use subtarget debug flags
if (!Subtarget->isTargetDarwin() &&
!Subtarget->isTargetELF() &&
From dberlin at dberlin.org Mon Sep 24 15:20:43 2007
From: dberlin at dberlin.org (Daniel Berlin)
Date: Mon, 24 Sep 2007 18:20:43 -0400
Subject: [llvm-commits] [llvm] r42269 - in /llvm/trunk:
include/llvm/ADT/DenseMap.h include/llvm/ADT/SparseBitVector.h
lib/Analysis/IPA/Andersens.cpp
In-Reply-To: <200709242208.46621.baldrick@free.fr>
References: <200709241945.l8OJjo1M029802@zion.cs.uiuc.edu>
<200709242208.46621.baldrick@free.fr>
Message-ID: <4aca3dc20709241520g21478a4bn83802c5d35c03cc5@mail.gmail.com>
On 9/24/07, Duncan Sands wrote:
> Hi DannyB,
>
> > @@ -287,6 +286,14 @@
> > }
> > BecameZero = allzero;
> > }
> > + // Get a hash value for this element;
> > + uint64_t getHashValue() const {
>
> shouldn't there be a blank line before the added lines?
>
> > +// The Offline constraint graph optimization portion includes offline variable
>
> The Offline -> The offline
>
> > +// substitution algorithms intended to pointer and location equivalences.
>
> Does not parse!
>
> > + // graph. Due to various optimizations, not always the case that there is a
>
> not always the case -> it is not always the case
>
> > + // True if our ponits-to set is in the Set2PEClass map
>
> ponits-to -> points-to
>
> > + // True if our node has no indirect constraints (Complex or otherwise)
>
> Complex -> complex
>
> > + // their arg nodes, which must be kept at the same position relative to
> > + // their base function node.
> > + // kept at the same position relative to their base function node.
>
> Looks like this last line shouldn't be there.
All fixed
>
> Ciao,
>
> Duncan.
>
From dberlin at dberlin.org Mon Sep 24 15:20:45 2007
From: dberlin at dberlin.org (Daniel Berlin)
Date: Mon, 24 Sep 2007 22:20:45 -0000
Subject: [llvm-commits] [llvm] r42279 - in /llvm/trunk:
include/llvm/ADT/SparseBitVector.h lib/Analysis/IPA/Andersens.cpp
Message-ID: <200709242220.l8OMKjYP001578@zion.cs.uiuc.edu>
Author: dannyb
Date: Mon Sep 24 17:20:45 2007
New Revision: 42279
URL: http://llvm.org/viewvc/llvm-project?rev=42279&view=rev
Log:
Comment fixups
Modified:
llvm/trunk/include/llvm/ADT/SparseBitVector.h
llvm/trunk/lib/Analysis/IPA/Andersens.cpp
Modified: llvm/trunk/include/llvm/ADT/SparseBitVector.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/SparseBitVector.h?rev=42279&r1=42278&r2=42279&view=diff
==============================================================================
--- llvm/trunk/include/llvm/ADT/SparseBitVector.h (original)
+++ llvm/trunk/include/llvm/ADT/SparseBitVector.h Mon Sep 24 17:20:45 2007
@@ -286,6 +286,7 @@
}
BecameZero = allzero;
}
+
// Get a hash value for this element;
uint64_t getHashValue() const {
uint64_t HashVal = 0;
Modified: llvm/trunk/lib/Analysis/IPA/Andersens.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/IPA/Andersens.cpp?rev=42279&r1=42278&r2=42279&view=diff
==============================================================================
--- llvm/trunk/lib/Analysis/IPA/Andersens.cpp (original)
+++ llvm/trunk/lib/Analysis/IPA/Andersens.cpp Mon Sep 24 17:20:45 2007
@@ -30,11 +30,11 @@
// B can point to. Constraints can handle copies, loads, and stores, and
// address taking.
//
-// The Offline constraint graph optimization portion includes offline variable
-// substitution algorithms intended to pointer and location equivalences.
-// Pointer equivalences are those pointers that will have the same points-to
-// sets, and location equivalences are those variables that always appear
-// together in points-to sets.
+// The offline constraint graph optimization portion includes offline variable
+// substitution algorithms intended to compute pointer and location
+// equivalences. Pointer equivalences are those pointers that will have the
+// same points-to sets, and location equivalences are those variables that
+// always appear together in points-to sets.
//
// The inclusion constraint solving phase iteratively propagates the inclusion
// constraints until a fixed point is reached. This is an O(N^3) algorithm.
@@ -137,10 +137,10 @@
};
// Node class - This class is used to represent a node in the constraint
- // graph. Due to various optimizations, not always the case that there is a
- // mapping from a Node to a Value. In particular, we add artificial Node's
- // that represent the set of pointed-to variables shared for each location
- // equivalent Node.
+ // graph. Due to various optimizations, it is not always the case that
+ // there is a mapping from a Node to a Value. In particular, we add
+ // artificial Node's that represent the set of pointed-to variables shared
+ // for each location equivalent Node.
struct Node {
Value *Val;
SparseBitVector<> *Edges;
@@ -160,15 +160,14 @@
// Number of incoming edges, used during variable substitution to early
// free the points-to sets
unsigned NumInEdges;
- // True if our ponits-to set is in the Set2PEClass map
+ // True if our points-to set is in the Set2PEClass map
bool StoredInHash;
- // True if our node has no indirect constraints (Complex or otherwise)
+ // True if our node has no indirect constraints (complex or otherwise)
bool Direct;
// True if the node is address taken, *or* it is part of a group of nodes
// that must be kept together. This is set to true for functions and
// their arg nodes, which must be kept at the same position relative to
// their base function node.
- // kept at the same position relative to their base function node.
bool AddressTaken;
// Nodes in cycles (or in equivalence classes) are united together using a
From isanbard at gmail.com Mon Sep 24 15:43:49 2007
From: isanbard at gmail.com (Bill Wendling)
Date: Mon, 24 Sep 2007 22:43:49 -0000
Subject: [llvm-commits] [llvm] r42280 -
/llvm/trunk/lib/Analysis/IPA/Andersens.cpp
Message-ID: <200709242243.l8OMhnDu002217@zion.cs.uiuc.edu>
Author: void
Date: Mon Sep 24 17:43:48 2007
New Revision: 42280
URL: http://llvm.org/viewvc/llvm-project?rev=42280&view=rev
Log:
Don't execute dump unless NDEBUG isn't defined.
Modified:
llvm/trunk/lib/Analysis/IPA/Andersens.cpp
Modified: llvm/trunk/lib/Analysis/IPA/Andersens.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/IPA/Andersens.cpp?rev=42280&r1=42279&r2=42280&view=diff
==============================================================================
--- llvm/trunk/lib/Analysis/IPA/Andersens.cpp (original)
+++ llvm/trunk/lib/Analysis/IPA/Andersens.cpp Mon Sep 24 17:43:48 2007
@@ -1185,7 +1185,9 @@
}
void dumpToDOUT(SparseBitVector<> *bitmap) {
+#ifndef NDEBUG
dump(*bitmap, DOUT);
+#endif
}
From dalej at apple.com Mon Sep 24 15:52:39 2007
From: dalej at apple.com (Dale Johannesen)
Date: Mon, 24 Sep 2007 22:52:39 -0000
Subject: [llvm-commits] [llvm] r42281 -
/llvm/trunk/lib/Target/X86/X86ISelDAGToDAG.cpp
Message-ID: <200709242252.l8OMqdq3002464@zion.cs.uiuc.edu>
Author: johannes
Date: Mon Sep 24 17:52:39 2007
New Revision: 42281
URL: http://llvm.org/viewvc/llvm-project?rev=42281&view=rev
Log:
When mixing SSE and x87 codegen, it's possible to
have situations where an SSE instruction turns into
multiple blocks, with the live range of an x87
register crossing them. To do this correctly make
sure we examine all blocks when inserting
FP_REG_KILL. PR 1697. (This was exposed by my
fix for PR 1681, but the same thing could happen
mixing x87 long double with SSE.)
Modified:
llvm/trunk/lib/Target/X86/X86ISelDAGToDAG.cpp
Modified: llvm/trunk/lib/Target/X86/X86ISelDAGToDAG.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelDAGToDAG.cpp?rev=42281&r1=42280&r2=42281&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86ISelDAGToDAG.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86ISelDAGToDAG.cpp Mon Sep 24 17:52:39 2007
@@ -482,14 +482,17 @@
// block defines any FP values. If so, put an FP_REG_KILL instruction before
// the terminator of the block.
- // Note that FP stack instructions *are* used in SSE code for long double,
- // so we do need this check.
- bool ContainsFPCode = false;
+ // Note that FP stack instructions are used in all modes for long double,
+ // so we always need to do this check.
+ // Also note that it's possible for an FP stack register to be live across
+ // an instruction that produces multiple basic blocks (SSE CMOV) so we
+ // must check all the generated basic blocks.
// Scan all of the machine instructions in these MBBs, checking for FP
// stores. (RFP32 and RFP64 will not exist in SSE mode, but RFP80 might.)
MachineFunction::iterator MBBI = FirstMBB;
do {
+ bool ContainsFPCode = false;
for (MachineBasicBlock::iterator I = MBBI->begin(), E = MBBI->end();
!ContainsFPCode && I != E; ++I) {
if (I->getNumOperands() != 0 && I->getOperand(0).isRegister()) {
@@ -507,35 +510,34 @@
}
}
}
- } while (!ContainsFPCode && &*(MBBI++) != BB);
-
- // Check PHI nodes in successor blocks. These PHI's will be lowered to have
- // a copy of the input value in this block. In SSE mode, we only care about
- // 80-bit values.
- if (!ContainsFPCode) {
- // Final check, check LLVM BB's that are successors to the LLVM BB
- // corresponding to BB for FP PHI nodes.
- const BasicBlock *LLVMBB = BB->getBasicBlock();
- const PHINode *PN;
- for (succ_const_iterator SI = succ_begin(LLVMBB), E = succ_end(LLVMBB);
- !ContainsFPCode && SI != E; ++SI) {
- for (BasicBlock::const_iterator II = SI->begin();
- (PN = dyn_cast<PHINode>(II)); ++II) {
- if (PN->getType()==Type::X86_FP80Ty ||
- (!Subtarget->hasSSE2() && PN->getType()->isFloatingPoint())) {
- ContainsFPCode = true;
- break;
+ // Check PHI nodes in successor blocks. These PHI's will be lowered to have
+ // a copy of the input value in this block. In SSE mode, we only care about
+ // 80-bit values.
+ if (!ContainsFPCode) {
+ // Final check, check LLVM BB's that are successors to the LLVM BB
+ // corresponding to BB for FP PHI nodes.
+ const BasicBlock *LLVMBB = BB->getBasicBlock();
+ const PHINode *PN;
+ for (succ_const_iterator SI = succ_begin(LLVMBB), E = succ_end(LLVMBB);
+ !ContainsFPCode && SI != E; ++SI) {
+ for (BasicBlock::const_iterator II = SI->begin();
+ (PN = dyn_cast<PHINode>(II)); ++II) {
+ if (PN->getType()==Type::X86_FP80Ty ||
+ (!Subtarget->hasSSE1() && PN->getType()->isFloatingPoint()) ||
+ (!Subtarget->hasSSE2() && PN->getType()==Type::DoubleTy)) {
+ ContainsFPCode = true;
+ break;
+ }
}
}
}
- }
-
- // Finally, if we found any FP code, emit the FP_REG_KILL instruction.
- if (ContainsFPCode) {
- BuildMI(*BB, BB->getFirstTerminator(),
- TM.getInstrInfo()->get(X86::FP_REG_KILL));
- ++NumFPKill;
- }
+ // Finally, if we found any FP code, emit the FP_REG_KILL instruction.
+ if (ContainsFPCode) {
+ BuildMI(*MBBI, MBBI->getFirstTerminator(),
+ TM.getInstrInfo()->get(X86::FP_REG_KILL));
+ ++NumFPKill;
+ }
+ } while (&*(MBBI++) != BB);
}
/// MatchAddress - Add the specified node to the specified addressing mode,
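A source-level sketch of the failure mode (hypothetical example, not taken
from the PR):

long double g;                    // x87-only type, lives in RFP80
double pick(double a, double b, bool c) {
  double d = c ? a : b;           // an SSE select may be lowered into a
                                  // pseudo that expands to several blocks
  g += d;                         // x87 value live across those blocks
  return d;
}

Since the x87 stackifier relies on FP_REG_KILL being present in every block
where an FP stack register is live, scanning only the first generated block
missed the later ones.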
From evan.cheng at apple.com Mon Sep 24 18:49:00 2007
From: evan.cheng at apple.com (Evan Cheng)
Date: Tue, 25 Sep 2007 01:49:00 -0000
Subject: [llvm-commits] [llvm] r42282 - in /llvm/trunk:
lib/Target/TargetSelectionDAG.td utils/TableGen/DAGISelEmitter.cpp
Message-ID: <200709250149.l8P1n0ap008699@zion.cs.uiuc.edu>
Author: evancheng
Date: Mon Sep 24 20:48:59 2007
New Revision: 42282
URL: http://llvm.org/viewvc/llvm-project?rev=42282&view=rev
Log:
Rename keyword "modify" -> "implicit".
Modified:
llvm/trunk/lib/Target/TargetSelectionDAG.td
llvm/trunk/utils/TableGen/DAGISelEmitter.cpp
Modified: llvm/trunk/lib/Target/TargetSelectionDAG.td
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/TargetSelectionDAG.td?rev=42282&r1=42281&r2=42282&view=diff
==============================================================================
--- llvm/trunk/lib/Target/TargetSelectionDAG.td (original)
+++ llvm/trunk/lib/Target/TargetSelectionDAG.td Mon Sep 24 20:48:59 2007
@@ -197,7 +197,7 @@
}
def set;
-def modify;
+def implicit;
def parallel;
def node;
def srcvalue;
Modified: llvm/trunk/utils/TableGen/DAGISelEmitter.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/DAGISelEmitter.cpp?rev=42282&r1=42281&r2=42282&view=diff
==============================================================================
--- llvm/trunk/utils/TableGen/DAGISelEmitter.cpp (original)
+++ llvm/trunk/utils/TableGen/DAGISelEmitter.cpp Mon Sep 24 20:48:59 2007
@@ -691,7 +691,7 @@
MadeChange |= UpdateNodeType(MVT::isVoid, TP);
}
return MadeChange;
- } else if (getOperator()->getName() == "modify" ||
+ } else if (getOperator()->getName() == "implicit" ||
getOperator()->getName() == "parallel") {
bool MadeChange = false;
for (unsigned i = 0; i < getNumChildren(); ++i)
@@ -976,7 +976,7 @@
!Operator->isSubClassOf("SDNodeXForm") &&
!Operator->isSubClassOf("Intrinsic") &&
Operator->getName() != "set" &&
- Operator->getName() != "modify" &&
+ Operator->getName() != "implicit" &&
Operator->getName() != "parallel")
error("Unrecognized node '" + Operator->getName() + "'!");
@@ -1385,15 +1385,15 @@
if (!isUse && Pat->getTransformFn())
I->error("Cannot specify a transform function for a non-input value!");
return;
- } else if (Pat->getOperator()->getName() == "modify") {
+ } else if (Pat->getOperator()->getName() == "implicit") {
for (unsigned i = 0, e = Pat->getNumChildren(); i != e; ++i) {
TreePatternNode *Dest = Pat->getChild(i);
if (!Dest->isLeaf())
- I->error("modify value should be a register!");
+ I->error("implicitly defined value should be a register!");
DefInit *Val = dynamic_cast<DefInit*>(Dest->getLeafValue());
if (!Val || !Val->getDef()->isSubClassOf("Register"))
- I->error("modify value should be a register!");
+ I->error("implicitly defined value should be a register!");
InstImpResults.push_back(Val->getDef());
}
return;
@@ -2789,7 +2789,7 @@
CodeGenInstruction &II = CGT.getInstruction(Op->getName());
const DAGInstruction &Inst = ISE.getInstruction(Op);
TreePattern *InstPat = Inst.getPattern();
- // FIXME: Assume actual pattern comes before "modify".
+ // FIXME: Assume actual pattern comes before "implicit".
TreePatternNode *InstPatNode =
isRoot ? (InstPat ? InstPat->getTree(0) : Pattern)
: (InstPat ? InstPat->getTree(0) : NULL);
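In pattern terms, a physical-register side effect that was previously spelled
with the old keyword, e.g. (modify EFLAGS), would now be written
(implicit EFLAGS); the EFLAGS operand is a hypothetical illustration, since
the commit itself only renames the keyword.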
From evan.cheng at apple.com Mon Sep 24 18:50:04 2007
From: evan.cheng at apple.com (Evan Cheng)
Date: Tue, 25 Sep 2007 01:50:04 -0000
Subject: [llvm-commits] [llvm] r42283 - in /llvm/trunk:
include/llvm/Target/TargetOptions.h lib/Target/TargetMachine.cpp
Message-ID: <200709250150.l8P1o4KI008745@zion.cs.uiuc.edu>
Author: evancheng
Date: Mon Sep 24 20:50:04 2007
New Revision: 42283
URL: http://llvm.org/viewvc/llvm-project?rev=42283&view=rev
Log:
New temporary option -new-cc-modeling-scheme to test the new cc modeling scheme.
Modified:
llvm/trunk/include/llvm/Target/TargetOptions.h
llvm/trunk/lib/Target/TargetMachine.cpp
Modified: llvm/trunk/include/llvm/Target/TargetOptions.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetOptions.h?rev=42283&r1=42282&r2=42283&view=diff
==============================================================================
--- llvm/trunk/include/llvm/Target/TargetOptions.h (original)
+++ llvm/trunk/include/llvm/Target/TargetOptions.h Mon Sep 24 20:50:04 2007
@@ -73,6 +73,10 @@
/// ExceptionHandling - This flag indicates that exception information should
/// be emitted.
extern bool ExceptionHandling;
+
+ /// NewCCModeling - This temporary flag indicates whether to use the new
+ /// condition code modeling scheme.
+ extern bool NewCCModeling;
} // End llvm namespace
Modified: llvm/trunk/lib/Target/TargetMachine.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/TargetMachine.cpp?rev=42283&r1=42282&r2=42283&view=diff
==============================================================================
--- llvm/trunk/lib/Target/TargetMachine.cpp (original)
+++ llvm/trunk/lib/Target/TargetMachine.cpp Mon Sep 24 20:50:04 2007
@@ -31,6 +31,7 @@
bool UseSoftFloat;
bool NoZerosInBSS;
bool ExceptionHandling;
+ bool NewCCModeling;
Reloc::Model RelocationModel;
CodeModel::Model CMModel;
}
@@ -116,6 +117,11 @@
clEnumValN(CodeModel::Large, "large",
" Large code model"),
clEnumValEnd));
+ cl::opt<bool, true>
+ EnableNewCCModeling("new-cc-modeling-scheme",
+ cl::desc("New CC modeling scheme."),
+ cl::location(NewCCModeling),
+ cl::init(false));
}
//---------------------------------------------------------------------------
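The cl::location idiom used above, in miniature (a self-contained sketch
assuming only llvm/Support/CommandLine.h):

#include "llvm/Support/CommandLine.h"

bool NewCCModeling;                    // external storage for the flag

static llvm::cl::opt<bool, true>       // 'true' selects external storage
EnableNewCCModeling("new-cc-modeling-scheme",
                    llvm::cl::desc("New CC modeling scheme."),
                    llvm::cl::location(NewCCModeling),
                    llvm::cl::init(false));

Any tool that links this in then accepts -new-cc-modeling-scheme and flips
the global that TargetOptions.h declares.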
From evan.cheng at apple.com Mon Sep 24 18:54:36 2007
From: evan.cheng at apple.com (Evan Cheng)
Date: Tue, 25 Sep 2007 01:54:36 -0000
Subject: [llvm-commits] [llvm] r42284 - in /llvm/trunk:
include/llvm/CodeGen/ScheduleDAG.h lib/CodeGen/SelectionDAG/ScheduleDAG.cpp
lib/CodeGen/SelectionDAG/ScheduleDAGList.cpp
lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
lib/CodeGen/SelectionDAG/ScheduleDAGSimple.cpp
lib/CodeGen/SelectionDAG/SelectionDAGPrinter.cpp
Message-ID: <200709250154.l8P1sa0K008858@zion.cs.uiuc.edu>
Author: evancheng
Date: Mon Sep 24 20:54:36 2007
New Revision: 42284
URL: http://llvm.org/viewvc/llvm-project?rev=42284&view=rev
Log:
Added major new capabilities to scheduler (only BURR for now) to support physical register dependency. The BURR scheduler can now backtrack and duplicate instructions in order to prevent "expensive / impossible to copy" values (e.g. the status flag register EFLAGS on x86) from being clobbered.
Modified:
llvm/trunk/include/llvm/CodeGen/ScheduleDAG.h
llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAG.cpp
llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGList.cpp
llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGSimple.cpp
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGPrinter.cpp
Modified: llvm/trunk/include/llvm/CodeGen/ScheduleDAG.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/ScheduleDAG.h?rev=42284&r1=42283&r2=42284&view=diff
==============================================================================
--- llvm/trunk/include/llvm/CodeGen/ScheduleDAG.h (original)
+++ llvm/trunk/include/llvm/CodeGen/ScheduleDAG.h Mon Sep 24 20:54:36 2007
@@ -79,12 +79,13 @@
/// SDep - Scheduling dependency. It keeps track of dependent nodes,
/// cost of the dependency, etc.
struct SDep {
- SUnit *Dep; // Dependent - either a predecessor or a successor.
- bool isCtrl; // True iff it's a control dependency.
- unsigned PhyReg; // If non-zero, this dep is a phy register dependency.
- int Cost; // Cost of the dependency.
- SDep(SUnit *d, bool c, unsigned r, int t)
- : Dep(d), isCtrl(c), PhyReg(r), Cost(t) {}
+ SUnit *Dep; // Dependent - either a predecessor or a successor.
+ unsigned Reg; // If non-zero, this dep is a phy register dependency.
+ int Cost; // Cost of the dependency.
+ bool isCtrl : 1; // True iff it's a control dependency.
+ bool isSpecial : 1; // True iff it's a special ctrl dep added during sched.
+ SDep(SUnit *d, unsigned r, int t, bool c, bool s)
+ : Dep(d), Reg(r), Cost(t), isCtrl(c), isSpecial(s) {}
};
/// SUnit - Scheduling unit. It's a wrapper around either a single SDNode or
@@ -92,6 +93,8 @@
struct SUnit {
SDNode *Node; // Representative node.
SmallVector<SDNode*, 4> FlaggedNodes; // All nodes flagged to Node.
+ unsigned InstanceNo; // Instance#. One SDNode can map to
+ // multiple SUnits due to cloning.
// Preds/Succs - The SUnits before/after us in the graph. The boolean value
// is true if the edge is a token chain edge, false if it is a value edge.
@@ -103,6 +106,8 @@
typedef SmallVector<SDep, 4>::const_iterator const_pred_iterator;
typedef SmallVector<SDep, 4>::const_iterator const_succ_iterator;
+ unsigned NodeNum; // Entry # of node in the node vector.
+ unsigned short Latency; // Node latency.
short NumPreds; // # of preds.
short NumSuccs; // # of succs.
short NumPredsLeft; // # of preds not scheduled.
@@ -111,42 +116,94 @@
short NumChainSuccsLeft; // # of chain succs not scheduled.
bool isTwoAddress : 1; // Is a two-address instruction.
bool isCommutable : 1; // Is a commutable instruction.
+ bool hasImplicitDefs : 1; // Has implicit physical reg defs.
bool isPending : 1; // True once pending.
bool isAvailable : 1; // True once available.
bool isScheduled : 1; // True once scheduled.
- unsigned short Latency; // Node latency.
unsigned CycleBound; // Upper/lower cycle to be scheduled at.
unsigned Cycle; // Once scheduled, the cycle of the op.
unsigned Depth; // Node depth;
unsigned Height; // Node height;
- unsigned NodeNum; // Entry # of node in the node vector.
SUnit(SDNode *node, unsigned nodenum)
- : Node(node), NumPreds(0), NumSuccs(0), NumPredsLeft(0), NumSuccsLeft(0),
+ : Node(node), InstanceNo(0), NodeNum(nodenum), Latency(0),
+ NumPreds(0), NumSuccs(0), NumPredsLeft(0), NumSuccsLeft(0),
NumChainPredsLeft(0), NumChainSuccsLeft(0),
- isTwoAddress(false), isCommutable(false),
+ isTwoAddress(false), isCommutable(false), hasImplicitDefs(false),
isPending(false), isAvailable(false), isScheduled(false),
- Latency(0), CycleBound(0), Cycle(0), Depth(0), Height(0),
- NodeNum(nodenum) {}
-
+ CycleBound(0), Cycle(0), Depth(0), Height(0) {}
+
/// addPred - This adds the specified node as a pred of the current node if
/// not already. This returns true if this is a new pred.
- bool addPred(SUnit *N, bool isCtrl, unsigned PhyReg = 0, int Cost = 1) {
+ bool addPred(SUnit *N, bool isCtrl, bool isSpecial,
+ unsigned PhyReg = 0, int Cost = 1) {
for (unsigned i = 0, e = Preds.size(); i != e; ++i)
- if (Preds[i].Dep == N && Preds[i].isCtrl == isCtrl)
+ if (Preds[i].Dep == N &&
+ Preds[i].isCtrl == isCtrl && Preds[i].isSpecial == isSpecial)
return false;
- Preds.push_back(SDep(N, isCtrl, PhyReg, Cost));
+ Preds.push_back(SDep(N, PhyReg, Cost, isCtrl, isSpecial));
+ N->Succs.push_back(SDep(this, PhyReg, Cost, isCtrl, isSpecial));
+ if (isCtrl) {
+ if (!N->isScheduled)
+ ++NumChainPredsLeft;
+ if (!isScheduled)
+ ++N->NumChainSuccsLeft;
+ } else {
+ ++NumPreds;
+ ++N->NumSuccs;
+ if (!N->isScheduled)
+ ++NumPredsLeft;
+ if (!isScheduled)
+ ++N->NumSuccsLeft;
+ }
return true;
}
- /// addSucc - This adds the specified node as a succ of the current node if
- /// not already. This returns true if this is a new succ.
- bool addSucc(SUnit *N, bool isCtrl, unsigned PhyReg = 0, int Cost = 1) {
+ bool removePred(SUnit *N, bool isCtrl, bool isSpecial) {
+ for (SmallVector<SDep, 4>::iterator I = Preds.begin(), E = Preds.end();
+ I != E; ++I)
+ if (I->Dep == N && I->isCtrl == isCtrl && I->isSpecial == isSpecial) {
+ bool FoundSucc = false;
+ for (SmallVector<SDep, 4>::iterator II = N->Succs.begin(),
+ EE = N->Succs.end(); II != EE; ++II)
+ if (II->Dep == this &&
+ II->isCtrl == isCtrl && II->isSpecial == isSpecial) {
+ FoundSucc = true;
+ N->Succs.erase(II);
+ break;
+ }
+ assert(FoundSucc && "Mismatching preds / succs lists!");
+ Preds.erase(I);
+ if (isCtrl) {
+ if (!N->isScheduled)
+ --NumChainPredsLeft;
+ if (!isScheduled)
+ --NumChainSuccsLeft;
+ } else {
+ --NumPreds;
+ --N->NumSuccs;
+ if (!N->isScheduled)
+ --NumPredsLeft;
+ if (!isScheduled)
+ --N->NumSuccsLeft;
+ }
+ return true;
+ }
+ return false;
+ }
+
+ bool isPred(SUnit *N) {
+ for (unsigned i = 0, e = Preds.size(); i != e; ++i)
+ if (Preds[i].Dep == N)
+ return true;
+ return false;
+ }
+
+ bool isSucc(SUnit *N) {
for (unsigned i = 0, e = Succs.size(); i != e; ++i)
- if (Succs[i].Dep == N && Succs[i].isCtrl == isCtrl)
- return false;
- Succs.push_back(SDep(N, isCtrl, PhyReg, Cost));
- return true;
+ if (Succs[i].Dep == N)
+ return true;
+ return false;
}
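// Usage sketch (an illustration, not part of the patch): for units A and B
// where B defines a physical register that A reads,
//   A->addPred(B, /*isCtrl=*/false, /*isSpecial=*/false, PhysReg, Cost);
// records the edge on both A->Preds and B->Succs and updates the
// NumPreds/NumSuccs bookkeeping on each side, while a matching
//   A->removePred(B, false, false);
// undoes all of it -- which is what lets the backtracking scheduler
// tentatively clone nodes and retract edges.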
void dump(const SelectionDAG *G) const;
@@ -165,20 +222,27 @@
public:
virtual ~SchedulingPriorityQueue() {}
- virtual void initNodes(DenseMap<SDNode*, SUnit*> &SUMap,
+ virtual void initNodes(DenseMap<SDNode*, std::vector<SUnit*> > &SUMap,
std::vector<SUnit> &SUnits) = 0;
+ virtual void addNode(const SUnit *SU) = 0;
+ virtual void updateNode(const SUnit *SU) = 0;
virtual void releaseState() = 0;
-
+
+ virtual unsigned size() const = 0;
virtual bool empty() const = 0;
virtual void push(SUnit *U) = 0;
virtual void push_all(const std::vector<SUnit *> &Nodes) = 0;
virtual SUnit *pop() = 0;
+ virtual void remove(SUnit *SU) = 0;
+
/// ScheduledNode - As each node is scheduled, this method is invoked. This
/// allows the priority function to adjust the priority of node that have
/// already been emitted.
virtual void ScheduledNode(SUnit *Node) {}
+
+ virtual void UnscheduledNode(SUnit *Node) {}
};
class ScheduleDAG {
@@ -192,7 +256,8 @@
MachineConstantPool *ConstPool; // Target constant pool
std::vector<SUnit*> Sequence; // The schedule. Null SUnit*'s
// represent noop instructions.
- DenseMap<SDNode*, SUnit*> SUnitMap; // SDNode to SUnit mapping (n -> 1).
+ DenseMap<SDNode*, std::vector<SUnit*> > SUnitMap;
// SDNode to SUnit mapping (n -> n).
std::vector<SUnit> SUnits; // The scheduling units.
SmallSet<SDNode*, 16> CommuteSet; // Nodes that should be commuted.
@@ -232,6 +297,10 @@
return &SUnits.back();
}
+ /// Clone - Creates a clone of the specified SUnit. It does not copy the
+ /// predecessors / successors info nor the temporary scheduling states.
+ SUnit *Clone(SUnit *N);
+
/// BuildSchedUnits - Build SUnits from the selection dag that we are input.
/// This SUnit graph is similar to the SelectionDAG, but represents flagged
/// together nodes with a single SUnit.
@@ -256,7 +325,8 @@
/// VRBaseMap contains, for each already emitted node, the first virtual
/// register number for the results of the node.
///
- void EmitNode(SDNode *Node, DenseMap<SDOperand, unsigned> &VRBaseMap);
+ void EmitNode(SDNode *Node, unsigned InstNo,
+ DenseMap<SDOperand, unsigned> &VRBaseMap);
/// EmitNoop - Emit a noop instruction.
///
@@ -264,7 +334,8 @@
/// EmitCopyFromReg - Generate machine code for an CopyFromReg node or an
/// implicit physical register output.
- void EmitCopyFromReg(SDNode *Node, unsigned ResNo, unsigned SrcReg,
+ void EmitCopyFromReg(SDNode *Node, unsigned ResNo, unsigned InstNo,
+ unsigned SrcReg,
DenseMap<SDOperand, unsigned> &VRBaseMap);
void CreateVirtualRegisters(SDNode *Node, MachineInstr *MI,
Modified: llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAG.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAG.cpp?rev=42284&r1=42283&r2=42284&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAG.cpp (original)
+++ llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAG.cpp Mon Sep 24 20:54:36 2007
@@ -27,6 +27,65 @@
#include "llvm/Support/MathExtras.h"
using namespace llvm;
+
+/// getPhysicalRegisterRegClass - Returns the Register Class of a physical
+/// register.
+static const TargetRegisterClass *getPhysicalRegisterRegClass(
+ const MRegisterInfo *MRI,
+ MVT::ValueType VT,
+ unsigned reg) {
+ assert(MRegisterInfo::isPhysicalRegister(reg) &&
+ "reg must be a physical register");
+ // Pick the register class of the right type that contains this physreg.
+ for (MRegisterInfo::regclass_iterator I = MRI->regclass_begin(),
+ E = MRI->regclass_end(); I != E; ++I)
+ if ((*I)->hasType(VT) && (*I)->contains(reg))
+ return *I;
+ assert(false && "Couldn't find the register class");
+ return 0;
+}
+
+
+/// CheckForPhysRegDependency - Check if the dependency between def and use of
+/// a specified operand is a physical register dependency. If so, returns the
+/// register and the cost of copying the register.
+static void CheckForPhysRegDependency(SDNode *Def, SDNode *Use, unsigned Op,
+ const MRegisterInfo *MRI,
+ const TargetInstrInfo *TII,
+ unsigned &PhysReg, int &Cost) {
+ if (Op != 2 || Use->getOpcode() != ISD::CopyToReg)
+ return;
+
+ unsigned Reg = cast<RegisterSDNode>(Use->getOperand(1))->getReg();
+ if (MRegisterInfo::isVirtualRegister(Reg))
+ return;
+
+ unsigned ResNo = Use->getOperand(2).ResNo;
+ if (Def->isTargetOpcode()) {
+ const TargetInstrDescriptor &II = TII->get(Def->getTargetOpcode());
+ if (ResNo >= II.numDefs &&
+ II.ImplicitDefs[ResNo - II.numDefs] == Reg) {
+ PhysReg = Reg;
+ const TargetRegisterClass *RC =
+ getPhysicalRegisterRegClass(MRI, Def->getValueType(ResNo), Reg);
+ Cost = RC->getCopyCost();
+ }
+ }
+}
+
+SUnit *ScheduleDAG::Clone(SUnit *Old) {
+ SUnit *SU = NewSUnit(Old->Node);
+ for (unsigned i = 0, e = Old->FlaggedNodes.size(); i != e; ++i)
+ SU->FlaggedNodes.push_back(Old->FlaggedNodes[i]);
+ SU->InstanceNo = SUnitMap[Old->Node].size();
+ SU->Latency = Old->Latency;
+ SU->isTwoAddress = Old->isTwoAddress;
+ SU->isCommutable = Old->isCommutable;
+ SU->hasImplicitDefs = Old->hasImplicitDefs;
+ SUnitMap[Old->Node].push_back(SU);
+ return SU;
+}
+
/// BuildSchedUnits - Build SUnits from the selection dag that we are input.
/// This SUnit graph is similar to the SelectionDAG, but represents flagged
/// together nodes with a single SUnit.
@@ -44,7 +103,7 @@
continue;
// If this node has already been processed, stop now.
- if (SUnitMap[NI]) continue;
+ if (SUnitMap[NI].size()) continue;
SUnit *NodeSUnit = NewSUnit(NI);
@@ -59,7 +118,7 @@
do {
N = N->getOperand(N->getNumOperands()-1).Val;
NodeSUnit->FlaggedNodes.push_back(N);
- SUnitMap[N] = NodeSUnit;
+ SUnitMap[N].push_back(NodeSUnit);
} while (N->getNumOperands() &&
N->getOperand(N->getNumOperands()-1).getValueType()== MVT::Flag);
std::reverse(NodeSUnit->FlaggedNodes.begin(),
@@ -79,7 +138,7 @@
if (FlagVal.isOperand(*UI)) {
HasFlagUse = true;
NodeSUnit->FlaggedNodes.push_back(N);
- SUnitMap[N] = NodeSUnit;
+ SUnitMap[N].push_back(NodeSUnit);
N = *UI;
break;
}
@@ -89,7 +148,7 @@
// Now all flagged nodes are in FlaggedNodes and N is the bottom-most node.
// Update the SUnit
NodeSUnit->Node = N;
- SUnitMap[N] = NodeSUnit;
+ SUnitMap[N].push_back(NodeSUnit);
// Compute the latency for the node. We use the sum of the latencies for
// all nodes flagged together into this SUnit.
@@ -125,13 +184,16 @@
if (MainNode->isTargetOpcode()) {
unsigned Opc = MainNode->getTargetOpcode();
- for (unsigned i = 0, ee = TII->getNumOperands(Opc); i != ee; ++i) {
- if (TII->getOperandConstraint(Opc, i, TOI::TIED_TO) != -1) {
+ const TargetInstrDescriptor &TID = TII->get(Opc);
+ if (TID.ImplicitDefs)
+ SU->hasImplicitDefs = true;
+ for (unsigned i = 0; i != TID.numOperands; ++i) {
+ if (TID.getOperandConstraint(i, TOI::TIED_TO) != -1) {
SU->isTwoAddress = true;
break;
}
}
- if (TII->isCommutableInstr(Opc))
+ if (TID.Flags & M_COMMUTABLE)
SU->isCommutable = true;
}
@@ -141,34 +203,25 @@
for (unsigned n = 0, e = SU->FlaggedNodes.size(); n != e; ++n) {
SDNode *N = SU->FlaggedNodes[n];
+ if (N->isTargetOpcode() && TII->getImplicitDefs(N->getTargetOpcode()))
+ SU->hasImplicitDefs = true;
for (unsigned i = 0, e = N->getNumOperands(); i != e; ++i) {
SDNode *OpN = N->getOperand(i).Val;
if (isPassiveNode(OpN)) continue; // Not scheduled.
- SUnit *OpSU = SUnitMap[OpN];
+ SUnit *OpSU = SUnitMap[OpN].front();
assert(OpSU && "Node has no SUnit!");
if (OpSU == SU) continue; // In the same group.
MVT::ValueType OpVT = N->getOperand(i).getValueType();
assert(OpVT != MVT::Flag && "Flagged nodes should be in same sunit!");
bool isChain = OpVT == MVT::Other;
-
- if (SU->addPred(OpSU, isChain)) {
- if (!isChain) {
- SU->NumPreds++;
- SU->NumPredsLeft++;
- } else {
- SU->NumChainPredsLeft++;
- }
- }
- if (OpSU->addSucc(SU, isChain)) {
- if (!isChain) {
- OpSU->NumSuccs++;
- OpSU->NumSuccsLeft++;
- } else {
- OpSU->NumChainSuccsLeft++;
- }
- }
+
+ unsigned PhysReg = 0;
+ int Cost = 1;
+ // Determine if this is a physical register dependency.
+ CheckForPhysRegDependency(OpN, N, i, MRI, TII, PhysReg, Cost);
+ SU->addPred(OpSU, isChain, false, PhysReg, Cost);
}
}
@@ -200,7 +253,7 @@
void ScheduleDAG::CalculateHeights() {
std::vector<std::pair<SUnit*, unsigned> > WorkList;
- SUnit *Root = SUnitMap[DAG.getRoot().Val];
+ SUnit *Root = SUnitMap[DAG.getRoot().Val].front();
WorkList.push_back(std::make_pair(Root, 0U));
while (!WorkList.empty()) {
@@ -254,27 +307,14 @@
? TII->getPointerRegClass() : MRI->getRegClass(toi.RegClass);
}
-// Returns the Register Class of a physical register
-static const TargetRegisterClass *getPhysicalRegisterRegClass(
- const MRegisterInfo *MRI,
- MVT::ValueType VT,
- unsigned reg) {
- assert(MRegisterInfo::isPhysicalRegister(reg) &&
- "reg must be a physical register");
- // Pick the register class of the right type that contains this physreg.
- for (MRegisterInfo::regclass_iterator I = MRI->regclass_begin(),
- E = MRI->regclass_end(); I != E; ++I)
- if ((*I)->hasType(VT) && (*I)->contains(reg))
- return *I;
- assert(false && "Couldn't find the register class");
- return 0;
-}
-
-void ScheduleDAG::EmitCopyFromReg(SDNode *Node, unsigned ResNo, unsigned SrcReg,
+void ScheduleDAG::EmitCopyFromReg(SDNode *Node, unsigned ResNo,
+ unsigned InstanceNo, unsigned SrcReg,
DenseMap<SDOperand, unsigned> &VRBaseMap) {
unsigned VRBase = 0;
if (MRegisterInfo::isVirtualRegister(SrcReg)) {
// Just use the input register directly!
+ if (InstanceNo > 0)
+ VRBaseMap.erase(SDOperand(Node, ResNo));
bool isNew = VRBaseMap.insert(std::make_pair(SDOperand(Node,ResNo),SrcReg));
assert(isNew && "Node emitted out of order - early");
return;
@@ -282,32 +322,54 @@
// If the node is only used by a CopyToReg and the dest reg is a vreg, use
// the CopyToReg'd destination register instead of creating a new vreg.
+ bool MatchReg = true;
for (SDNode::use_iterator UI = Node->use_begin(), E = Node->use_end();
UI != E; ++UI) {
SDNode *Use = *UI;
+ bool Match = true;
if (Use->getOpcode() == ISD::CopyToReg &&
Use->getOperand(2).Val == Node &&
Use->getOperand(2).ResNo == ResNo) {
unsigned DestReg = cast<RegisterSDNode>(Use->getOperand(1))->getReg();
if (MRegisterInfo::isVirtualRegister(DestReg)) {
VRBase = DestReg;
- break;
+ Match = false;
+ } else if (DestReg != SrcReg)
+ Match = false;
+ } else {
+ for (unsigned i = 0, e = Use->getNumOperands(); i != e; ++i) {
+ SDOperand Op = Use->getOperand(i);
+ if (Op.Val != Node)
+ continue;
+ MVT::ValueType VT = Node->getValueType(Op.ResNo);
+ if (VT != MVT::Other && VT != MVT::Flag)
+ Match = false;
}
}
+ MatchReg &= Match;
+ if (VRBase)
+ break;
}
- // Figure out the register class to create for the destreg.
const TargetRegisterClass *TRC = 0;
- if (VRBase) {
+ // Figure out the register class to create for the destreg.
+ if (VRBase)
TRC = RegMap->getRegClass(VRBase);
- } else {
+ else
TRC = getPhysicalRegisterRegClass(MRI, Node->getValueType(ResNo), SrcReg);
-
+
+ // If all uses are reading from the src physical register and copying the
+ // register is either impossible or very expensive, then don't create a copy.
+ if (MatchReg && TRC->getCopyCost() < 0) {
+ VRBase = SrcReg;
+ } else {
// Create the reg, emit the copy.
VRBase = RegMap->createVirtualRegister(TRC);
+ MRI->copyRegToReg(*BB, BB->end(), VRBase, SrcReg, TRC);
}
- MRI->copyRegToReg(*BB, BB->end(), VRBase, SrcReg, TRC);
+ if (InstanceNo > 0)
+ VRBaseMap.erase(SDOperand(Node, ResNo));
bool isNew = VRBaseMap.insert(std::make_pair(SDOperand(Node,ResNo), VRBase));
assert(isNew && "Node emitted out of order - early");
}
@@ -611,7 +673,7 @@
/// EmitNode - Generate machine code for an node and needed dependencies.
///
-void ScheduleDAG::EmitNode(SDNode *Node,
+void ScheduleDAG::EmitNode(SDNode *Node, unsigned InstanceNo,
DenseMap<SDOperand, unsigned> &VRBaseMap) {
// If machine instruction
if (Node->isTargetOpcode()) {
@@ -677,7 +739,7 @@
for (unsigned i = II.numDefs; i < NumResults; ++i) {
unsigned Reg = II.ImplicitDefs[i - II.numDefs];
if (Node->hasAnyUseOfValue(i))
- EmitCopyFromReg(Node, i, Reg, VRBaseMap);
+ EmitCopyFromReg(Node, i, InstanceNo, Reg, VRBaseMap);
}
}
} else {
@@ -713,7 +775,7 @@
}
case ISD::CopyFromReg: {
unsigned SrcReg = cast<RegisterSDNode>(Node->getOperand(1))->getReg();
- EmitCopyFromReg(Node, 0, SrcReg, VRBaseMap);
+ EmitCopyFromReg(Node, 0, InstanceNo, SrcReg, VRBaseMap);
break;
}
case ISD::INLINEASM: {
@@ -802,9 +864,9 @@
DenseMap<SDOperand, unsigned> VRBaseMap;
for (unsigned i = 0, e = Sequence.size(); i != e; i++) {
if (SUnit *SU = Sequence[i]) {
- for (unsigned j = 0, ee = SU->FlaggedNodes.size(); j != ee; j++)
- EmitNode(SU->FlaggedNodes[j], VRBaseMap);
- EmitNode(SU->Node, VRBaseMap);
+ for (unsigned j = 0, ee = SU->FlaggedNodes.size(); j != ee; ++j)
+ EmitNode(SU->FlaggedNodes[j], SU->InstanceNo, VRBaseMap);
+ EmitNode(SU->Node, SU->InstanceNo, VRBaseMap);
} else {
// Null SUnit* is a noop.
EmitNoop();
@@ -869,7 +931,10 @@
cerr << " ch #";
else
cerr << " val #";
- cerr << I->Dep << " - SU(" << I->Dep->NodeNum << ")\n";
+ cerr << I->Dep << " - SU(" << I->Dep->NodeNum << ")";
+ if (I->isSpecial)
+ cerr << " *";
+ cerr << "\n";
}
}
if (Succs.size() != 0) {
@@ -880,7 +945,10 @@
cerr << " ch #";
else
cerr << " val #";
- cerr << I->Dep << " - SU(" << I->Dep->NodeNum << ")\n";
+ cerr << I->Dep << " - SU(" << I->Dep->NodeNum << ")";
+ if (I->isSpecial)
+ cerr << " *";
+ cerr << "\n";
}
}
cerr << "\n";
Modified: llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGList.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGList.cpp?rev=42284&r1=42283&r2=42284&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGList.cpp (original)
+++ llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGList.cpp Mon Sep 24 20:54:36 2007
@@ -168,7 +168,7 @@
/// schedulers.
void ScheduleDAGList::ListScheduleTopDown() {
unsigned CurCycle = 0;
- SUnit *Entry = SUnitMap[DAG.getEntryNode().Val];
+ SUnit *Entry = SUnitMap[DAG.getEntryNode().Val].front();
// All leaves to Available queue.
for (unsigned i = 0, e = SUnits.size(); i != e; ++i) {
@@ -328,12 +328,24 @@
LatencyPriorityQueue() : Queue(latency_sort(this)) {
}
- void initNodes(DenseMap<SDNode*, SUnit*> &sumap,
+ void initNodes(DenseMap<SDNode*, std::vector<SUnit*> > &sumap,
std::vector<SUnit> &sunits) {
SUnits = &sunits;
// Calculate node priorities.
CalculatePriorities();
}
+
+ void addNode(const SUnit *SU) {
+ Latencies.resize(SUnits->size(), -1);
+ NumNodesSolelyBlocking.resize(SUnits->size(), 0);
+ CalcLatency(*SU);
+ }
+
+ void updateNode(const SUnit *SU) {
+ Latencies[SU->NodeNum] = -1;
+ CalcLatency(*SU);
+ }
+
void releaseState() {
SUnits = 0;
Latencies.clear();
@@ -349,6 +361,8 @@
return NumNodesSolelyBlocking[NodeNum];
}
+ unsigned size() const { return Queue.size(); }
+
bool empty() const { return Queue.empty(); }
virtual void push(SUnit *U) {
@@ -368,22 +382,10 @@
return V;
}
- // ScheduledNode - As nodes are scheduled, we look to see if there are any
- // successor nodes that have a single unscheduled predecessor. If so, that
- // single predecessor has a higher priority, since scheduling it will make
- // the node available.
- void ScheduledNode(SUnit *Node);
-
-private:
- void CalculatePriorities();
- int CalcLatency(const SUnit &SU);
- void AdjustPriorityOfUnscheduledPreds(SUnit *SU);
- SUnit *getSingleUnscheduledPred(SUnit *SU);
-
- /// RemoveFromPriorityQueue - This is a really inefficient way to remove a
- /// node from a priority queue. We should roll our own heap to make this
- /// better or something.
- void RemoveFromPriorityQueue(SUnit *SU) {
+ /// remove - This is a really inefficient way to remove a node from a
+ /// priority queue. We should roll our own heap to make this better or
+ /// something.
+ void remove(SUnit *SU) {
std::vector<SUnit*> Temp;
assert(!Queue.empty() && "Not in queue!");
@@ -400,6 +402,18 @@
for (unsigned i = 0, e = Temp.size(); i != e; ++i)
Queue.push(Temp[i]);
}
+
+ // ScheduledNode - As nodes are scheduled, we look to see if there are any
+ // successor nodes that have a single unscheduled predecessor. If so, that
+ // single predecessor has a higher priority, since scheduling it will make
+ // the node available.
+ void ScheduledNode(SUnit *Node);
+
+private:
+ void CalculatePriorities();
+ int CalcLatency(const SUnit &SU);
+ void AdjustPriorityOfUnscheduledPreds(SUnit *SU);
+ SUnit *getSingleUnscheduledPred(SUnit *SU);
};
}
@@ -507,7 +521,7 @@
// Okay, we found a single predecessor that is available, but not scheduled.
// Since it is available, it must be in the priority queue. First remove it.
- RemoveFromPriorityQueue(OnlyAvailablePred);
+ remove(OnlyAvailablePred);
// Reinsert the node into the priority queue, which recomputes its
// NumNodesSolelyBlocking value.
Modified: llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp?rev=42284&r1=42283&r2=42284&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp (original)
+++ llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp Mon Sep 24 20:54:36 2007
@@ -25,6 +25,7 @@
#include "llvm/Target/TargetInstrInfo.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/Compiler.h"
+#include "llvm/ADT/SmallSet.h"
#include "llvm/ADT/Statistic.h"
#include
#include
@@ -52,9 +53,16 @@
bool isBottomUp;
/// AvailableQueue - The priority queue to use for the available SUnits.
- ///
+ ///
SchedulingPriorityQueue *AvailableQueue;
+ /// LiveRegs / LiveRegDefs - A set of physical registers and their definitions
+ /// that are "live". These nodes must be scheduled before any other nodes
+ /// that modify the registers can be scheduled.
+ SmallSet<unsigned, 4> LiveRegs;
+ std::vector<SUnit*> LiveRegDefs;
+ std::vector<unsigned> LiveRegCycles;
+
public:
ScheduleDAGRRList(SelectionDAG &dag, MachineBasicBlock *bb,
const TargetMachine &tm, bool isbottomup,
@@ -72,8 +80,13 @@
private:
void ReleasePred(SUnit *PredSU, bool isChain, unsigned CurCycle);
void ReleaseSucc(SUnit *SuccSU, bool isChain, unsigned CurCycle);
+ void CapturePred(SUnit *PredSU, SUnit *SU, bool isChain);
void ScheduleNodeBottomUp(SUnit *SU, unsigned CurCycle);
void ScheduleNodeTopDown(SUnit *SU, unsigned CurCycle);
+ void UnscheduleNodeBottomUp(SUnit *SU);
+ SUnit *BackTrackBottomUp(SUnit*, unsigned, unsigned&, bool&);
+ SUnit *CopyAndMoveSuccessors(SUnit *SU);
+ bool DelayForLiveRegsBottomUp(SUnit *SU, unsigned &CurCycle);
void ListScheduleTopDown();
void ListScheduleBottomUp();
void CommuteNodesToReducePressure();
@@ -84,7 +97,10 @@
/// Schedule - Schedule the DAG using list scheduling.
void ScheduleDAGRRList::Schedule() {
DOUT << "********** List Scheduling **********\n";
-
+
+ LiveRegDefs.resize(MRI->getNumRegs(), NULL);
+ LiveRegCycles.resize(MRI->getNumRegs(), 0);
+
// Build scheduling units.
BuildSchedUnits();
@@ -130,7 +146,7 @@
continue;
SDNode *OpN = SU->Node->getOperand(j).Val;
- SUnit *OpSU = SUnitMap[OpN];
+ SUnit *OpSU = SUnitMap[OpN][SU->InstanceNo];
if (OpSU && OperandSeen.count(OpSU) == 1) {
// Ok, so SU is not the last use of OpSU, but SU is two-address so
// it will clobber OpSU. Try to commute SU if no other source operands
@@ -139,7 +155,7 @@
for (unsigned k = 0; k < NumOps; ++k) {
if (k != j) {
OpN = SU->Node->getOperand(k).Val;
- OpSU = SUnitMap[OpN];
+ OpSU = SUnitMap[OpN][SU->InstanceNo];
if (OpSU && OperandSeen.count(OpSU) == 1) {
DoCommute = false;
break;
@@ -178,9 +194,9 @@
PredSU->CycleBound = std::max(PredSU->CycleBound, CurCycle + PredSU->Latency);
if (!isChain)
- PredSU->NumSuccsLeft--;
+ --PredSU->NumSuccsLeft;
else
- PredSU->NumChainSuccsLeft--;
+ --PredSU->NumChainSuccsLeft;
#ifndef NDEBUG
if (PredSU->NumSuccsLeft < 0 || PredSU->NumChainSuccsLeft < 0) {
@@ -209,19 +225,273 @@
SU->Cycle = CurCycle;
AvailableQueue->ScheduledNode(SU);
- Sequence.push_back(SU);
// Bottom up: release predecessors
for (SUnit::pred_iterator I = SU->Preds.begin(), E = SU->Preds.end();
- I != E; ++I)
+ I != E; ++I) {
ReleasePred(I->Dep, I->isCtrl, CurCycle);
+ if (I->Cost < 0) {
+ // This is a physical register dependency and it's impossible or
+ // expensive to copy the register. Make sure nothing that can
+ // clobber the register is scheduled between the predecessor and
+ // this node.
+ if (LiveRegs.insert(I->Reg)) {
+ LiveRegDefs[I->Reg] = I->Dep;
+ LiveRegCycles[I->Reg] = CurCycle;
+ }
+ }
+ }
+
+ // Release all the implicit physical register defs that are live.
+ for (SUnit::succ_iterator I = SU->Succs.begin(), E = SU->Succs.end();
+ I != E; ++I) {
+ if (I->Cost < 0) {
+ if (LiveRegCycles[I->Reg] == I->Dep->Cycle) {
+ LiveRegs.erase(I->Reg);
+ assert(LiveRegDefs[I->Reg] == SU &&
+ "Physical register dependency violated?");
+ LiveRegDefs[I->Reg] = NULL;
+ LiveRegCycles[I->Reg] = 0;
+ }
+ }
+ }
+
SU->isScheduled = true;
}
-/// isReady - True if node's lower cycle bound is less or equal to the current
-/// scheduling cycle. Always true if all nodes have uniform latency 1.
-static inline bool isReady(SUnit *SU, unsigned CurCycle) {
- return SU->CycleBound <= CurCycle;
+/// CapturePred - This does the opposite of ReleasePred. Since SU is being
+/// unscheduled, increase the succ left count of its predecessors. Remove
+/// them from AvailableQueue if necessary.
+void ScheduleDAGRRList::CapturePred(SUnit *PredSU, SUnit *SU, bool isChain) {
+ PredSU->CycleBound = 0;
+ for (SUnit::succ_iterator I = PredSU->Succs.begin(), E = PredSU->Succs.end();
+ I != E; ++I) {
+ if (I->Dep == SU)
+ continue;
+ PredSU->CycleBound = std::max(PredSU->CycleBound,
+ I->Dep->Cycle + PredSU->Latency);
+ }
+
+ if (PredSU->isAvailable) {
+ PredSU->isAvailable = false;
+ if (!PredSU->isPending)
+ AvailableQueue->remove(PredSU);
+ }
+
+ if (!isChain)
+ ++PredSU->NumSuccsLeft;
+ else
+ ++PredSU->NumChainSuccsLeft;
+}
+
+/// UnscheduleNodeBottomUp - Remove the node from the schedule and update its
+/// state and its predecessors' states to reflect the change.
+void ScheduleDAGRRList::UnscheduleNodeBottomUp(SUnit *SU) {
+ DOUT << "*** Unscheduling [" << SU->Cycle << "]: ";
+ DEBUG(SU->dump(&DAG));
+
+ AvailableQueue->UnscheduledNode(SU);
+
+ for (SUnit::pred_iterator I = SU->Preds.begin(), E = SU->Preds.end();
+ I != E; ++I) {
+ CapturePred(I->Dep, SU, I->isCtrl);
+ if (I->Cost < 0 && SU->Cycle == LiveRegCycles[I->Reg]) {
+ LiveRegs.erase(I->Reg);
+ assert(LiveRegDefs[I->Reg] == I->Dep &&
+ "Physical register dependency violated?");
+ LiveRegDefs[I->Reg] = NULL;
+ LiveRegCycles[I->Reg] = 0;
+ }
+ }
+
+ for (SUnit::succ_iterator I = SU->Succs.begin(), E = SU->Succs.end();
+ I != E; ++I) {
+ if (I->Cost < 0) {
+ if (LiveRegs.insert(I->Reg)) {
+ assert(!LiveRegDefs[I->Reg] &&
+ "Physical register dependency violated?");
+ LiveRegDefs[I->Reg] = SU;
+ }
+ if (I->Dep->Cycle < LiveRegCycles[I->Reg])
+ LiveRegCycles[I->Reg] = I->Dep->Cycle;
+ }
+ }
+
+ SU->Cycle = 0;
+ SU->isScheduled = false;
+ SU->isAvailable = true;
+ AvailableQueue->push(SU);
+}
+
+/// BackTrackBottomUp - Back track scheduling to a previous cycle specified in
+/// BTCycle in order to schedule a specific node. Returns the last unscheduled
+/// SUnit. Also returns whether a successor was unscheduled in the process.
+SUnit *ScheduleDAGRRList::BackTrackBottomUp(SUnit *SU, unsigned BTCycle,
+ unsigned &CurCycle, bool &SuccUnsched) {
+ SuccUnsched = false;
+ SUnit *OldSU = NULL;
+ while (CurCycle > BTCycle) {
+ OldSU = Sequence.back();
+ Sequence.pop_back();
+ if (SU->isSucc(OldSU))
+ SuccUnsched = true;
+ UnscheduleNodeBottomUp(OldSU);
+ --CurCycle;
+ }
+
+
+ if (SU->isSucc(OldSU)) {
+ assert(false && "Something is wrong!");
+ abort();
+ }
+
+ return OldSU;
+}
+
+/// isSafeToCopy - True if the SUnit for the given SDNode can safely be cloned,
+/// i.e. the node does not produce a flag, it does not read a flag and it does
+/// not have an incoming chain.
+static bool isSafeToCopy(SDNode *N) {
+ for (unsigned i = 0, e = N->getNumValues(); i != e; ++i)
+ if (N->getValueType(i) == MVT::Flag)
+ return false;
+ for (unsigned i = 0, e = N->getNumOperands(); i != e; ++i) {
+ const SDOperand &Op = N->getOperand(i);
+ MVT::ValueType VT = Op.Val->getValueType(Op.ResNo);
+ if (VT == MVT::Other || VT == MVT::Flag)
+ return false;
+ }
+
+ return true;
+}
+
+/// CopyAndMoveSuccessors - Clone the specified node and move its scheduled
+/// successors to the newly created node.
+SUnit *ScheduleDAGRRList::CopyAndMoveSuccessors(SUnit *SU) {
+ SUnit *NewSU = Clone(SU);
+
+ // New SUnit has the exact same predecessors.
+ for (SUnit::pred_iterator I = SU->Preds.begin(), E = SU->Preds.end();
+ I != E; ++I)
+ if (!I->isSpecial) {
+ NewSU->addPred(I->Dep, I->isCtrl, false, I->Reg, I->Cost);
+ NewSU->Depth = std::max(NewSU->Depth, I->Dep->Depth+1);
+ }
+
+ // Only copy scheduled successors. Cut them from old node's successor
+ // list and move them over.
+ SmallVector<SUnit::succ_iterator, 4> DelDeps;
+ for (SUnit::succ_iterator I = SU->Succs.begin(), E = SU->Succs.end();
+ I != E; ++I) {
+ if (I->isSpecial)
+ continue;
+ NewSU->Height = std::max(NewSU->Height, I->Dep->Height+1);
+ if (I->Dep->isScheduled) {
+ I->Dep->addPred(NewSU, I->isCtrl, false, I->Reg, I->Cost);
+ DelDeps.push_back(I);
+ }
+ }
+ for (unsigned i = 0, e = DelDeps.size(); i != e; ++i) {
+ SUnit *Succ = DelDeps[i]->Dep;
+ bool isCtrl = DelDeps[i]->isCtrl;
+ Succ->removePred(SU, isCtrl, false);
+ }
+
+ AvailableQueue->updateNode(SU);
+ AvailableQueue->addNode(NewSU);
+
+ return NewSU;
+}
+
+/// DelayForLiveRegsBottomUp - Returns true if it is necessary to delay
+/// scheduling of the given node to satisfy live physical register dependencies.
+/// If the given node is the last one that's available to schedule, do
+/// whatever is necessary (i.e. backtracking or cloning) to make it possible.
+bool ScheduleDAGRRList::DelayForLiveRegsBottomUp(SUnit *SU, unsigned &CurCycle){
+ if (LiveRegs.empty())
+ return false;
+
+ // If this node would clobber any "live" register, then it's not ready.
+ // However, if this is the last "available" node, then we may have to
+ // backtrack.
+ bool MustSched = AvailableQueue->empty();
+ SmallVector<unsigned, 4> LRegs;
+ for (SUnit::pred_iterator I = SU->Preds.begin(), E = SU->Preds.end();
+ I != E; ++I) {
+ if (I->Cost < 0) {
+ unsigned Reg = I->Reg;
+ if (LiveRegs.count(Reg) && LiveRegDefs[Reg] != I->Dep)
+ LRegs.push_back(Reg);
+ for (const unsigned *Alias = MRI->getAliasSet(Reg);
+ *Alias; ++Alias)
+ if (LiveRegs.count(*Alias) && LiveRegDefs[*Alias] != I->Dep)
+ LRegs.push_back(*Alias);
+ }
+ }
+
+ for (unsigned i = 0, e = SU->FlaggedNodes.size()+1; i != e; ++i) {
+ SDNode *Node = (i == 0) ? SU->Node : SU->FlaggedNodes[i-1];
+ if (!Node->isTargetOpcode())
+ continue;
+ const TargetInstrDescriptor &TID = TII->get(Node->getTargetOpcode());
+ if (!TID.ImplicitDefs)
+ continue;
+ for (const unsigned *Reg = TID.ImplicitDefs; *Reg; ++Reg) {
+ if (LiveRegs.count(*Reg) && LiveRegDefs[*Reg] != SU)
+ LRegs.push_back(*Reg);
+ for (const unsigned *Alias = MRI->getAliasSet(*Reg);
+ *Alias; ++Alias)
+ if (LiveRegs.count(*Alias) && LiveRegDefs[*Alias] != SU)
+ LRegs.push_back(*Alias);
+ }
+ }
+
+ if (MustSched && !LRegs.empty()) {
+ // We have made a mistake by scheduling some nodes too early. Now we must
+ // schedule the current node which will end up clobbering some live
+ // registers that are expensive / impossible to copy. Try unscheduling
+ // up to the point where it's safe to schedule the current node.
+ unsigned LiveCycle = CurCycle;
+ for (unsigned i = 0, e = LRegs.size(); i != e; ++i) {
+ unsigned Reg = LRegs[i];
+ unsigned LCycle = LiveRegCycles[Reg];
+ LiveCycle = std::min(LiveCycle, LCycle);
+ }
+
+ if (SU->CycleBound < LiveCycle) {
+ bool SuccUnsched = false;
+ SUnit *OldSU = BackTrackBottomUp(SU, LiveCycle, CurCycle, SuccUnsched);
+ // Force the current node to be scheduled before the node that
+ // requires the physical reg dep.
+ if (OldSU->isAvailable) {
+ OldSU->isAvailable = false;
+ AvailableQueue->remove(OldSU);
+ }
+ SU->addPred(OldSU, true, true);
+ // If a successor has been unscheduled, then it's not possible to
+ // schedule the current node.
+ return SuccUnsched;
+ } else {
+ // Try duplicating the nodes that produce these "expensive to copy"
+ // values to break the dependency.
+ for (unsigned i = 0, e = LRegs.size(); i != e; ++i) {
+ unsigned Reg = LRegs[i];
+ SUnit *LRDef = LiveRegDefs[Reg];
+ if (isSafeToCopy(LRDef->Node)) {
+ SUnit *NewDef = CopyAndMoveSuccessors(LRDef);
+ LiveRegDefs[Reg] = NewDef;
+ NewDef->addPred(SU, true, true);
+ SU->isAvailable = false;
+ AvailableQueue->push(NewDef);
+ } else {
+ assert(false && "Expensive copying is required?");
+ abort();
+ }
+ }
+ return true;
+ }
+ }
+ return !LRegs.empty();
}
/// ListScheduleBottomUp - The main loop of list scheduling for bottom-up
@@ -229,30 +499,49 @@
void ScheduleDAGRRList::ListScheduleBottomUp() {
unsigned CurCycle = 0;
// Add root to Available queue.
- AvailableQueue->push(SUnitMap[DAG.getRoot().Val]);
+ SUnit *RootSU = SUnitMap[DAG.getRoot().Val].front();
+ RootSU->isAvailable = true;
+ AvailableQueue->push(RootSU);
// While Available queue is not empty, grab the node with the highest
// priority. If it is not ready put it back. Schedule the node.
- std::vector<SUnit*> NotReady;
+ SmallVector<SUnit*, 4> NotReady;
while (!AvailableQueue->empty()) {
- SUnit *CurNode = AvailableQueue->pop();
- while (CurNode && !isReady(CurNode, CurCycle)) {
- NotReady.push_back(CurNode);
- CurNode = AvailableQueue->pop();
+ SUnit *CurSU = AvailableQueue->pop();
+ while (CurSU) {
+ if (CurSU->CycleBound <= CurCycle)
+ if (!DelayForLiveRegsBottomUp(CurSU, CurCycle))
+ break;
+
+ // Verify that the node is still ready. It may no longer be ready if the
+ // scheduler has backtracked.
+ if (CurSU->isAvailable) {
+ CurSU->isPending = true;
+ NotReady.push_back(CurSU);
+ }
+ CurSU = AvailableQueue->pop();
}
// Add the nodes that aren't ready back onto the available list.
- AvailableQueue->push_all(NotReady);
+ for (unsigned i = 0, e = NotReady.size(); i != e; ++i) {
+ NotReady[i]->isPending = false;
+ if (NotReady[i]->isAvailable)
+ AvailableQueue->push(NotReady[i]);
+ }
NotReady.clear();
- if (CurNode != NULL)
- ScheduleNodeBottomUp(CurNode, CurCycle);
- CurCycle++;
+ if (!CurSU)
+ Sequence.push_back(0);
+ else {
+ ScheduleNodeBottomUp(CurSU, CurCycle);
+ Sequence.push_back(CurSU);
+ }
+ ++CurCycle;
}
// Add entry node last
if (DAG.getEntryNode().Val != DAG.getRoot().Val) {
- SUnit *Entry = SUnitMap[DAG.getEntryNode().Val];
+ SUnit *Entry = SUnitMap[DAG.getEntryNode().Val].front();
Sequence.push_back(Entry);
}
@@ -291,9 +580,9 @@
SuccSU->CycleBound = std::max(SuccSU->CycleBound, CurCycle + SuccSU->Latency);
if (!isChain)
- SuccSU->NumPredsLeft--;
+ --SuccSU->NumPredsLeft;
else
- SuccSU->NumChainPredsLeft--;
+ --SuccSU->NumChainPredsLeft;
#ifndef NDEBUG
if (SuccSU->NumPredsLeft < 0 || SuccSU->NumChainPredsLeft < 0) {
@@ -320,7 +609,6 @@
SU->Cycle = CurCycle;
AvailableQueue->ScheduledNode(SU);
- Sequence.push_back(SU);
// Top down: release successors
for (SUnit::succ_iterator I = SU->Succs.begin(), E = SU->Succs.end();
@@ -333,7 +621,7 @@
/// schedulers.
void ScheduleDAGRRList::ListScheduleTopDown() {
unsigned CurCycle = 0;
- SUnit *Entry = SUnitMap[DAG.getEntryNode().Val];
+ SUnit *Entry = SUnitMap[DAG.getEntryNode().Val].front();
// All leaves to Available queue.
for (unsigned i = 0, e = SUnits.size(); i != e; ++i) {
@@ -346,24 +634,29 @@
// Emit the entry node first.
ScheduleNodeTopDown(Entry, CurCycle);
- CurCycle++;
+ Sequence.push_back(Entry);
+ ++CurCycle;
// While Available queue is not empty, grab the node with the highest
// priority. If it is not ready put it back. Schedule the node.
std::vector<SUnit*> NotReady;
while (!AvailableQueue->empty()) {
- SUnit *CurNode = AvailableQueue->pop();
- while (CurNode && !isReady(CurNode, CurCycle)) {
- NotReady.push_back(CurNode);
- CurNode = AvailableQueue->pop();
+ SUnit *CurSU = AvailableQueue->pop();
+ while (CurSU && CurSU->CycleBound > CurCycle) {
+ NotReady.push_back(CurSU);
+ CurSU = AvailableQueue->pop();
}
// Add the nodes that aren't ready back onto the available list.
AvailableQueue->push_all(NotReady);
NotReady.clear();
- if (CurNode != NULL)
- ScheduleNodeTopDown(CurNode, CurCycle);
+ if (!CurSU)
+ Sequence.push_back(0);
+ else {
+ ScheduleNodeTopDown(CurSU, CurCycle);
+ Sequence.push_back(CurSU);
+ }
CurCycle++;
}
@@ -431,14 +724,21 @@
RegReductionPriorityQueue() :
Queue(SF(this)) {}
- virtual void initNodes(DenseMap<SDNode*, SUnit*> &sumap,
+ virtual void initNodes(DenseMap<SDNode*, std::vector<SUnit*> > &sumap,
std::vector<SUnit*> &sunits) {}
+
+ virtual void addNode(const SUnit *SU) {}
+
+ virtual void updateNode(const SUnit *SU) {}
+
virtual void releaseState() {}
virtual unsigned getNodePriority(const SUnit *SU) const {
return 0;
}
+ unsigned size() const { return Queue.size(); }
+
bool empty() const { return Queue.empty(); }
void push(SUnit *U) {
@@ -456,16 +756,33 @@
return V;
}
- virtual bool isDUOperand(const SUnit *SU1, const SUnit *SU2) {
- return false;
+ /// remove - This is a really inefficient way to remove a node from a
+ /// priority queue. We should roll our own heap to make this better or
+ /// something.
+ void remove(SUnit *SU) {
+ std::vector<SUnit*> Temp;
+
+ assert(!Queue.empty() && "Not in queue!");
+ while (Queue.top() != SU) {
+ Temp.push_back(Queue.top());
+ Queue.pop();
+ assert(!Queue.empty() && "Not in queue!");
+ }
+
+ // Remove the node from the PQ.
+ Queue.pop();
+
+ // Add all the other nodes back.
+ for (unsigned i = 0, e = Temp.size(); i != e; ++i)
+ Queue.push(Temp[i]);
}
};
template<class SF>
class VISIBILITY_HIDDEN BURegReductionPriorityQueue
: public RegReductionPriorityQueue<SF> {
- // SUnitMap SDNode to SUnit mapping (n -> 1).
- DenseMap<SDNode*, SUnit*> *SUnitMap;
+ // SUnitMap SDNode to SUnit mapping (n -> n).
+ DenseMap<SDNode*, std::vector<SUnit*> > *SUnitMap;
// SUnits - The SUnits for the current graph.
const std::vector<SUnit*> *SUnits;
@@ -478,7 +795,7 @@
explicit BURegReductionPriorityQueue(const TargetInstrInfo *tii)
: TII(tii) {}
- void initNodes(DenseMap<SDNode*, SUnit*> &sumap,
+ void initNodes(DenseMap<SDNode*, std::vector<SUnit*> > &sumap,
std::vector<SUnit*> &sunits) {
SUnitMap = &sumap;
SUnits = &sunits;
@@ -488,6 +805,16 @@
CalculateSethiUllmanNumbers();
}
+ void addNode(const SUnit *SU) {
+ SethiUllmanNumbers.resize(SUnits->size(), 0);
+ CalcNodeSethiUllmanNumber(SU);
+ }
+
+ void updateNode(const SUnit *SU) {
+ SethiUllmanNumbers[SU->NodeNum] = 0;
+ CalcNodeSethiUllmanNumber(SU);
+ }
+
void releaseState() {
SUnits = 0;
SethiUllmanNumbers.clear();
@@ -519,18 +846,6 @@
return SethiUllmanNumbers[SU->NodeNum];
}
- bool isDUOperand(const SUnit *SU1, const SUnit *SU2) {
- unsigned Opc = SU1->Node->getTargetOpcode();
- unsigned NumRes = TII->getNumDefs(Opc);
- unsigned NumOps = ScheduleDAG::CountOperands(SU1->Node);
- for (unsigned i = 0; i != NumOps; ++i) {
- if (TII->getOperandConstraint(Opc, i+NumRes, TOI::TIED_TO) == -1)
- continue;
- if (SU1->Node->getOperand(i).isOperand(SU2->Node))
- return true;
- }
- return false;
- }
private:
bool canClobber(SUnit *SU, SUnit *Op);
void AddPseudoTwoAddrDeps();
@@ -542,8 +857,8 @@
template<class SF>
class VISIBILITY_HIDDEN TDRegReductionPriorityQueue
: public RegReductionPriorityQueue<SF> {
- // SUnitMap SDNode to SUnit mapping (n -> 1).
- DenseMap<SDNode*, SUnit*> *SUnitMap;
+ // SUnitMap SDNode to SUnit mapping (n -> n).
+ DenseMap<SDNode*, std::vector<SUnit*> > *SUnitMap;
// SUnits - The SUnits for the current graph.
const std::vector<SUnit*> *SUnits;
@@ -554,7 +869,7 @@
public:
TDRegReductionPriorityQueue() {}
- void initNodes(DenseMap<SDNode*, SUnit*> &sumap,
+ void initNodes(DenseMap<SDNode*, std::vector<SUnit*> > &sumap,
std::vector<SUnit*> &sunits) {
SUnitMap = &sumap;
SUnits = &sunits;
@@ -562,6 +877,16 @@
CalculateSethiUllmanNumbers();
}
+ void addNode(const SUnit *SU) {
+ SethiUllmanNumbers.resize(SUnits->size(), 0);
+ CalcNodeSethiUllmanNumber(SU);
+ }
+
+ void updateNode(const SUnit *SU) {
+ SethiUllmanNumbers[SU->NodeNum] = 0;
+ CalcNodeSethiUllmanNumber(SU);
+ }
+
void releaseState() {
SUnits = 0;
SethiUllmanNumbers.clear();
@@ -710,7 +1035,7 @@
for (unsigned i = 0; i != NumOps; ++i) {
if (TII->getOperandConstraint(Opc, i+NumRes, TOI::TIED_TO) != -1) {
SDNode *DU = SU->Node->getOperand(i).Val;
- if (Op == (*SUnitMap)[DU])
+ if (Op == (*SUnitMap)[DU][SU->InstanceNo])
return true;
}
}
@@ -740,23 +1065,25 @@
for (unsigned j = 0; j != NumOps; ++j) {
if (TII->getOperandConstraint(Opc, j+NumRes, TOI::TIED_TO) != -1) {
SDNode *DU = SU->Node->getOperand(j).Val;
- SUnit *DUSU = (*SUnitMap)[DU];
+ SUnit *DUSU = (*SUnitMap)[DU][SU->InstanceNo];
if (!DUSU) continue;
for (SUnit::succ_iterator I = DUSU->Succs.begin(),E = DUSU->Succs.end();
I != E; ++I) {
if (I->isCtrl) continue;
SUnit *SuccSU = I->Dep;
- if (SuccSU != SU &&
- (!canClobber(SuccSU, DUSU) ||
- (!SU->isCommutable && SuccSU->isCommutable))){
- if (SuccSU->Depth == SU->Depth && !isReachable(SuccSU, SU)) {
- DOUT << "Adding an edge from SU # " << SU->NodeNum
- << " to SU #" << SuccSU->NodeNum << "\n";
- if (SU->addPred(SuccSU, true))
- SU->NumChainPredsLeft++;
- if (SuccSU->addSucc(SU, true))
- SuccSU->NumChainSuccsLeft++;
- }
+ // Don't constrain nodes with implicit defs. It can create cycles,
+ // and it may increase register pressure.
+ if (SuccSU == SU || SuccSU->hasImplicitDefs)
+ continue;
+ // Be conservative. Ignore if nodes aren't at the same depth.
+ if (SuccSU->Depth != SU->Depth)
+ continue;
+ if ((!canClobber(SuccSU, DUSU) ||
+ (!SU->isCommutable && SuccSU->isCommutable)) &&
+ !isReachable(SuccSU, SU)) {
+ DOUT << "Adding an edge from SU # " << SU->NodeNum
+ << " to SU #" << SuccSU->NodeNum << "\n";
+ SU->addPred(SuccSU, true, true);
}
}
}
@@ -783,7 +1110,7 @@
SethiUllmanNumber = PredSethiUllman;
Extra = 0;
} else if (PredSethiUllman == SethiUllmanNumber && !I->isCtrl)
- Extra++;
+ ++Extra;
}
SethiUllmanNumber += Extra;
@@ -813,7 +1140,7 @@
EE = SuccSU->Preds.end(); II != EE; ++II) {
SUnit *PredSU = II->Dep;
if (!PredSU->isScheduled)
- Sum++;
+ ++Sum;
}
}
@@ -906,7 +1233,7 @@
SethiUllmanNumber = PredSethiUllman;
Extra = 0;
} else if (PredSethiUllman == SethiUllmanNumber && !I->isCtrl)
- Extra++;
+ ++Extra;
}
SethiUllmanNumber += Extra;
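The heart of this patch is the live physical register bookkeeping (LiveRegs, LiveRegDefs, LiveRegCycles) added to the bottom-up scheduler: once a node feeds a flag-like physical register to an already-scheduled consumer, nothing else may clobber that register until the consumer runs; otherwise the scheduler backtracks or clones the defining node. A much simplified standalone model of the interference test follows; the types, table sizes, and register number are illustrative stand-ins, not LLVM's:

#include <cstdio>
#include <set>
#include <vector>

struct SchedUnit { int Id; };

static std::set<unsigned> LiveRegs;                     // Physregs currently live.
static std::vector<const SchedUnit *> LiveRegDefs(64);  // Defining unit per reg.
static std::vector<unsigned> LiveRegCycles(64);         // Cycle a reg went live.

// A scheduled unit Def feeds physreg Reg to an already-scheduled consumer:
// record the register as live so nothing else clobbers it in between.
void markLive(unsigned Reg, const SchedUnit *Def, unsigned Cycle) {
  if (LiveRegs.insert(Reg).second) {
    LiveRegDefs[Reg] = Def;
    LiveRegCycles[Reg] = Cycle;
  }
}

// Scheduling SU now would clobber Reg; delay SU unless SU is itself the unit
// expected to define the live value.
bool mustDelay(unsigned Reg, const SchedUnit *SU) {
  return LiveRegs.count(Reg) && LiveRegDefs[Reg] != SU;
}

int main() {
  SchedUnit Cmp{0}, Other{1};
  markLive(/*pretend EFLAGS=*/33, &Cmp, /*Cycle=*/2);
  std::printf("%d %d\n", mustDelay(33, &Other), mustDelay(33, &Cmp)); // 1 0
}

DelayForLiveRegsBottomUp is this test plus the alias sets and implicit defs; when the blocked node is the only one available, it falls back to BackTrackBottomUp or CopyAndMoveSuccessors.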
Modified: llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGSimple.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGSimple.cpp?rev=42284&r1=42283&r2=42284&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGSimple.cpp (original)
+++ llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGSimple.cpp Mon Sep 24 20:54:36 2007
@@ -697,9 +697,9 @@
NodeInfo *NI = Ordering[i];
if (NI->isInGroup()) {
NodeGroupIterator NGI(Ordering[i]);
- while (NodeInfo *NI = NGI.next()) EmitNode(NI->Node, VRBaseMap);
+ while (NodeInfo *NI = NGI.next()) EmitNode(NI->Node, 0, VRBaseMap);
} else {
- EmitNode(NI->Node, VRBaseMap);
+ EmitNode(NI->Node, 0, VRBaseMap);
}
}
}
Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGPrinter.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGPrinter.cpp?rev=42284&r1=42283&r2=42284&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGPrinter.cpp (original)
+++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGPrinter.cpp Mon Sep 24 20:54:36 2007
@@ -281,7 +281,7 @@
GraphWriter<ScheduleDAG*> &GW) {
GW.emitSimpleNode(0, "plaintext=circle", "GraphRoot");
if (G->DAG.getRoot().Val)
- GW.emitEdge(0, -1, G->SUnitMap[G->DAG.getRoot().Val], -1, "");
+ GW.emitEdge(0, -1, G->SUnitMap[G->DAG.getRoot().Val].front(), -1, "");
}
};
}
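A common thread through r42284 is that SUnitMap changes from a one-to-one SDNode-to-SUnit map into a one-to-many map indexed by InstanceNo, since CopyAndMoveSuccessors can now clone a node into several scheduling units. A minimal sketch of the new lookup shape, using stand-in types rather than LLVM's DenseMap and SUnit:

#include <cassert>
#include <map>
#include <vector>

struct DAGNode {};
struct SchedUnit { unsigned InstanceNo; };

// Was: node -> single SUnit. Now each node may have clones.
static std::map<const DAGNode *, std::vector<SchedUnit *>> SUnitMap;

SchedUnit *lookup(const DAGNode *N, unsigned InstanceNo) {
  std::vector<SchedUnit *> &Units = SUnitMap[N];
  assert(InstanceNo < Units.size() && "No such instance");
  return Units[InstanceNo];   // .front() replaces the old single-value lookup.
}

int main() {
  DAGNode N;
  SchedUnit Orig{0}, Clone{1};
  SUnitMap[&N] = {&Orig, &Clone};
  assert(lookup(&N, 0) == &Orig && lookup(&N, 1) == &Clone);
  return 0;
}

That is why the entry/root lookups above become SUnitMap[...].front() and why canClobber and CommuteNodesToReducePressure now index by SU->InstanceNo.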
From evan.cheng at apple.com Mon Sep 24 18:57:46 2007
From: evan.cheng at apple.com (Evan Cheng)
Date: Tue, 25 Sep 2007 01:57:46 -0000
Subject: [llvm-commits] [llvm] r42285 - in /llvm/trunk/lib/Target/X86:
X86FloatingPoint.cpp X86ISelLowering.cpp X86ISelLowering.h
X86InstrFPStack.td X86InstrInfo.cpp X86InstrInfo.td X86InstrSSE.td
X86InstrX86-64.td X86RegisterInfo.cpp
Message-ID: <200709250157.l8P1vlIT008976@zion.cs.uiuc.edu>
Author: evancheng
Date: Mon Sep 24 20:57:46 2007
New Revision: 42285
URL: http://llvm.org/viewvc/llvm-project?rev=42285&view=rev
Log:
Added support for the new condition code modeling scheme (i.e. physical register dependency). These are a bunch of instructions that are duplicated so the x86 backend can support both the old and new schemes at the same time. They will be deleted after all the kinks are worked out.
Modified:
llvm/trunk/lib/Target/X86/X86FloatingPoint.cpp
llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
llvm/trunk/lib/Target/X86/X86ISelLowering.h
llvm/trunk/lib/Target/X86/X86InstrFPStack.td
llvm/trunk/lib/Target/X86/X86InstrInfo.cpp
llvm/trunk/lib/Target/X86/X86InstrInfo.td
llvm/trunk/lib/Target/X86/X86InstrSSE.td
llvm/trunk/lib/Target/X86/X86InstrX86-64.td
llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp
Modified: llvm/trunk/lib/Target/X86/X86FloatingPoint.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86FloatingPoint.cpp?rev=42285&r1=42284&r2=42285&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86FloatingPoint.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86FloatingPoint.cpp Mon Sep 24 20:57:46 2007
@@ -437,6 +437,33 @@
{ X86::MUL_FpI32m32 , X86::MUL_FI32m },
{ X86::MUL_FpI32m64 , X86::MUL_FI32m },
{ X86::MUL_FpI32m80 , X86::MUL_FI32m },
+
+ // TEMPORARY
+ { X86::NEW_CMOVBE_Fp32 , X86::CMOVBE_F },
+ { X86::NEW_CMOVBE_Fp64 , X86::CMOVBE_F },
+ { X86::NEW_CMOVBE_Fp80 , X86::CMOVBE_F },
+ { X86::NEW_CMOVB_Fp32 , X86::CMOVB_F },
+ { X86::NEW_CMOVB_Fp64 , X86::CMOVB_F },
+ { X86::NEW_CMOVB_Fp80 , X86::CMOVB_F },
+ { X86::NEW_CMOVE_Fp32 , X86::CMOVE_F },
+ { X86::NEW_CMOVE_Fp64 , X86::CMOVE_F },
+ { X86::NEW_CMOVE_Fp80 , X86::CMOVE_F },
+ { X86::NEW_CMOVNBE_Fp32 , X86::CMOVNBE_F },
+ { X86::NEW_CMOVNBE_Fp64 , X86::CMOVNBE_F },
+ { X86::NEW_CMOVNBE_Fp80 , X86::CMOVNBE_F },
+ { X86::NEW_CMOVNB_Fp32 , X86::CMOVNB_F },
+ { X86::NEW_CMOVNB_Fp64 , X86::CMOVNB_F },
+ { X86::NEW_CMOVNB_Fp80 , X86::CMOVNB_F },
+ { X86::NEW_CMOVNE_Fp32 , X86::CMOVNE_F },
+ { X86::NEW_CMOVNE_Fp64 , X86::CMOVNE_F },
+ { X86::NEW_CMOVNE_Fp80 , X86::CMOVNE_F },
+ { X86::NEW_CMOVNP_Fp32 , X86::CMOVNP_F },
+ { X86::NEW_CMOVNP_Fp64 , X86::CMOVNP_F },
+ { X86::NEW_CMOVNP_Fp80 , X86::CMOVNP_F },
+ { X86::NEW_CMOVP_Fp32 , X86::CMOVP_F },
+ { X86::NEW_CMOVP_Fp64 , X86::CMOVP_F },
+ { X86::NEW_CMOVP_Fp80 , X86::CMOVP_F },
+
{ X86::SIN_Fp32 , X86::SIN_F },
{ X86::SIN_Fp64 , X86::SIN_F },
{ X86::SIN_Fp80 , X86::SIN_F },
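These entries extend the FP stackifier's pseudo-to-concrete opcode table, mapping the TEMPORARY NEW_CMOV* register-form pseudos onto the same x87 FCMOV stack-top instructions as their old twins. The stackifier resolves such entries with a binary search over a table kept sorted by the pseudo opcode; a standalone sketch of that style of lookup, with made-up opcode numbers rather than the real generated values:

#include <algorithm>
#include <cassert>
#include <iterator>

struct TableEntry {
  unsigned From; // register-based pseudo opcode
  unsigned To;   // concrete x87 stack-top opcode
};

// Must stay sorted by From for the binary search below.
static const TableEntry OpcodeTable[] = {
  {100 /*NEW_CMOVBE_Fp32*/, 10 /*CMOVBE_F*/},
  {101 /*NEW_CMOVBE_Fp64*/, 10 /*CMOVBE_F*/},
  {102 /*NEW_CMOVB_Fp32*/,  11 /*CMOVB_F*/},
};

unsigned getConcreteOpcode(unsigned Pseudo) {
  const TableEntry *I = std::lower_bound(
      std::begin(OpcodeTable), std::end(OpcodeTable), Pseudo,
      [](const TableEntry &E, unsigned Op) { return E.From < Op; });
  assert(I != std::end(OpcodeTable) && I->From == Pseudo && "Unknown opcode");
  return I->To;
}

int main() {
  assert(getConcreteOpcode(101) == 10);
  return 0;
}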
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=42285&r1=42284&r2=42285&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Mon Sep 24 20:57:46 2007
@@ -31,6 +31,7 @@
#include "llvm/CodeGen/MachineInstrBuilder.h"
#include "llvm/CodeGen/SelectionDAG.h"
#include "llvm/CodeGen/SSARegMap.h"
+#include "llvm/Support/CommandLine.h"
#include "llvm/Support/MathExtras.h"
#include "llvm/Target/TargetAsmInfo.h"
#include "llvm/Target/TargetOptions.h"
@@ -3337,41 +3338,58 @@
SDOperand AndNode = DAG.getNode(ISD::AND, MVT::i8, ShAmt,
DAG.getConstant(32, MVT::i8));
SDOperand COps[]={DAG.getEntryNode(), AndNode, DAG.getConstant(0, MVT::i8)};
- SDOperand InFlag = DAG.getNode(X86ISD::CMP, VTs, 2, COps, 3).getValue(1);
+ SDOperand Cond = NewCCModeling
+ ? DAG.getNode(X86ISD::CMP_NEW, MVT::i32,
+ AndNode, DAG.getConstant(0, MVT::i8))
+ : DAG.getNode(X86ISD::CMP, VTs, 2, COps, 3).getValue(1);
SDOperand Hi, Lo;
SDOperand CC = DAG.getConstant(X86::COND_NE, MVT::i8);
-
+ unsigned Opc = NewCCModeling ? X86ISD::CMOV_NEW : X86ISD::CMOV;
VTs = DAG.getNodeValueTypes(MVT::i32, MVT::Flag);
SmallVector<SDOperand, 4> Ops;
if (Op.getOpcode() == ISD::SHL_PARTS) {
Ops.push_back(Tmp2);
Ops.push_back(Tmp3);
Ops.push_back(CC);
- Ops.push_back(InFlag);
- Hi = DAG.getNode(X86ISD::CMOV, VTs, 2, &Ops[0], Ops.size());
- InFlag = Hi.getValue(1);
+ Ops.push_back(Cond);
+ if (NewCCModeling)
+ Hi = DAG.getNode(Opc, MVT::i32, &Ops[0], Ops.size());
+ else {
+ Hi = DAG.getNode(Opc, VTs, 2, &Ops[0], Ops.size());
+ Cond = Hi.getValue(1);
+ }
Ops.clear();
Ops.push_back(Tmp3);
Ops.push_back(Tmp1);
Ops.push_back(CC);
- Ops.push_back(InFlag);
- Lo = DAG.getNode(X86ISD::CMOV, VTs, 2, &Ops[0], Ops.size());
+ Ops.push_back(Cond);
+ if (NewCCModeling)
+ Lo = DAG.getNode(Opc, MVT::i32, &Ops[0], Ops.size());
+ else
+ Lo = DAG.getNode(Opc, VTs, 2, &Ops[0], Ops.size());
} else {
Ops.push_back(Tmp2);
Ops.push_back(Tmp3);
Ops.push_back(CC);
- Ops.push_back(InFlag);
- Lo = DAG.getNode(X86ISD::CMOV, VTs, 2, &Ops[0], Ops.size());
- InFlag = Lo.getValue(1);
+ Ops.push_back(Cond);
+ if (NewCCModeling)
+ Lo = DAG.getNode(Opc, MVT::i32, &Ops[0], Ops.size());
+ else {
+ Lo = DAG.getNode(Opc, VTs, 2, &Ops[0], Ops.size());
+ Cond = Lo.getValue(1);
+ }
Ops.clear();
Ops.push_back(Tmp3);
Ops.push_back(Tmp1);
Ops.push_back(CC);
- Ops.push_back(InFlag);
- Hi = DAG.getNode(X86ISD::CMOV, VTs, 2, &Ops[0], Ops.size());
+ Ops.push_back(Cond);
+ if (NewCCModeling)
+ Hi = DAG.getNode(Opc, MVT::i32, &Ops[0], Ops.size());
+ else
+ Hi = DAG.getNode(Opc, VTs, 2, &Ops[0], Ops.size());
}
VTs = DAG.getNodeValueTypes(MVT::i32, MVT::i32);
@@ -3668,6 +3686,43 @@
}
}
+SDOperand X86TargetLowering::LowerSETCC_New(SDOperand Op, SelectionDAG &DAG) {
+ assert(Op.getValueType() == MVT::i8 && "SetCC type must be 8-bit integer");
+ SDOperand Op0 = Op.getOperand(0);
+ SDOperand Op1 = Op.getOperand(1);
+ SDOperand CC = Op.getOperand(2);
+ ISD::CondCode SetCCOpcode = cast<CondCodeSDNode>(CC)->get();
+ bool isFP = MVT::isFloatingPoint(Op.getOperand(1).getValueType());
+ unsigned X86CC;
+
+ SDOperand Cond = DAG.getNode(X86ISD::CMP_NEW, MVT::i32, Op0, Op1);
+ if (translateX86CC(cast<CondCodeSDNode>(CC)->get(), isFP, X86CC,
+ Op0, Op1, DAG))
+ return DAG.getNode(X86ISD::SETCC_NEW, MVT::i8,
+ DAG.getConstant(X86CC, MVT::i8), Cond);
+
+ assert(isFP && "Illegal integer SetCC!");
+
+ switch (SetCCOpcode) {
+ default: assert(false && "Illegal floating point SetCC!");
+ case ISD::SETOEQ: { // !PF & ZF
+ SDOperand Tmp1 = DAG.getNode(X86ISD::SETCC_NEW, MVT::i8,
+ DAG.getConstant(X86::COND_NP, MVT::i8), Cond);
+ SDOperand Tmp2 = DAG.getNode(X86ISD::SETCC_NEW, MVT::i8,
+ DAG.getConstant(X86::COND_E, MVT::i8), Cond);
+ return DAG.getNode(ISD::AND, MVT::i8, Tmp1, Tmp2);
+ }
+ case ISD::SETUNE: { // PF | !ZF
+ SDOperand Tmp1 = DAG.getNode(X86ISD::SETCC_NEW, MVT::i8,
+ DAG.getConstant(X86::COND_P, MVT::i8), Cond);
+ SDOperand Tmp2 = DAG.getNode(X86ISD::SETCC_NEW, MVT::i8,
+ DAG.getConstant(X86::COND_NE, MVT::i8), Cond);
+ return DAG.getNode(ISD::OR, MVT::i8, Tmp1, Tmp2);
+ }
+ }
+}
+
+
SDOperand X86TargetLowering::LowerSELECT(SDOperand Op, SelectionDAG &DAG) {
bool addTest = true;
SDOperand Chain = DAG.getEntryNode();
@@ -3718,6 +3773,56 @@
return DAG.getNode(X86ISD::CMOV, VTs, 2, &Ops[0], Ops.size());
}
+SDOperand X86TargetLowering::LowerSELECT_New(SDOperand Op, SelectionDAG &DAG) {
+ bool addTest = true;
+ SDOperand Cond = Op.getOperand(0);
+ SDOperand CC;
+
+ if (Cond.getOpcode() == ISD::SETCC)
+ Cond = LowerSETCC_New(Cond, DAG);
+
+ if (Cond.getOpcode() == X86ISD::SETCC_NEW) {
+ CC = Cond.getOperand(0);
+
+ // If the condition flag is set by an X86ISD::CMP, then make a copy of it
+ // (since the flag operand cannot be shared). Use it as the condition-setting
+ // operand in place of the X86ISD::SETCC.
+ // If the X86ISD::SETCC has more than one use, then perhaps it's better
+ // to use a test instead of duplicating the X86ISD::CMP (for register
+ // pressure reasons)?
+ SDOperand Cmp = Cond.getOperand(1);
+ unsigned Opc = Cmp.getOpcode();
+ bool IllegalFPCMov =
+ ! ((X86ScalarSSEf32 && Op.getValueType()==MVT::f32) ||
+ (X86ScalarSSEf64 && Op.getValueType()==MVT::f64)) &&
+ !hasFPCMov(cast<ConstantSDNode>(CC)->getSignExtended());
+ if ((Opc == X86ISD::CMP_NEW ||
+ Opc == X86ISD::COMI_NEW ||
+ Opc == X86ISD::UCOMI_NEW) &&
+ !IllegalFPCMov) {
+ Cond = DAG.getNode(Opc, MVT::i32, Cmp.getOperand(0), Cmp.getOperand(1));
+ addTest = false;
+ }
+ }
+
+ if (addTest) {
+ CC = DAG.getConstant(X86::COND_NE, MVT::i8);
+ Cond = DAG.getNode(X86ISD::CMP_NEW, MVT::i32, Cond,
+ DAG.getConstant(0, MVT::i8));
+ }
+
+ const MVT::ValueType *VTs = DAG.getNodeValueTypes(Op.getValueType(),
+ MVT::Flag);
+ SmallVector<SDOperand, 4> Ops;
+ // X86ISD::CMOV means set the result (which is operand 1) to the RHS if
+ // condition is true.
+ Ops.push_back(Op.getOperand(2));
+ Ops.push_back(Op.getOperand(1));
+ Ops.push_back(CC);
+ Ops.push_back(Cond);
+ return DAG.getNode(X86ISD::CMOV_NEW, VTs, 2, &Ops[0], Ops.size());
+}
+
SDOperand X86TargetLowering::LowerBRCOND(SDOperand Op, SelectionDAG &DAG) {
bool addTest = true;
SDOperand Chain = Op.getOperand(0);
@@ -3756,6 +3861,43 @@
Cond, Op.getOperand(2), CC, Cond.getValue(1));
}
+SDOperand X86TargetLowering::LowerBRCOND_New(SDOperand Op, SelectionDAG &DAG) {
+ bool addTest = true;
+ SDOperand Chain = Op.getOperand(0);
+ SDOperand Cond = Op.getOperand(1);
+ SDOperand Dest = Op.getOperand(2);
+ SDOperand CC;
+
+ if (Cond.getOpcode() == ISD::SETCC)
+ Cond = LowerSETCC_New(Cond, DAG);
+
+ if (Cond.getOpcode() == X86ISD::SETCC_NEW) {
+ CC = Cond.getOperand(0);
+
+ // If the condition flag is set by an X86ISD::CMP, then make a copy of it
+ // (since the flag operand cannot be shared). Use it as the condition-setting
+ // operand in place of the X86ISD::SETCC.
+ // If the X86ISD::SETCC has more than one use, then perhaps it's better
+ // to use a test instead of duplicating the X86ISD::CMP (for register
+ // pressure reasons)?
+ SDOperand Cmp = Cond.getOperand(1);
+ unsigned Opc = Cmp.getOpcode();
+ if (Opc == X86ISD::CMP_NEW ||
+ Opc == X86ISD::COMI_NEW ||
+ Opc == X86ISD::UCOMI_NEW) {
+ Cond = DAG.getNode(Opc, MVT::i32, Cmp.getOperand(0), Cmp.getOperand(1));
+ addTest = false;
+ }
+ }
+
+ if (addTest) {
+ CC = DAG.getConstant(X86::COND_NE, MVT::i8);
+ Cond= DAG.getNode(X86ISD::CMP_NEW, MVT::i32, Cond, DAG.getConstant(0, MVT::i8));
+ }
+ return DAG.getNode(X86ISD::BRCOND_NEW, Op.getValueType(),
+ Chain, Op.getOperand(2), CC, Cond);
+}
+
SDOperand X86TargetLowering::LowerCALL(SDOperand Op, SelectionDAG &DAG) {
unsigned CallingConv = cast<ConstantSDNode>(Op.getOperand(1))->getValue();
@@ -4355,13 +4497,21 @@
SDOperand RHS = Op.getOperand(2);
translateX86CC(CC, true, X86CC, LHS, RHS, DAG);
- const MVT::ValueType *VTs = DAG.getNodeValueTypes(MVT::Other, MVT::Flag);
- SDOperand Ops1[] = { DAG.getEntryNode(), LHS, RHS };
- SDOperand Cond = DAG.getNode(Opc, VTs, 2, Ops1, 3);
- VTs = DAG.getNodeValueTypes(MVT::i8, MVT::Flag);
- SDOperand Ops2[] = { DAG.getConstant(X86CC, MVT::i8), Cond };
- SDOperand SetCC = DAG.getNode(X86ISD::SETCC, VTs, 2, Ops2, 2);
- return DAG.getNode(ISD::ANY_EXTEND, MVT::i32, SetCC);
+ if (NewCCModeling) {
+ Opc = (Opc == X86ISD::UCOMI) ? X86ISD::UCOMI_NEW : X86ISD::COMI_NEW;
+ SDOperand Cond = DAG.getNode(Opc, MVT::i32, LHS, RHS);
+ SDOperand SetCC = DAG.getNode(X86ISD::SETCC_NEW, MVT::i8,
+ DAG.getConstant(X86CC, MVT::i8), Cond);
+ return DAG.getNode(ISD::ANY_EXTEND, MVT::i32, SetCC);
+ } else {
+ const MVT::ValueType *VTs = DAG.getNodeValueTypes(MVT::Other, MVT::Flag);
+ SDOperand Ops1[] = { DAG.getEntryNode(), LHS, RHS };
+ SDOperand Cond = DAG.getNode(Opc, VTs, 2, Ops1, 3);
+ VTs = DAG.getNodeValueTypes(MVT::i8, MVT::Flag);
+ SDOperand Ops2[] = { DAG.getConstant(X86CC, MVT::i8), Cond };
+ SDOperand SetCC = DAG.getNode(X86ISD::SETCC, VTs, 2, Ops2, 2);
+ return DAG.getNode(ISD::ANY_EXTEND, MVT::i32, SetCC);
+ }
}
}
}
@@ -4529,9 +4679,15 @@
case ISD::FABS: return LowerFABS(Op, DAG);
case ISD::FNEG: return LowerFNEG(Op, DAG);
case ISD::FCOPYSIGN: return LowerFCOPYSIGN(Op, DAG);
- case ISD::SETCC: return LowerSETCC(Op, DAG, DAG.getEntryNode());
- case ISD::SELECT: return LowerSELECT(Op, DAG);
- case ISD::BRCOND: return LowerBRCOND(Op, DAG);
+ case ISD::SETCC: return NewCCModeling
+ ? LowerSETCC_New(Op, DAG)
+ : LowerSETCC(Op, DAG, DAG.getEntryNode());
+ case ISD::SELECT: return NewCCModeling
+ ? LowerSELECT_New(Op, DAG)
+ : LowerSELECT(Op, DAG);
+ case ISD::BRCOND: return NewCCModeling
+ ? LowerBRCOND_New(Op, DAG)
+ : LowerBRCOND(Op, DAG);
case ISD::JumpTable: return LowerJumpTable(Op, DAG);
case ISD::CALL: return LowerCALL(Op, DAG);
case ISD::RET: return LowerRET(Op, DAG);
@@ -4575,11 +4731,17 @@
case X86ISD::TAILCALL: return "X86ISD::TAILCALL";
case X86ISD::RDTSC_DAG: return "X86ISD::RDTSC_DAG";
case X86ISD::CMP: return "X86ISD::CMP";
+ case X86ISD::CMP_NEW: return "X86ISD::CMP_NEW";
case X86ISD::COMI: return "X86ISD::COMI";
+ case X86ISD::COMI_NEW: return "X86ISD::COMI_NEW";
case X86ISD::UCOMI: return "X86ISD::UCOMI";
+ case X86ISD::UCOMI_NEW: return "X86ISD::UCOMI_NEW";
case X86ISD::SETCC: return "X86ISD::SETCC";
+ case X86ISD::SETCC_NEW: return "X86ISD::SETCC_NEW";
case X86ISD::CMOV: return "X86ISD::CMOV";
+ case X86ISD::CMOV_NEW: return "X86ISD::CMOV_NEW";
case X86ISD::BRCOND: return "X86ISD::BRCOND";
+ case X86ISD::BRCOND_NEW: return "X86ISD::BRCOND_NEW";
case X86ISD::RET_FLAG: return "X86ISD::RET_FLAG";
case X86ISD::REP_STOS: return "X86ISD::REP_STOS";
case X86ISD::REP_MOVS: return "X86ISD::REP_MOVS";
@@ -4696,7 +4858,13 @@
case X86::CMOV_FR64:
case X86::CMOV_V4F32:
case X86::CMOV_V2F64:
- case X86::CMOV_V2I64: {
+ case X86::CMOV_V2I64:
+
+ case X86::NEW_CMOV_FR32:
+ case X86::NEW_CMOV_FR64:
+ case X86::NEW_CMOV_V4F32:
+ case X86::NEW_CMOV_V2F64:
+ case X86::NEW_CMOV_V2I64: {
// To "insert" a SELECT_CC instruction, we actually have to insert the
// diamond control-flow pattern. The incoming instruction knows the
// destination vreg to set, the condition code register to branch on, the
@@ -4853,6 +5021,7 @@
switch (Opc) {
default: break;
case X86ISD::SETCC:
+ case X86ISD::SETCC_NEW:
KnownZero |= (MVT::getIntVTBitMask(Op.getValueType()) ^ 1ULL);
break;
}
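The two special cases in LowerSETCC_New encode the x86 floating-point flag identities: ordered-equal is !PF && ZF (hence the AND of two SETCC_NEW nodes over one shared CMP_NEW value) and unordered-not-equal is PF || !ZF (hence the OR). A standalone model of just that flag logic, with an illustrative fcmp helper standing in for the hardware compare:

#include <cmath>
#include <cstdio>

struct Flags { bool ZF, PF, CF; };

// What a ucomis-style compare leaves in EFLAGS: PF marks the unordered
// (NaN) case; equal sets ZF; less-than sets CF.
Flags fcmp(double A, double B) {
  if (std::isnan(A) || std::isnan(B)) return {true, true, true};
  return {A == B, false, A < B};
}

bool setOEQ(Flags F) { return !F.PF && F.ZF; } // AND of COND_NP and COND_E
bool setUNE(Flags F) { return F.PF || !F.ZF; } // OR of COND_P and COND_NE

int main() {
  std::printf("%d %d\n", setOEQ(fcmp(1.0, 1.0)), setOEQ(fcmp(NAN, 1.0))); // 1 0
  std::printf("%d %d\n", setUNE(fcmp(1.0, 2.0)), setUNE(fcmp(NAN, 1.0))); // 1 1
}

Because both SETCC_NEW nodes read the same i32 EFLAGS value instead of a glue operand, the compare no longer has to be duplicated to feed two consumers.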
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.h?rev=42285&r1=42284&r2=42285&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86ISelLowering.h (original)
+++ llvm/trunk/lib/Target/X86/X86ISelLowering.h Mon Sep 24 20:57:46 2007
@@ -117,22 +117,26 @@
/// X86 compare and logical compare instructions.
CMP, COMI, UCOMI,
+ CMP_NEW, COMI_NEW, UCOMI_NEW,
/// X86 SetCC. Operand 1 is condition code, and operand 2 is the flag
/// operand produced by a CMP instruction.
SETCC,
+ SETCC_NEW,
/// X86 conditional moves. Operand 1 and operand 2 are the two values
/// to select from (operand 1 is a R/W operand). Operand 3 is the
/// condition code, and operand 4 is the flag operand produced by a CMP
/// or TEST instruction. It also writes a flag result.
CMOV,
+ CMOV_NEW,
/// X86 conditional branches. Operand 1 is the chain operand, operand 2
/// is the block to branch if condition is true, operand 3 is the
/// condition code, and operand 4 is the flag operand produced by a CMP
/// or TEST instruction.
BRCOND,
+ BRCOND_NEW,
/// Return with a flag operand. Operand 1 is the chain operand, operand
/// 2 is the number of bytes of stack to pop.
@@ -422,8 +426,11 @@
SDOperand LowerFNEG(SDOperand Op, SelectionDAG &DAG);
SDOperand LowerFCOPYSIGN(SDOperand Op, SelectionDAG &DAG);
SDOperand LowerSETCC(SDOperand Op, SelectionDAG &DAG, SDOperand Chain);
+ SDOperand LowerSETCC_New(SDOperand Op, SelectionDAG &DAG);
SDOperand LowerSELECT(SDOperand Op, SelectionDAG &DAG);
+ SDOperand LowerSELECT_New(SDOperand Op, SelectionDAG &DAG);
SDOperand LowerBRCOND(SDOperand Op, SelectionDAG &DAG);
+ SDOperand LowerBRCOND_New(SDOperand Op, SelectionDAG &DAG);
SDOperand LowerMEMSET(SDOperand Op, SelectionDAG &DAG);
SDOperand LowerMEMCPY(SDOperand Op, SelectionDAG &DAG);
SDOperand LowerJumpTable(SDOperand Op, SelectionDAG &DAG);
Modified: llvm/trunk/lib/Target/X86/X86InstrFPStack.td
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFPStack.td?rev=42285&r1=42284&r2=42285&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86InstrFPStack.td (original)
+++ llvm/trunk/lib/Target/X86/X86InstrFPStack.td Mon Sep 24 20:57:46 2007
@@ -320,6 +320,31 @@
defm CMOVNP : FPCMov<X86_COND_NP>;
}
+multiclass NEW_FPCMov<PatLeaf cc> {
+ def _Fp32 : FpIf32<(outs RFP32:$dst), (ins RFP32:$src1, RFP32:$src2),
+ CondMovFP,
+ [(set RFP32:$dst, (X86cmov_new RFP32:$src1, RFP32:$src2,
+ cc, EFLAGS))]>;
+ def _Fp64 : FpIf64<(outs RFP64:$dst), (ins RFP64:$src1, RFP64:$src2),
+ CondMovFP,
+ [(set RFP64:$dst, (X86cmov_new RFP64:$src1, RFP64:$src2,
+ cc, EFLAGS))]>;
+ def _Fp80 : FpI_<(outs RFP80:$dst), (ins RFP80:$src1, RFP80:$src2),
+ CondMovFP,
+ [(set RFP80:$dst, (X86cmov_new RFP80:$src1, RFP80:$src2,
+ cc, EFLAGS))]>;
+}
+let Uses = [EFLAGS], isTwoAddress = 1 in {
+defm NEW_CMOVB  : NEW_FPCMov<X86_COND_B>;
+defm NEW_CMOVBE : NEW_FPCMov<X86_COND_BE>;
+defm NEW_CMOVE  : NEW_FPCMov<X86_COND_E>;
+defm NEW_CMOVP  : NEW_FPCMov<X86_COND_P>;
+defm NEW_CMOVNB : NEW_FPCMov<X86_COND_AE>;
+defm NEW_CMOVNBE: NEW_FPCMov<X86_COND_A>;
+defm NEW_CMOVNE : NEW_FPCMov<X86_COND_NE>;
+defm NEW_CMOVNP : NEW_FPCMov<X86_COND_NP>;
+}
+
// These are not factored because there's no clean way to pass DA/DB.
def CMOVB_F : FPI<0xC0, AddRegFrm, (outs RST:$op), (ins),
"fcmovb\t{$op, %st(0)|%ST(0), $op}">, DA;
Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.cpp?rev=42285&r1=42284&r2=42285&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86InstrInfo.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86InstrInfo.cpp Mon Sep 24 20:57:46 2007
@@ -21,6 +21,7 @@
#include "llvm/CodeGen/MachineInstrBuilder.h"
#include "llvm/CodeGen/LiveVariables.h"
#include "llvm/CodeGen/SSARegMap.h"
+#include "llvm/Target/TargetOptions.h"
using namespace llvm;
X86InstrInfo::X86InstrInfo(X86TargetMachine &tm)
@@ -385,28 +386,68 @@
case X86::JNP: return X86::COND_NP;
case X86::JO: return X86::COND_O;
case X86::JNO: return X86::COND_NO;
+ // TEMPORARY
+ case X86::NEW_JE: return X86::COND_E;
+ case X86::NEW_JNE: return X86::COND_NE;
+ case X86::NEW_JL: return X86::COND_L;
+ case X86::NEW_JLE: return X86::COND_LE;
+ case X86::NEW_JG: return X86::COND_G;
+ case X86::NEW_JGE: return X86::COND_GE;
+ case X86::NEW_JB: return X86::COND_B;
+ case X86::NEW_JBE: return X86::COND_BE;
+ case X86::NEW_JA: return X86::COND_A;
+ case X86::NEW_JAE: return X86::COND_AE;
+ case X86::NEW_JS: return X86::COND_S;
+ case X86::NEW_JNS: return X86::COND_NS;
+ case X86::NEW_JP: return X86::COND_P;
+ case X86::NEW_JNP: return X86::COND_NP;
+ case X86::NEW_JO: return X86::COND_O;
+ case X86::NEW_JNO: return X86::COND_NO;
+
}
}
unsigned X86::GetCondBranchFromCond(X86::CondCode CC) {
+ if (!NewCCModeling) {
+ switch (CC) {
+ default: assert(0 && "Illegal condition code!");
+ case X86::COND_E: return X86::JE;
+ case X86::COND_NE: return X86::JNE;
+ case X86::COND_L: return X86::JL;
+ case X86::COND_LE: return X86::JLE;
+ case X86::COND_G: return X86::JG;
+ case X86::COND_GE: return X86::JGE;
+ case X86::COND_B: return X86::JB;
+ case X86::COND_BE: return X86::JBE;
+ case X86::COND_A: return X86::JA;
+ case X86::COND_AE: return X86::JAE;
+ case X86::COND_S: return X86::JS;
+ case X86::COND_NS: return X86::JNS;
+ case X86::COND_P: return X86::JP;
+ case X86::COND_NP: return X86::JNP;
+ case X86::COND_O: return X86::JO;
+ case X86::COND_NO: return X86::JNO;
+ }
+ }
+
switch (CC) {
default: assert(0 && "Illegal condition code!");
- case X86::COND_E: return X86::JE;
- case X86::COND_NE: return X86::JNE;
- case X86::COND_L: return X86::JL;
- case X86::COND_LE: return X86::JLE;
- case X86::COND_G: return X86::JG;
- case X86::COND_GE: return X86::JGE;
- case X86::COND_B: return X86::JB;
- case X86::COND_BE: return X86::JBE;
- case X86::COND_A: return X86::JA;
- case X86::COND_AE: return X86::JAE;
- case X86::COND_S: return X86::JS;
- case X86::COND_NS: return X86::JNS;
- case X86::COND_P: return X86::JP;
- case X86::COND_NP: return X86::JNP;
- case X86::COND_O: return X86::JO;
- case X86::COND_NO: return X86::JNO;
+ case X86::COND_E: return X86::NEW_JE;
+ case X86::COND_NE: return X86::NEW_JNE;
+ case X86::COND_L: return X86::NEW_JL;
+ case X86::COND_LE: return X86::NEW_JLE;
+ case X86::COND_G: return X86::NEW_JG;
+ case X86::COND_GE: return X86::NEW_JGE;
+ case X86::COND_B: return X86::NEW_JB;
+ case X86::COND_BE: return X86::NEW_JBE;
+ case X86::COND_A: return X86::NEW_JA;
+ case X86::COND_AE: return X86::NEW_JAE;
+ case X86::COND_S: return X86::NEW_JS;
+ case X86::COND_NS: return X86::NEW_JNS;
+ case X86::COND_P: return X86::NEW_JP;
+ case X86::COND_NP: return X86::NEW_JNP;
+ case X86::COND_O: return X86::NEW_JO;
+ case X86::COND_NO: return X86::NEW_JNO;
}
}
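GetCondBranchFromCond now keeps two parallel opcode families alive, selected by the NewCCModeling flag, so the same condition code resolves to either the glue-based J* branch or its EFLAGS-consuming NEW_J* twin during the transition. A toy sketch of that dispatch, with illustrative enums standing in for the generated opcode tables:

#include <cassert>

enum CondCode { COND_E, COND_NE };
enum Opcode   { JE, JNE, NEW_JE, NEW_JNE };

// Stand-in for the command-line switch gating the new scheme.
static bool NewCCModeling = true;

Opcode getCondBranch(CondCode CC) {
  // Same condition, two opcode families while both schemes coexist.
  switch (CC) {
  case COND_E:  return NewCCModeling ? NEW_JE  : JE;
  case COND_NE: return NewCCModeling ? NEW_JNE : JNE;
  }
  return JE; // not reached
}

int main() {
  assert(getCondBranch(COND_NE) == NEW_JNE);
  NewCCModeling = false;
  assert(getCondBranch(COND_NE) == JNE);
}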
Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.td
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.td?rev=42285&r1=42284&r2=42285&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86InstrInfo.td (original)
+++ llvm/trunk/lib/Target/X86/X86InstrInfo.td Mon Sep 24 20:57:46 2007
@@ -26,12 +26,21 @@
def SDTX86Cmov : SDTypeProfile<1, 3,
[SDTCisSameAs<0, 1>, SDTCisSameAs<1, 2>,
SDTCisVT<3, i8>]>;
+def SDTX86Cmov_NEW : SDTypeProfile<1, 4,
+ [SDTCisSameAs<0, 1>, SDTCisSameAs<1, 2>,
+ SDTCisVT<3, i8>, SDTCisVT<4, i32>]>;
def SDTX86BrCond : SDTypeProfile<0, 2,
[SDTCisVT<0, OtherVT>, SDTCisVT<1, i8>]>;
+def SDTX86BrCond_NEW : SDTypeProfile<0, 3,
+ [SDTCisVT<0, OtherVT>,
+ SDTCisVT<1, i8>, SDTCisVT<2, i32>]>;
def SDTX86SetCC : SDTypeProfile<1, 1,
[SDTCisVT<0, i8>, SDTCisVT<1, i8>]>;
+def SDTX86SetCC_NEW : SDTypeProfile<1, 2,
+ [SDTCisVT<0, i8>,
+ SDTCisVT<1, i8>, SDTCisVT<2, i32>]>;
def SDTX86Ret : SDTypeProfile<0, 1, [SDTCisVT<0, i16>]>;
@@ -58,13 +67,18 @@
def X86cmp : SDNode<"X86ISD::CMP" , SDTX86CmpTest,
[SDNPHasChain, SDNPOutFlag]>;
+def X86cmp_new : SDNode<"X86ISD::CMP_NEW" , SDTX86CmpTest>;
-def X86cmov : SDNode<"X86ISD::CMOV", SDTX86Cmov,
+def X86cmov : SDNode<"X86ISD::CMOV", SDTX86Cmov,
[SDNPInFlag, SDNPOutFlag]>;
+def X86cmov_new: SDNode<"X86ISD::CMOV_NEW", SDTX86Cmov_NEW>;
def X86brcond : SDNode<"X86ISD::BRCOND", SDTX86BrCond,
[SDNPHasChain, SDNPInFlag]>;
+def X86brcond_new : SDNode<"X86ISD::BRCOND_NEW", SDTX86BrCond_NEW,
+ [SDNPHasChain]>;
def X86setcc : SDNode<"X86ISD::SETCC", SDTX86SetCC,
[SDNPInFlag, SDNPOutFlag]>;
+def X86setcc_new : SDNode<"X86ISD::SETCC_NEW", SDTX86SetCC_NEW>;
def X86retflag : SDNode<"X86ISD::RET_FLAG", SDTX86Ret,
[SDNPHasChain, SDNPOptInFlag]>;
@@ -301,6 +315,7 @@
}
// Conditional branches
+let Uses = [EFLAGS] in {
def JE : IBr<0x84, (ins brtarget:$dst), "je\t$dst",
[(X86brcond bb:$dst, X86_COND_E)]>, TB;
def JNE : IBr<0x85, (ins brtarget:$dst), "jne\t$dst",
@@ -335,6 +350,44 @@
[(X86brcond bb:$dst, X86_COND_O)]>, TB;
def JNO : IBr<0x81, (ins brtarget:$dst), "jno\t$dst",
[(X86brcond bb:$dst, X86_COND_NO)]>, TB;
+} // Uses = [EFLAGS]
+
+let Uses = [EFLAGS] in {
+def NEW_JE : IBr<0x84, (ins brtarget:$dst), "je\t$dst",
+ [(X86brcond_new bb:$dst, X86_COND_E, EFLAGS)]>, TB;
+def NEW_JNE : IBr<0x85, (ins brtarget:$dst), "jne\t$dst",
+ [(X86brcond_new bb:$dst, X86_COND_NE, EFLAGS)]>, TB;
+def NEW_JL : IBr<0x8C, (ins brtarget:$dst), "jl\t$dst",
+ [(X86brcond_new bb:$dst, X86_COND_L, EFLAGS)]>, TB;
+def NEW_JLE : IBr<0x8E, (ins brtarget:$dst), "jle\t$dst",
+ [(X86brcond_new bb:$dst, X86_COND_LE, EFLAGS)]>, TB;
+def NEW_JG : IBr<0x8F, (ins brtarget:$dst), "jg\t$dst",
+ [(X86brcond_new bb:$dst, X86_COND_G, EFLAGS)]>, TB;
+def NEW_JGE : IBr<0x8D, (ins brtarget:$dst), "jge\t$dst",
+ [(X86brcond_new bb:$dst, X86_COND_GE, EFLAGS)]>, TB;
+
+def NEW_JB : IBr<0x82, (ins brtarget:$dst), "jb\t$dst",
+ [(X86brcond_new bb:$dst, X86_COND_B, EFLAGS)]>, TB;
+def NEW_JBE : IBr<0x86, (ins brtarget:$dst), "jbe\t$dst",
+ [(X86brcond_new bb:$dst, X86_COND_BE, EFLAGS)]>, TB;
+def NEW_JA : IBr<0x87, (ins brtarget:$dst), "ja\t$dst",
+ [(X86brcond_new bb:$dst, X86_COND_A, EFLAGS)]>, TB;
+def NEW_JAE : IBr<0x83, (ins brtarget:$dst), "jae\t$dst",
+ [(X86brcond_new bb:$dst, X86_COND_AE, EFLAGS)]>, TB;
+
+def NEW_JS : IBr<0x88, (ins brtarget:$dst), "js\t$dst",
+ [(X86brcond_new bb:$dst, X86_COND_S, EFLAGS)]>, TB;
+def NEW_JNS : IBr<0x89, (ins brtarget:$dst), "jns\t$dst",
+ [(X86brcond_new bb:$dst, X86_COND_NS, EFLAGS)]>, TB;
+def NEW_JP : IBr<0x8A, (ins brtarget:$dst), "jp\t$dst",
+ [(X86brcond_new bb:$dst, X86_COND_P, EFLAGS)]>, TB;
+def NEW_JNP : IBr<0x8B, (ins brtarget:$dst), "jnp\t$dst",
+ [(X86brcond_new bb:$dst, X86_COND_NP, EFLAGS)]>, TB;
+def NEW_JO : IBr<0x80, (ins brtarget:$dst), "jo\t$dst",
+ [(X86brcond_new bb:$dst, X86_COND_O, EFLAGS)]>, TB;
+def NEW_JNO : IBr<0x81, (ins brtarget:$dst), "jno\t$dst",
+ [(X86brcond_new bb:$dst, X86_COND_NO, EFLAGS)]>, TB;
+} // Uses = [EFLAGS]
//===----------------------------------------------------------------------===//
// Call Instructions...
@@ -343,7 +396,7 @@
// All calls clobber the non-callee saved registers...
let Defs = [EAX, ECX, EDX, FP0, FP1, FP2, FP3, FP4, FP5, FP6, ST0,
MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
- XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7] in {
+ XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, EFLAGS] in {
def CALLpcrel32 : I<0xE8, RawFrm, (outs), (ins i32imm:$dst, variable_ops),
"call\t${dst:call}", []>;
def CALL32r : I<0xFF, MRM2r, (outs), (ins GR32:$dst, variable_ops),
@@ -640,6 +693,7 @@
let isTwoAddress = 1 in {
// Conditional moves
+let Uses = [EFLAGS] in {
def CMOVB16rr : I<0x42, MRMSrcReg,       // if <u, GR16 = GR16
+def NEW_CMOVB16rr : I<0x42, MRMSrcReg,  // if <u, GR16 = GR16
+ (outs GR16:$dst), (ins GR16:$src1, GR16:$src2), "cmovb\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, GR16:$src2, X86_COND_B, EFLAGS))]>, TB, OpSize;
+def NEW_CMOVB16rm : I<0x42, MRMSrcMem,  // if <u, GR16 = [mem16]
+ (outs GR16:$dst), (ins GR16:$src1, i16mem:$src2), "cmovb\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, (loadi16 addr:$src2), X86_COND_B, EFLAGS))]>, TB, OpSize;
+def NEW_CMOVB32rr : I<0x42, MRMSrcReg,  // if <u, GR32 = GR32
+ (outs GR32:$dst), (ins GR32:$src1, GR32:$src2), "cmovb\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, GR32:$src2, X86_COND_B, EFLAGS))]>, TB;
+def NEW_CMOVB32rm : I<0x42, MRMSrcMem,  // if <u, GR32 = [mem32]
+ (outs GR32:$dst), (ins GR32:$src1, i32mem:$src2), "cmovb\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, (loadi32 addr:$src2), X86_COND_B, EFLAGS))]>, TB;
+
+def NEW_CMOVAE16rr: I<0x43, MRMSrcReg, // if >=u, GR16 = GR16
+ (outs GR16:$dst), (ins GR16:$src1, GR16:$src2),
+ "cmovae\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, GR16:$src2,
+ X86_COND_AE, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVAE16rm: I<0x43, MRMSrcMem, // if >=u, GR16 = [mem16]
+ (outs GR16:$dst), (ins GR16:$src1, i16mem:$src2),
+ "cmovae\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, (loadi16 addr:$src2),
+ X86_COND_AE, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVAE32rr: I<0x43, MRMSrcReg, // if >=u, GR32 = GR32
+ (outs GR32:$dst), (ins GR32:$src1, GR32:$src2),
+ "cmovae\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, GR32:$src2,
+ X86_COND_AE, EFLAGS))]>,
+ TB;
+def NEW_CMOVAE32rm: I<0x43, MRMSrcMem, // if >=u, GR32 = [mem32]
+ (outs GR32:$dst), (ins GR32:$src1, i32mem:$src2),
+ "cmovae\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, (loadi32 addr:$src2),
+ X86_COND_AE, EFLAGS))]>,
+ TB;
+
+def NEW_CMOVE16rr : I<0x44, MRMSrcReg, // if ==, GR16 = GR16
+ (outs GR16:$dst), (ins GR16:$src1, GR16:$src2),
+ "cmove\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, GR16:$src2,
+ X86_COND_E, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVE16rm : I<0x44, MRMSrcMem, // if ==, GR16 = [mem16]
+ (outs GR16:$dst), (ins GR16:$src1, i16mem:$src2),
+ "cmove\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, (loadi16 addr:$src2),
+ X86_COND_E, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVE32rr : I<0x44, MRMSrcReg, // if ==, GR32 = GR32
+ (outs GR32:$dst), (ins GR32:$src1, GR32:$src2),
+ "cmove\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, GR32:$src2,
+ X86_COND_E, EFLAGS))]>,
+ TB;
+def NEW_CMOVE32rm : I<0x44, MRMSrcMem, // if ==, GR32 = [mem32]
+ (outs GR32:$dst), (ins GR32:$src1, i32mem:$src2),
+ "cmove\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, (loadi32 addr:$src2),
+ X86_COND_E, EFLAGS))]>,
+ TB;
+
+def NEW_CMOVNE16rr: I<0x45, MRMSrcReg, // if !=, GR16 = GR16
+ (outs GR16:$dst), (ins GR16:$src1, GR16:$src2),
+ "cmovne\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, GR16:$src2,
+ X86_COND_NE, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVNE16rm: I<0x45, MRMSrcMem, // if !=, GR16 = [mem16]
+ (outs GR16:$dst), (ins GR16:$src1, i16mem:$src2),
+ "cmovne\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, (loadi16 addr:$src2),
+ X86_COND_NE, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVNE32rr: I<0x45, MRMSrcReg, // if !=, GR32 = GR32
+ (outs GR32:$dst), (ins GR32:$src1, GR32:$src2),
+ "cmovne\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, GR32:$src2,
+ X86_COND_NE, EFLAGS))]>,
+ TB;
+def NEW_CMOVNE32rm: I<0x45, MRMSrcMem, // if !=, GR32 = [mem32]
+ (outs GR32:$dst), (ins GR32:$src1, i32mem:$src2),
+ "cmovne\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, (loadi32 addr:$src2),
+ X86_COND_NE, EFLAGS))]>,
+ TB;
+
+def NEW_CMOVBE16rr: I<0x46, MRMSrcReg, // if <=u, GR16 = GR16
+ (outs GR16:$dst), (ins GR16:$src1, GR16:$src2),
+ "cmovbe\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, GR16:$src2,
+ X86_COND_BE, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVBE16rm: I<0x46, MRMSrcMem, // if <=u, GR16 = [mem16]
+ (outs GR16:$dst), (ins GR16:$src1, i16mem:$src2),
+ "cmovbe\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, (loadi16 addr:$src2),
+ X86_COND_BE, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVBE32rr: I<0x46, MRMSrcReg, // if <=u, GR32 = GR32
+ (outs GR32:$dst), (ins GR32:$src1, GR32:$src2),
+ "cmovbe\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, GR32:$src2,
+ X86_COND_BE, EFLAGS))]>,
+ TB;
+def NEW_CMOVBE32rm: I<0x46, MRMSrcMem, // if <=u, GR32 = [mem32]
+ (outs GR32:$dst), (ins GR32:$src1, i32mem:$src2),
+ "cmovbe\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, (loadi32 addr:$src2),
+ X86_COND_BE, EFLAGS))]>,
+ TB;
+
+def NEW_CMOVA16rr : I<0x47, MRMSrcReg, // if >u, GR16 = GR16
+ (outs GR16:$dst), (ins GR16:$src1, GR16:$src2),
+ "cmova\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, GR16:$src2,
+ X86_COND_A, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVA16rm : I<0x47, MRMSrcMem, // if >u, GR16 = [mem16]
+ (outs GR16:$dst), (ins GR16:$src1, i16mem:$src2),
+ "cmova\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, (loadi16 addr:$src2),
+ X86_COND_A, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVA32rr : I<0x47, MRMSrcReg, // if >u, GR32 = GR32
+ (outs GR32:$dst), (ins GR32:$src1, GR32:$src2),
+ "cmova\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, GR32:$src2,
+ X86_COND_A, EFLAGS))]>,
+ TB;
+def NEW_CMOVA32rm : I<0x47, MRMSrcMem, // if >u, GR32 = [mem32]
+ (outs GR32:$dst), (ins GR32:$src1, i32mem:$src2),
+ "cmova\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, (loadi32 addr:$src2),
+ X86_COND_A, EFLAGS))]>,
+ TB;
+
+def NEW_CMOVL16rr : I<0x4C, MRMSrcReg,  // if <s, GR16 = GR16
+ (outs GR16:$dst), (ins GR16:$src1, GR16:$src2), "cmovl\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, GR16:$src2, X86_COND_L, EFLAGS))]>, TB, OpSize;
+def NEW_CMOVL16rm : I<0x4C, MRMSrcMem,  // if <s, GR16 = [mem16]
+ (outs GR16:$dst), (ins GR16:$src1, i16mem:$src2), "cmovl\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, (loadi16 addr:$src2), X86_COND_L, EFLAGS))]>, TB, OpSize;
+def NEW_CMOVL32rr : I<0x4C, MRMSrcReg,  // if <s, GR32 = GR32
+ (outs GR32:$dst), (ins GR32:$src1, GR32:$src2), "cmovl\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, GR32:$src2, X86_COND_L, EFLAGS))]>, TB;
+def NEW_CMOVL32rm : I<0x4C, MRMSrcMem,  // if <s, GR32 = [mem32]
+ (outs GR32:$dst), (ins GR32:$src1, i32mem:$src2), "cmovl\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, (loadi32 addr:$src2), X86_COND_L, EFLAGS))]>, TB;
+
+def NEW_CMOVGE16rr: I<0x4D, MRMSrcReg, // if >=s, GR16 = GR16
+ (outs GR16:$dst), (ins GR16:$src1, GR16:$src2),
+ "cmovge\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, GR16:$src2,
+ X86_COND_GE, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVGE16rm: I<0x4D, MRMSrcMem, // if >=s, GR16 = [mem16]
+ (outs GR16:$dst), (ins GR16:$src1, i16mem:$src2),
+ "cmovge\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, (loadi16 addr:$src2),
+ X86_COND_GE, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVGE32rr: I<0x4D, MRMSrcReg, // if >=s, GR32 = GR32
+ (outs GR32:$dst), (ins GR32:$src1, GR32:$src2),
+ "cmovge\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, GR32:$src2,
+ X86_COND_GE, EFLAGS))]>,
+ TB;
+def NEW_CMOVGE32rm: I<0x4D, MRMSrcMem, // if >=s, GR32 = [mem32]
+ (outs GR32:$dst), (ins GR32:$src1, i32mem:$src2),
+ "cmovge\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, (loadi32 addr:$src2),
+ X86_COND_GE, EFLAGS))]>,
+ TB;
+
+def NEW_CMOVLE16rr: I<0x4E, MRMSrcReg, // if <=s, GR16 = GR16
+ (outs GR16:$dst), (ins GR16:$src1, GR16:$src2),
+ "cmovle\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, GR16:$src2,
+ X86_COND_LE, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVLE16rm: I<0x4E, MRMSrcMem, // if <=s, GR16 = [mem16]
+ (outs GR16:$dst), (ins GR16:$src1, i16mem:$src2),
+ "cmovle\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, (loadi16 addr:$src2),
+ X86_COND_LE, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVLE32rr: I<0x4E, MRMSrcReg, // if <=s, GR32 = GR32
+ (outs GR32:$dst), (ins GR32:$src1, GR32:$src2),
+ "cmovle\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, GR32:$src2,
+ X86_COND_LE, EFLAGS))]>,
+ TB;
+def NEW_CMOVLE32rm: I<0x4E, MRMSrcMem, // if <=s, GR32 = [mem32]
+ (outs GR32:$dst), (ins GR32:$src1, i32mem:$src2),
+ "cmovle\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, (loadi32 addr:$src2),
+ X86_COND_LE, EFLAGS))]>,
+ TB;
+
+def NEW_CMOVG16rr : I<0x4F, MRMSrcReg, // if >s, GR16 = GR16
+ (outs GR16:$dst), (ins GR16:$src1, GR16:$src2),
+ "cmovg\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, GR16:$src2,
+ X86_COND_G, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVG16rm : I<0x4F, MRMSrcMem, // if >s, GR16 = [mem16]
+ (outs GR16:$dst), (ins GR16:$src1, i16mem:$src2),
+ "cmovg\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, (loadi16 addr:$src2),
+ X86_COND_G, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVG32rr : I<0x4F, MRMSrcReg, // if >s, GR32 = GR32
+ (outs GR32:$dst), (ins GR32:$src1, GR32:$src2),
+ "cmovg\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, GR32:$src2,
+ X86_COND_G, EFLAGS))]>,
+ TB;
+def NEW_CMOVG32rm : I<0x4F, MRMSrcMem, // if >s, GR32 = [mem32]
+ (outs GR32:$dst), (ins GR32:$src1, i32mem:$src2),
+ "cmovg\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, (loadi32 addr:$src2),
+ X86_COND_G, EFLAGS))]>,
+ TB;
+
+def NEW_CMOVS16rr : I<0x48, MRMSrcReg, // if signed, GR16 = GR16
+ (outs GR16:$dst), (ins GR16:$src1, GR16:$src2),
+ "cmovs\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, GR16:$src2,
+ X86_COND_S, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVS16rm : I<0x48, MRMSrcMem, // if signed, GR16 = [mem16]
+ (outs GR16:$dst), (ins GR16:$src1, i16mem:$src2),
+ "cmovs\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, (loadi16 addr:$src2),
+ X86_COND_S, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVS32rr : I<0x48, MRMSrcReg, // if signed, GR32 = GR32
+ (outs GR32:$dst), (ins GR32:$src1, GR32:$src2),
+ "cmovs\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, GR32:$src2,
+ X86_COND_S, EFLAGS))]>,
+ TB;
+def NEW_CMOVS32rm : I<0x48, MRMSrcMem, // if signed, GR32 = [mem32]
+ (outs GR32:$dst), (ins GR32:$src1, i32mem:$src2),
+ "cmovs\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, (loadi32 addr:$src2),
+ X86_COND_S, EFLAGS))]>,
+ TB;
+
+def NEW_CMOVNS16rr: I<0x49, MRMSrcReg, // if !signed, GR16 = GR16
+ (outs GR16:$dst), (ins GR16:$src1, GR16:$src2),
+ "cmovns\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, GR16:$src2,
+ X86_COND_NS, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVNS16rm: I<0x49, MRMSrcMem, // if !signed, GR16 = [mem16]
+ (outs GR16:$dst), (ins GR16:$src1, i16mem:$src2),
+ "cmovns\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, (loadi16 addr:$src2),
+ X86_COND_NS, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVNS32rr: I<0x49, MRMSrcReg, // if !signed, GR32 = GR32
+ (outs GR32:$dst), (ins GR32:$src1, GR32:$src2),
+ "cmovns\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, GR32:$src2,
+ X86_COND_NS, EFLAGS))]>,
+ TB;
+def NEW_CMOVNS32rm: I<0x49, MRMSrcMem, // if !signed, GR32 = [mem32]
+ (outs GR32:$dst), (ins GR32:$src1, i32mem:$src2),
+ "cmovns\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, (loadi32 addr:$src2),
+ X86_COND_NS, EFLAGS))]>,
+ TB;
+
+def NEW_CMOVP16rr : I<0x4A, MRMSrcReg, // if parity, GR16 = GR16
+ (outs GR16:$dst), (ins GR16:$src1, GR16:$src2),
+ "cmovp\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, GR16:$src2,
+ X86_COND_P, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVP16rm : I<0x4A, MRMSrcMem, // if parity, GR16 = [mem16]
+ (outs GR16:$dst), (ins GR16:$src1, i16mem:$src2),
+ "cmovp\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, (loadi16 addr:$src2),
+ X86_COND_P, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVP32rr : I<0x4A, MRMSrcReg, // if parity, GR32 = GR32
+ (outs GR32:$dst), (ins GR32:$src1, GR32:$src2),
+ "cmovp\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, GR32:$src2,
+ X86_COND_P, EFLAGS))]>,
+ TB;
+def NEW_CMOVP32rm : I<0x4A, MRMSrcMem, // if parity, GR32 = [mem32]
+ (outs GR32:$dst), (ins GR32:$src1, i32mem:$src2),
+ "cmovp\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, (loadi32 addr:$src2),
+ X86_COND_P, EFLAGS))]>,
+ TB;
+
+def NEW_CMOVNP16rr : I<0x4B, MRMSrcReg, // if !parity, GR16 = GR16
+ (outs GR16:$dst), (ins GR16:$src1, GR16:$src2),
+ "cmovnp\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, GR16:$src2,
+ X86_COND_NP, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVNP16rm : I<0x4B, MRMSrcMem, // if !parity, GR16 = [mem16]
+ (outs GR16:$dst), (ins GR16:$src1, i16mem:$src2),
+ "cmovnp\t{$src2, $dst|$dst, $src2}",
+ [(set GR16:$dst, (X86cmov_new GR16:$src1, (loadi16 addr:$src2),
+ X86_COND_NP, EFLAGS))]>,
+ TB, OpSize;
+def NEW_CMOVNP32rr : I<0x4B, MRMSrcReg, // if !parity, GR32 = GR32
+ (outs GR32:$dst), (ins GR32:$src1, GR32:$src2),
+ "cmovnp\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, GR32:$src2,
+ X86_COND_NP, EFLAGS))]>,
+ TB;
+def NEW_CMOVNP32rm : I<0x4B, MRMSrcMem, // if !parity, GR32 = [mem32]
+ (outs GR32:$dst), (ins GR32:$src1, i32mem:$src2),
+ "cmovnp\t{$src2, $dst|$dst, $src2}",
+ [(set GR32:$dst, (X86cmov_new GR32:$src1, (loadi32 addr:$src2),
+ X86_COND_NP, EFLAGS))]>,
+ TB;
+} // Uses = [EFLAGS]
+
+
// unary instructions
let CodeSize = 2 in {
let Defs = [EFLAGS] in {
@@ -2028,7 +2434,7 @@
//===----------------------------------------------------------------------===//
// Test instructions are just like AND, except they don't generate a result.
//
- let Defs = [EFLAGS] in {
+let Defs = [EFLAGS] in {
let isCommutable = 1 in { // TEST X, Y --> TEST Y, X
def TEST8rr : I<0x84, MRMDestReg, (outs), (ins GR8:$src1, GR8:$src2),
"test{b}\t{$src2, $src1|$src1, $src2}",
@@ -2081,12 +2487,77 @@
} // Defs = [EFLAGS]
+let Defs = [EFLAGS] in {
+let isCommutable = 1 in { // TEST X, Y --> TEST Y, X
+def NEW_TEST8rr : I<0x84, MRMDestReg, (outs), (ins GR8:$src1, GR8:$src2),
+ "test{b}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (and GR8:$src1, GR8:$src2), 0),
+ (implicit EFLAGS)]>;
+def NEW_TEST16rr : I<0x85, MRMDestReg, (outs), (ins GR16:$src1, GR16:$src2),
+ "test{w}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (and GR16:$src1, GR16:$src2), 0),
+ (implicit EFLAGS)]>,
+ OpSize;
+def NEW_TEST32rr : I<0x85, MRMDestReg, (outs), (ins GR32:$src1, GR32:$src2),
+ "test{l}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (and GR32:$src1, GR32:$src2), 0),
+ (implicit EFLAGS)]>;
+}
+
+def NEW_TEST8rm : I<0x84, MRMSrcMem, (outs), (ins GR8 :$src1, i8mem :$src2),
+ "test{b}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (and GR8:$src1, (loadi8 addr:$src2)), 0),
+ (implicit EFLAGS)]>;
+def NEW_TEST16rm : I<0x85, MRMSrcMem, (outs), (ins GR16:$src1, i16mem:$src2),
+ "test{w}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (and GR16:$src1, (loadi16 addr:$src2)), 0),
+ (implicit EFLAGS)]>, OpSize;
+def NEW_TEST32rm : I<0x85, MRMSrcMem, (outs), (ins GR32:$src1, i32mem:$src2),
+ "test{l}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (and GR32:$src1, (loadi32 addr:$src2)), 0),
+ (implicit EFLAGS)]>;
+
+def NEW_TEST8ri : Ii8 <0xF6, MRM0r, // flags = GR8 & imm8
+ (outs), (ins GR8:$src1, i8imm:$src2),
+ "test{b}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (and GR8:$src1, imm:$src2), 0),
+ (implicit EFLAGS)]>;
+def NEW_TEST16ri : Ii16<0xF7, MRM0r, // flags = GR16 & imm16
+ (outs), (ins GR16:$src1, i16imm:$src2),
+ "test{w}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (and GR16:$src1, imm:$src2), 0),
+ (implicit EFLAGS)]>, OpSize;
+def NEW_TEST32ri : Ii32<0xF7, MRM0r, // flags = GR32 & imm32
+ (outs), (ins GR32:$src1, i32imm:$src2),
+ "test{l}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (and GR32:$src1, imm:$src2), 0),
+ (implicit EFLAGS)]>;
+
+def NEW_TEST8mi : Ii8 <0xF6, MRM0m, // flags = [mem8] & imm8
+ (outs), (ins i8mem:$src1, i8imm:$src2),
+ "test{b}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (and (loadi8 addr:$src1), imm:$src2), 0),
+ (implicit EFLAGS)]>;
+def NEW_TEST16mi : Ii16<0xF7, MRM0m, // flags = [mem16] & imm16
+ (outs), (ins i16mem:$src1, i16imm:$src2),
+ "test{w}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (and (loadi16 addr:$src1), imm:$src2), 0),
+ (implicit EFLAGS)]>, OpSize;
+def NEW_TEST32mi : Ii32<0xF7, MRM0m, // flags = [mem32] & imm32
+ (outs), (ins i32mem:$src1, i32imm:$src2),
+ "test{l}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (and (loadi32 addr:$src1), imm:$src2), 0),
+ (implicit EFLAGS)]>;
+} // Defs = [EFLAGS]
+
+
// Condition code ops, incl. set if equal/not equal/...
let Defs = [EFLAGS], Uses = [AH] in
def SAHF : I<0x9E, RawFrm, (outs), (ins), "sahf", []>; // flags = AH
let Defs = [AH], Uses = [EFLAGS] in
def LAHF : I<0x9F, RawFrm, (outs), (ins), "lahf", []>; // AH = flags
+let Uses = [EFLAGS] in {
def SETEr : I<0x94, MRM0r,
(outs GR8 :$dst), (ins),
"sete\t$dst",
@@ -2229,6 +2700,155 @@
"setnp\t$dst",
[(store (X86setcc X86_COND_NP), addr:$dst)]>,
TB; // [mem8] = not parity
+} // Uses = [EFLAGS]
+
+let Uses = [EFLAGS] in {
+def NEW_SETEr : I<0x94, MRM0r,
+ (outs GR8 :$dst), (ins),
+ "sete\t$dst",
+ [(set GR8:$dst, (X86setcc_new X86_COND_E, EFLAGS))]>,
+ TB; // GR8 = ==
+def NEW_SETEm : I<0x94, MRM0m,
+ (outs), (ins i8mem:$dst),
+ "sete\t$dst",
+ [(store (X86setcc_new X86_COND_E, EFLAGS), addr:$dst)]>,
+ TB; // [mem8] = ==
+def NEW_SETNEr : I<0x95, MRM0r,
+ (outs GR8 :$dst), (ins),
+ "setne\t$dst",
+ [(set GR8:$dst, (X86setcc_new X86_COND_NE, EFLAGS))]>,
+ TB; // GR8 = !=
+def NEW_SETNEm : I<0x95, MRM0m,
+ (outs), (ins i8mem:$dst),
+ "setne\t$dst",
+ [(store (X86setcc_new X86_COND_NE, EFLAGS), addr:$dst)]>,
+ TB; // [mem8] = !=
+def NEW_SETLr : I<0x9C, MRM0r,
+ (outs GR8 :$dst), (ins),
+ "setl\t$dst",
+ [(set GR8:$dst, (X86setcc_new X86_COND_L, EFLAGS))]>,
+ TB; // GR8 = < signed
+def NEW_SETLm : I<0x9C, MRM0m,
+ (outs), (ins i8mem:$dst),
+ "setl\t$dst",
+ [(store (X86setcc_new X86_COND_L, EFLAGS), addr:$dst)]>,
+ TB; // [mem8] = < signed
+def NEW_SETGEr : I<0x9D, MRM0r,
+ (outs GR8 :$dst), (ins),
+ "setge\t$dst",
+ [(set GR8:$dst, (X86setcc_new X86_COND_GE, EFLAGS))]>,
+ TB; // GR8 = >= signed
+def NEW_SETGEm : I<0x9D, MRM0m,
+ (outs), (ins i8mem:$dst),
+ "setge\t$dst",
+ [(store (X86setcc_new X86_COND_GE, EFLAGS), addr:$dst)]>,
+ TB; // [mem8] = >= signed
+def NEW_SETLEr : I<0x9E, MRM0r,
+ (outs GR8 :$dst), (ins),
+ "setle\t$dst",
+ [(set GR8:$dst, (X86setcc_new X86_COND_LE, EFLAGS))]>,
+ TB; // GR8 = <= signed
+def NEW_SETLEm : I<0x9E, MRM0m,
+ (outs), (ins i8mem:$dst),
+ "setle\t$dst",
+ [(store (X86setcc_new X86_COND_LE, EFLAGS), addr:$dst)]>,
+ TB; // [mem8] = <= signed
+def NEW_SETGr : I<0x9F, MRM0r,
+ (outs GR8 :$dst), (ins),
+ "setg\t$dst",
+ [(set GR8:$dst, (X86setcc_new X86_COND_G, EFLAGS))]>,
+ TB; // GR8 = > signed
+def NEW_SETGm : I<0x9F, MRM0m,
+ (outs), (ins i8mem:$dst),
+ "setg\t$dst",
+ [(store (X86setcc_new X86_COND_G, EFLAGS), addr:$dst)]>,
+ TB; // [mem8] = > signed
+
+def NEW_SETBr : I<0x92, MRM0r,
+ (outs GR8 :$dst), (ins),
+ "setb\t$dst",
+ [(set GR8:$dst, (X86setcc_new X86_COND_B, EFLAGS))]>,
+ TB; // GR8 = < unsign
+def NEW_SETBm : I<0x92, MRM0m,
+ (outs), (ins i8mem:$dst),
+ "setb\t$dst",
+ [(store (X86setcc_new X86_COND_B, EFLAGS), addr:$dst)]>,
+ TB; // [mem8] = < unsign
+def NEW_SETAEr : I<0x93, MRM0r,
+ (outs GR8 :$dst), (ins),
+ "setae\t$dst",
+ [(set GR8:$dst, (X86setcc_new X86_COND_AE, EFLAGS))]>,
+ TB; // GR8 = >= unsign
+def NEW_SETAEm : I<0x93, MRM0m,
+ (outs), (ins i8mem:$dst),
+ "setae\t$dst",
+ [(store (X86setcc_new X86_COND_AE, EFLAGS), addr:$dst)]>,
+ TB; // [mem8] = >= unsign
+def NEW_SETBEr : I<0x96, MRM0r,
+ (outs GR8 :$dst), (ins),
+ "setbe\t$dst",
+ [(set GR8:$dst, (X86setcc_new X86_COND_BE, EFLAGS))]>,
+ TB; // GR8 = <= unsign
+def NEW_SETBEm : I<0x96, MRM0m,
+ (outs), (ins i8mem:$dst),
+ "setbe\t$dst",
+ [(store (X86setcc_new X86_COND_BE, EFLAGS), addr:$dst)]>,
+ TB; // [mem8] = <= unsign
+def NEW_SETAr : I<0x97, MRM0r,
+ (outs GR8 :$dst), (ins),
+ "seta\t$dst",
+ [(set GR8:$dst, (X86setcc_new X86_COND_A, EFLAGS))]>,
+ TB; // GR8 = > unsign
+def NEW_SETAm : I<0x97, MRM0m,
+ (outs), (ins i8mem:$dst),
+ "seta\t$dst",
+ [(store (X86setcc_new X86_COND_A, EFLAGS), addr:$dst)]>,
+ TB; // [mem8] = > unsign
+
+def NEW_SETSr : I<0x98, MRM0r,
+ (outs GR8 :$dst), (ins),
+ "sets\t$dst",
+ [(set GR8:$dst, (X86setcc_new X86_COND_S, EFLAGS))]>,
+ TB; // GR8 = <sign bit>
+def NEW_SETSm : I<0x98, MRM0m,
+ (outs), (ins i8mem:$dst),
+ "sets\t$dst",
+ [(store (X86setcc_new X86_COND_S, EFLAGS), addr:$dst)]>,
+ TB; // [mem8] = <sign bit>
+def NEW_SETNSr : I<0x99, MRM0r,
+ (outs GR8 :$dst), (ins),
+ "setns\t$dst",
+ [(set GR8:$dst, (X86setcc_new X86_COND_NS, EFLAGS))]>,
+ TB; // GR8 = !<sign bit>
+def NEW_SETNSm : I<0x99, MRM0m,
+ (outs), (ins i8mem:$dst),
+ "setns\t$dst",
+ [(store (X86setcc_new X86_COND_NS, EFLAGS), addr:$dst)]>,
+ TB; // [mem8] = !<sign bit>
+def NEW_SETPr : I<0x9A, MRM0r,
+ (outs GR8 :$dst), (ins),
+ "setp\t$dst",
+ [(set GR8:$dst, (X86setcc_new X86_COND_P, EFLAGS))]>,
+ TB; // GR8 = parity
+def NEW_SETPm : I<0x9A, MRM0m,
+ (outs), (ins i8mem:$dst),
+ "setp\t$dst",
+ [(store (X86setcc_new X86_COND_P, EFLAGS), addr:$dst)]>,
+ TB; // [mem8] = parity
+def NEW_SETNPr : I<0x9B, MRM0r,
+ (outs GR8 :$dst), (ins),
+ "setnp\t$dst",
+ [(set GR8:$dst, (X86setcc_new X86_COND_NP, EFLAGS))]>,
+ TB; // GR8 = not parity
+def NEW_SETNPm : I<0x9B, MRM0m,
+ (outs), (ins i8mem:$dst),
+ "setnp\t$dst",
+ [(store (X86setcc_new X86_COND_NP, EFLAGS), addr:$dst)]>,
+ TB; // [mem8] = not parity
+} // Uses = [EFLAGS]
+
+
+//def : Pat<(X86setcc_new X86_COND_E, EFLAGS), (SETEr)>;
// Integer comparisons
let Defs = [EFLAGS] in {
@@ -2310,6 +2930,99 @@
[(X86cmp GR32:$src1, i32immSExt8:$src2)]>;
} // Defs = [EFLAGS]
+let Defs = [EFLAGS] in {
+def NEW_CMP8rr : I<0x38, MRMDestReg,
+ (outs), (ins GR8 :$src1, GR8 :$src2),
+ "cmp{b}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new GR8:$src1, GR8:$src2), (implicit EFLAGS)]>;
+def NEW_CMP16rr : I<0x39, MRMDestReg,
+ (outs), (ins GR16:$src1, GR16:$src2),
+ "cmp{w}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new GR16:$src1, GR16:$src2), (implicit EFLAGS)]>, OpSize;
+def NEW_CMP32rr : I<0x39, MRMDestReg,
+ (outs), (ins GR32:$src1, GR32:$src2),
+ "cmp{l}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new GR32:$src1, GR32:$src2), (implicit EFLAGS)]>;
+def NEW_CMP8mr : I<0x38, MRMDestMem,
+ (outs), (ins i8mem :$src1, GR8 :$src2),
+ "cmp{b}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (loadi8 addr:$src1), GR8:$src2),
+ (implicit EFLAGS)]>;
+def NEW_CMP16mr : I<0x39, MRMDestMem,
+ (outs), (ins i16mem:$src1, GR16:$src2),
+ "cmp{w}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (loadi16 addr:$src1), GR16:$src2),
+ (implicit EFLAGS)]>, OpSize;
+def NEW_CMP32mr : I<0x39, MRMDestMem,
+ (outs), (ins i32mem:$src1, GR32:$src2),
+ "cmp{l}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (loadi32 addr:$src1), GR32:$src2),
+ (implicit EFLAGS)]>;
+def NEW_CMP8rm : I<0x3A, MRMSrcMem,
+ (outs), (ins GR8 :$src1, i8mem :$src2),
+ "cmp{b}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new GR8:$src1, (loadi8 addr:$src2)),
+ (implicit EFLAGS)]>;
+def NEW_CMP16rm : I<0x3B, MRMSrcMem,
+ (outs), (ins GR16:$src1, i16mem:$src2),
+ "cmp{w}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new GR16:$src1, (loadi16 addr:$src2)),
+ (implicit EFLAGS)]>, OpSize;
+def NEW_CMP32rm : I<0x3B, MRMSrcMem,
+ (outs), (ins GR32:$src1, i32mem:$src2),
+ "cmp{l}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new GR32:$src1, (loadi32 addr:$src2)),
+ (implicit EFLAGS)]>;
+def NEW_CMP8ri : Ii8<0x80, MRM7r,
+ (outs), (ins GR8:$src1, i8imm:$src2),
+ "cmp{b}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new GR8:$src1, imm:$src2), (implicit EFLAGS)]>;
+def NEW_CMP16ri : Ii16<0x81, MRM7r,
+ (outs), (ins GR16:$src1, i16imm:$src2),
+ "cmp{w}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new GR16:$src1, imm:$src2),
+ (implicit EFLAGS)]>, OpSize;
+def NEW_CMP32ri : Ii32<0x81, MRM7r,
+ (outs), (ins GR32:$src1, i32imm:$src2),
+ "cmp{l}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new GR32:$src1, imm:$src2), (implicit EFLAGS)]>;
+def NEW_CMP8mi : Ii8 <0x80, MRM7m,
+ (outs), (ins i8mem :$src1, i8imm :$src2),
+ "cmp{b}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (loadi8 addr:$src1), imm:$src2),
+ (implicit EFLAGS)]>;
+def NEW_CMP16mi : Ii16<0x81, MRM7m,
+ (outs), (ins i16mem:$src1, i16imm:$src2),
+ "cmp{w}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (loadi16 addr:$src1), imm:$src2),
+ (implicit EFLAGS)]>, OpSize;
+def NEW_CMP32mi : Ii32<0x81, MRM7m,
+ (outs), (ins i32mem:$src1, i32imm:$src2),
+ "cmp{l}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (loadi32 addr:$src1), imm:$src2),
+ (implicit EFLAGS)]>;
+def NEW_CMP16ri8 : Ii8<0x83, MRM7r,
+ (outs), (ins GR16:$src1, i16i8imm:$src2),
+ "cmp{w}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new GR16:$src1, i16immSExt8:$src2),
+ (implicit EFLAGS)]>, OpSize;
+def NEW_CMP16mi8 : Ii8<0x83, MRM7m,
+ (outs), (ins i16mem:$src1, i16i8imm:$src2),
+ "cmp{w}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (loadi16 addr:$src1), i16immSExt8:$src2),
+ (implicit EFLAGS)]>, OpSize;
+def NEW_CMP32mi8 : Ii8<0x83, MRM7m,
+ (outs), (ins i32mem:$src1, i32i8imm:$src2),
+ "cmp{l}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (loadi32 addr:$src1), i32immSExt8:$src2),
+ (implicit EFLAGS)]>;
+def NEW_CMP32ri8 : Ii8<0x83, MRM7r,
+ (outs), (ins GR32:$src1, i32i8imm:$src2),
+ "cmp{l}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new GR32:$src1, i32immSExt8:$src2),
+ (implicit EFLAGS)]>;
+} // Defs = [EFLAGS]
+
// Sign/Zero extenders
def MOVSX16rr8 : I<0xBE, MRMSrcReg, (outs GR16:$dst), (ins GR8 :$src),
"movs{bw|x}\t{$src, $dst|$dst, $src}",
@@ -2522,6 +3235,13 @@
def : Pat<(X86cmp GR32:$src1, 0),
(TEST32rr GR32:$src1, GR32:$src1)>;
+def : Pat<(parallel (X86cmp_new GR8:$src1, 0), (implicit EFLAGS)),
+ (NEW_TEST8rr GR8:$src1, GR8:$src1)>;
+def : Pat<(parallel (X86cmp_new GR16:$src1, 0), (implicit EFLAGS)),
+ (NEW_TEST16rr GR16:$src1, GR16:$src1)>;
+def : Pat<(parallel (X86cmp_new GR32:$src1, 0), (implicit EFLAGS)),
+ (NEW_TEST32rr GR32:$src1, GR32:$src1)>;
+
// {s|z}extload bool -> {s|z}extload byte
def : Pat<(sextloadi16i1 addr:$src), (MOVSX16rm8 addr:$src)>;
def : Pat<(sextloadi32i1 addr:$src), (MOVSX32rm8 addr:$src)>;
Modified: llvm/trunk/lib/Target/X86/X86InstrSSE.td
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrSSE.td?rev=42285&r1=42284&r2=42285&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86InstrSSE.td (original)
+++ llvm/trunk/lib/Target/X86/X86InstrSSE.td Mon Sep 24 20:57:46 2007
@@ -36,6 +36,9 @@
[SDNPHasChain, SDNPOutFlag]>;
def X86ucomi : SDNode<"X86ISD::UCOMI", SDTX86CmpTest,
[SDNPHasChain, SDNPOutFlag]>;
+def X86comi_new: SDNode<"X86ISD::COMI_NEW", SDTX86CmpTest,
+ [SDNPHasChain]>;
+def X86ucomi_new: SDNode<"X86ISD::UCOMI_NEW", SDTX86CmpTest>;
def X86s2vec : SDNode<"X86ISD::S2VEC", SDTypeProfile<1, 1, []>, []>;
def X86pextrw : SDNode<"X86ISD::PEXTRW", SDTypeProfile<1, 2, []>, []>;
def X86pinsrw : SDNode<"X86ISD::PINSRW", SDTypeProfile<1, 3, []>, []>;
@@ -263,7 +266,8 @@
// CMOV* - Used to implement the SSE SELECT DAG operation. Expanded by the
// scheduler into a branch sequence.
-let usesCustomDAGSchedInserter = 1 in { // Expanded by the scheduler.
+// These are expanded by the scheduler.
+let Uses = [EFLAGS], usesCustomDAGSchedInserter = 1 in {
def CMOV_FR32 : I<0, Pseudo,
(outs FR32:$dst), (ins FR32:$t, FR32:$f, i8imm:$cond),
"#CMOV_FR32 PSEUDO!",
@@ -287,6 +291,35 @@
"#CMOV_V2I64 PSEUDO!",
[(set VR128:$dst,
(v2i64 (X86cmov VR128:$t, VR128:$f, imm:$cond)))]>;
+
+ def NEW_CMOV_FR32 : I<0, Pseudo,
+ (outs FR32:$dst), (ins FR32:$t, FR32:$f, i8imm:$cond),
+ "#CMOV_FR32 PSEUDO!",
+ [(set FR32:$dst, (X86cmov_new FR32:$t, FR32:$f, imm:$cond,
+ EFLAGS))]>;
+ def NEW_CMOV_FR64 : I<0, Pseudo,
+ (outs FR64:$dst), (ins FR64:$t, FR64:$f, i8imm:$cond),
+ "#CMOV_FR64 PSEUDO!",
+ [(set FR64:$dst, (X86cmov_new FR64:$t, FR64:$f, imm:$cond,
+ EFLAGS))]>;
+ def NEW_CMOV_V4F32 : I<0, Pseudo,
+ (outs VR128:$dst), (ins VR128:$t, VR128:$f, i8imm:$cond),
+ "#CMOV_V4F32 PSEUDO!",
+ [(set VR128:$dst,
+ (v4f32 (X86cmov_new VR128:$t, VR128:$f, imm:$cond,
+ EFLAGS)))]>;
+ def NEW_CMOV_V2F64 : I<0, Pseudo,
+ (outs VR128:$dst), (ins VR128:$t, VR128:$f, i8imm:$cond),
+ "#CMOV_V2F64 PSEUDO!",
+ [(set VR128:$dst,
+ (v2f64 (X86cmov_new VR128:$t, VR128:$f, imm:$cond,
+ EFLAGS)))]>;
+ def NEW_CMOV_V2I64 : I<0, Pseudo,
+ (outs VR128:$dst), (ins VR128:$t, VR128:$f, i8imm:$cond),
+ "#CMOV_V2I64 PSEUDO!",
+ [(set VR128:$dst,
+ (v2i64 (X86cmov_new VR128:$t, VR128:$f, imm:$cond,
+ EFLAGS)))]>;
}
//===----------------------------------------------------------------------===//
@@ -367,6 +400,14 @@
def UCOMISSrm: PSI<0x2E, MRMSrcMem, (outs), (ins FR32:$src1, f32mem:$src2),
"ucomiss\t{$src2, $src1|$src1, $src2}",
[(X86cmp FR32:$src1, (loadf32 addr:$src2))]>;
+
+def NEW_UCOMISSrr: PSI<0x2E, MRMSrcReg, (outs), (ins FR32:$src1, FR32:$src2),
+ "ucomiss\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new FR32:$src1, FR32:$src2), (implicit EFLAGS)]>;
+def NEW_UCOMISSrm: PSI<0x2E, MRMSrcMem, (outs), (ins FR32:$src1, f32mem:$src2),
+ "ucomiss\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new FR32:$src1, (loadf32 addr:$src2)),
+ (implicit EFLAGS)]>;
} // Defs = [EFLAGS]
// Aliases to match intrinsics which expect XMM operand(s).
@@ -397,6 +438,28 @@
def Int_COMISSrm: PSI<0x2F, MRMSrcMem, (outs), (ins VR128:$src1, f128mem:$src2),
"comiss\t{$src2, $src1|$src1, $src2}",
[(X86comi (v4f32 VR128:$src1), (load addr:$src2))]>;
+
+def NEW_Int_UCOMISSrr: PSI<0x2E, MRMSrcReg, (outs),
+ (ins VR128:$src1, VR128:$src2),
+ "ucomiss\t{$src2, $src1|$src1, $src2}",
+ [(X86ucomi_new (v4f32 VR128:$src1), VR128:$src2),
+ (implicit EFLAGS)]>;
+def NEW_Int_UCOMISSrm: PSI<0x2E, MRMSrcMem, (outs),
+ (ins VR128:$src1, f128mem:$src2),
+ "ucomiss\t{$src2, $src1|$src1, $src2}",
+ [(X86ucomi_new (v4f32 VR128:$src1), (load addr:$src2)),
+ (implicit EFLAGS)]>;
+
+def NEW_Int_COMISSrr: PSI<0x2F, MRMSrcReg, (outs),
+ (ins VR128:$src1, VR128:$src2),
+ "comiss\t{$src2, $src1|$src1, $src2}",
+ [(X86comi_new (v4f32 VR128:$src1), VR128:$src2),
+ (implicit EFLAGS)]>;
+def NEW_Int_COMISSrm: PSI<0x2F, MRMSrcMem, (outs),
+ (ins VR128:$src1, f128mem:$src2),
+ "comiss\t{$src2, $src1|$src1, $src2}",
+ [(X86comi_new (v4f32 VR128:$src1), (load addr:$src2)),
+ (implicit EFLAGS)]>;
} // Defs = [EFLAGS]
// Aliases of packed SSE1 instructions for scalar use. These all have names that
@@ -1029,6 +1092,7 @@
"cmp${cc}sd\t{$src, $dst|$dst, $src}", []>;
}
+let Defs = [EFLAGS] in {
def UCOMISDrr: PDI<0x2E, MRMSrcReg, (outs), (ins FR64:$src1, FR64:$src2),
"ucomisd\t{$src2, $src1|$src1, $src2}",
[(X86cmp FR64:$src1, FR64:$src2)]>;
@@ -1036,6 +1100,15 @@
"ucomisd\t{$src2, $src1|$src1, $src2}",
[(X86cmp FR64:$src1, (loadf64 addr:$src2))]>;
+def NEW_UCOMISDrr: PDI<0x2E, MRMSrcReg, (outs), (ins FR64:$src1, FR64:$src2),
+ "ucomisd\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new FR64:$src1, FR64:$src2), (implicit EFLAGS)]>;
+def NEW_UCOMISDrm: PDI<0x2E, MRMSrcMem, (outs), (ins FR64:$src1, f64mem:$src2),
+ "ucomisd\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new FR64:$src1, (loadf64 addr:$src2)),
+ (implicit EFLAGS)]>;
+}
+
// Aliases to match intrinsics which expect XMM operand(s).
let isTwoAddress = 1 in {
def Int_CMPSDrr : SDI<0xC2, MRMSrcReg,
@@ -1050,6 +1123,7 @@
(load addr:$src), imm:$cc))]>;
}
+let Defs = [EFLAGS] in {
def Int_UCOMISDrr: PDI<0x2E, MRMSrcReg, (outs), (ins VR128:$src1, VR128:$src2),
"ucomisd\t{$src2, $src1|$src1, $src2}",
[(X86ucomi (v2f64 VR128:$src1), (v2f64 VR128:$src2))]>;
@@ -1064,6 +1138,29 @@
"comisd\t{$src2, $src1|$src1, $src2}",
[(X86comi (v2f64 VR128:$src1), (load addr:$src2))]>;
+def NEW_Int_UCOMISDrr: PDI<0x2E, MRMSrcReg, (outs),
+ (ins VR128:$src1, VR128:$src2),
+ "ucomisd\t{$src2, $src1|$src1, $src2}",
+ [(X86ucomi_new (v2f64 VR128:$src1), (v2f64 VR128:$src2)),
+ (implicit EFLAGS)]>;
+def NEW_Int_UCOMISDrm: PDI<0x2E, MRMSrcMem, (outs),
+ (ins VR128:$src1, f128mem:$src2),
+ "ucomisd\t{$src2, $src1|$src1, $src2}",
+ [(X86ucomi_new (v2f64 VR128:$src1), (load addr:$src2)),
+ (implicit EFLAGS)]>;
+
+def NEW_Int_COMISDrr: PDI<0x2F, MRMSrcReg, (outs),
+ (ins VR128:$src1, VR128:$src2),
+ "comisd\t{$src2, $src1|$src1, $src2}",
+ [(X86comi_new (v2f64 VR128:$src1), (v2f64 VR128:$src2)),
+ (implicit EFLAGS)]>;
+def NEW_Int_COMISDrm: PDI<0x2F, MRMSrcMem, (outs),
+ (ins VR128:$src1, f128mem:$src2),
+ "comisd\t{$src2, $src1|$src1, $src2}",
+ [(X86comi_new (v2f64 VR128:$src1), (load addr:$src2)),
+ (implicit EFLAGS)]>;
+} // Defs = [EFLAGS]
+
// Aliases of packed SSE2 instructions for scalar use. These all have names that
// start with 'Fs'.
Modified: llvm/trunk/lib/Target/X86/X86InstrX86-64.td
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrX86-64.td?rev=42285&r1=42284&r2=42285&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86InstrX86-64.td (original)
+++ llvm/trunk/lib/Target/X86/X86InstrX86-64.td Mon Sep 24 20:57:46 2007
@@ -752,8 +752,60 @@
[(X86cmp GR64:$src1, i64immSExt8:$src2)]>;
} // Defs = [EFLAGS]
+let Defs = [EFLAGS] in {
+let isCommutable = 1 in
+def NEW_TEST64rr : RI<0x85, MRMDestReg, (outs), (ins GR64:$src1, GR64:$src2),
+ "test{q}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (and GR64:$src1, GR64:$src2), 0),
+ (implicit EFLAGS)]>;
+def NEW_TEST64rm : RI<0x85, MRMSrcMem, (outs), (ins GR64:$src1, i64mem:$src2),
+ "test{q}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (and GR64:$src1, (loadi64 addr:$src2)), 0),
+ (implicit EFLAGS)]>;
+def NEW_TEST64ri32 : RIi32<0xF7, MRM0r, (outs),
+ (ins GR64:$src1, i64i32imm:$src2),
+ "test{q}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (and GR64:$src1, i64immSExt32:$src2), 0),
+ (implicit EFLAGS)]>;
+def NEW_TEST64mi32 : RIi32<0xF7, MRM0m, (outs),
+ (ins i64mem:$src1, i64i32imm:$src2),
+ "test{q}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (and (loadi64 addr:$src1), i64immSExt32:$src2), 0),
+ (implicit EFLAGS)]>;
+
+def NEW_CMP64rr : RI<0x39, MRMDestReg, (outs), (ins GR64:$src1, GR64:$src2),
+ "cmp{q}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new GR64:$src1, GR64:$src2),
+ (implicit EFLAGS)]>;
+def NEW_CMP64mr : RI<0x39, MRMDestMem, (outs), (ins i64mem:$src1, GR64:$src2),
+ "cmp{q}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (loadi64 addr:$src1), GR64:$src2),
+ (implicit EFLAGS)]>;
+def NEW_CMP64rm : RI<0x3B, MRMSrcMem, (outs), (ins GR64:$src1, i64mem:$src2),
+ "cmp{q}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new GR64:$src1, (loadi64 addr:$src2)),
+ (implicit EFLAGS)]>;
+def NEW_CMP64ri32 : RIi32<0x81, MRM7r, (outs), (ins GR64:$src1, i64i32imm:$src2),
+ "cmp{q}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new GR64:$src1, i64immSExt32:$src2),
+ (implicit EFLAGS)]>;
+def NEW_CMP64mi32 : RIi32<0x81, MRM7m, (outs),
+ (ins i64mem:$src1, i64i32imm:$src2),
+ "cmp{q}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (loadi64 addr:$src1), i64immSExt32:$src2),
+ (implicit EFLAGS)]>;
+def NEW_CMP64mi8 : RIi8<0x83, MRM7m, (outs), (ins i64mem:$src1, i64i8imm:$src2),
+ "cmp{q}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new (loadi64 addr:$src1), i64immSExt8:$src2),
+ (implicit EFLAGS)]>;
+def NEW_CMP64ri8 : RIi8<0x83, MRM7r, (outs), (ins GR64:$src1, i64i8imm:$src2),
+ "cmp{q}\t{$src2, $src1|$src1, $src2}",
+ [(X86cmp_new GR64:$src1, i64immSExt8:$src2),
+ (implicit EFLAGS)]>;
+} // Defs = [EFLAGS]
+
// Conditional moves
-let isTwoAddress = 1 in {
+let Uses = [EFLAGS], isTwoAddress = 1 in {
def CMOVB64rr : RI<0x42, MRMSrcReg, // if <u, GR64 = GR64
 (outs GR64:$dst), (ins GR64:$src1, GR64:$src2),
 "cmovb\t{$src2, $dst|$dst, $src2}",
 [(set GR64:$dst, (X86cmov GR64:$src1, GR64:$src2,
 X86_COND_B))]>, TB;
+
+def NEW_CMOVB64rr : RI<0x42, MRMSrcReg, // if <u, GR64 = GR64
+ (outs GR64:$dst), (ins GR64:$src1, GR64:$src2),
+ "cmovb\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, GR64:$src2,
+ X86_COND_B, EFLAGS))]>, TB;
+def NEW_CMOVB64rm : RI<0x42, MRMSrcMem, // if <u, GR64 = [mem64]
+ (outs GR64:$dst), (ins GR64:$src1, i64mem:$src2),
+ "cmovb\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, (loadi64 addr:$src2),
+ X86_COND_B, EFLAGS))]>, TB;
+def NEW_CMOVAE64rr: RI<0x43, MRMSrcReg, // if >=u, GR64 = GR64
+ (outs GR64:$dst), (ins GR64:$src1, GR64:$src2),
+ "cmovae\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, GR64:$src2,
+ X86_COND_AE, EFLAGS))]>, TB;
+def NEW_CMOVAE64rm: RI<0x43, MRMSrcMem, // if >=u, GR64 = [mem64]
+ (outs GR64:$dst), (ins GR64:$src1, i64mem:$src2),
+ "cmovae\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, (loadi64 addr:$src2),
+ X86_COND_AE, EFLAGS))]>, TB;
+def NEW_CMOVE64rr : RI<0x44, MRMSrcReg, // if ==, GR64 = GR64
+ (outs GR64:$dst), (ins GR64:$src1, GR64:$src2),
+ "cmove\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, GR64:$src2,
+ X86_COND_E, EFLAGS))]>, TB;
+def NEW_CMOVE64rm : RI<0x44, MRMSrcMem, // if ==, GR64 = [mem64]
+ (outs GR64:$dst), (ins GR64:$src1, i64mem:$src2),
+ "cmove\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, (loadi64 addr:$src2),
+ X86_COND_E, EFLAGS))]>, TB;
+def NEW_CMOVNE64rr: RI<0x45, MRMSrcReg, // if !=, GR64 = GR64
+ (outs GR64:$dst), (ins GR64:$src1, GR64:$src2),
+ "cmovne\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, GR64:$src2,
+ X86_COND_NE, EFLAGS))]>, TB;
+def NEW_CMOVNE64rm: RI<0x45, MRMSrcMem, // if !=, GR64 = [mem64]
+ (outs GR64:$dst), (ins GR64:$src1, i64mem:$src2),
+ "cmovne\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, (loadi64 addr:$src2),
+ X86_COND_NE, EFLAGS))]>, TB;
+def NEW_CMOVBE64rr: RI<0x46, MRMSrcReg, // if <=u, GR64 = GR64
+ (outs GR64:$dst), (ins GR64:$src1, GR64:$src2),
+ "cmovbe\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, GR64:$src2,
+ X86_COND_BE, EFLAGS))]>, TB;
+def NEW_CMOVBE64rm: RI<0x46, MRMSrcMem, // if <=u, GR64 = [mem64]
+ (outs GR64:$dst), (ins GR64:$src1, i64mem:$src2),
+ "cmovbe\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, (loadi64 addr:$src2),
+ X86_COND_BE, EFLAGS))]>, TB;
+def NEW_CMOVA64rr : RI<0x47, MRMSrcReg, // if >u, GR64 = GR64
+ (outs GR64:$dst), (ins GR64:$src1, GR64:$src2),
+ "cmova\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, GR64:$src2,
+ X86_COND_A, EFLAGS))]>, TB;
+def NEW_CMOVA64rm : RI<0x47, MRMSrcMem, // if >u, GR64 = [mem64]
+ (outs GR64:$dst), (ins GR64:$src1, i64mem:$src2),
+ "cmova\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, (loadi64 addr:$src2),
+ X86_COND_A, EFLAGS))]>, TB;
+def NEW_CMOVL64rr : RI<0x4C, MRMSrcReg, // if <s, GR64 = GR64
+ (outs GR64:$dst), (ins GR64:$src1, GR64:$src2),
+ "cmovl\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, GR64:$src2,
+ X86_COND_L, EFLAGS))]>, TB;
+def NEW_CMOVL64rm : RI<0x4C, MRMSrcMem, // if <s, GR64 = [mem64]
+ (outs GR64:$dst), (ins GR64:$src1, i64mem:$src2),
+ "cmovl\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, (loadi64 addr:$src2),
+ X86_COND_L, EFLAGS))]>, TB;
+def NEW_CMOVGE64rr: RI<0x4D, MRMSrcReg, // if >=s, GR64 = GR64
+ (outs GR64:$dst), (ins GR64:$src1, GR64:$src2),
+ "cmovge\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, GR64:$src2,
+ X86_COND_GE, EFLAGS))]>, TB;
+def NEW_CMOVGE64rm: RI<0x4D, MRMSrcMem, // if >=s, GR64 = [mem64]
+ (outs GR64:$dst), (ins GR64:$src1, i64mem:$src2),
+ "cmovge\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, (loadi64 addr:$src2),
+ X86_COND_GE, EFLAGS))]>, TB;
+def NEW_CMOVLE64rr: RI<0x4E, MRMSrcReg, // if <=s, GR64 = GR64
+ (outs GR64:$dst), (ins GR64:$src1, GR64:$src2),
+ "cmovle\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, GR64:$src2,
+ X86_COND_LE, EFLAGS))]>, TB;
+def NEW_CMOVLE64rm: RI<0x4E, MRMSrcMem, // if <=s, GR64 = [mem64]
+ (outs GR64:$dst), (ins GR64:$src1, i64mem:$src2),
+ "cmovle\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, (loadi64 addr:$src2),
+ X86_COND_LE, EFLAGS))]>, TB;
+def NEW_CMOVG64rr : RI<0x4F, MRMSrcReg, // if >s, GR64 = GR64
+ (outs GR64:$dst), (ins GR64:$src1, GR64:$src2),
+ "cmovg\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, GR64:$src2,
+ X86_COND_G, EFLAGS))]>, TB;
+def NEW_CMOVG64rm : RI<0x4F, MRMSrcMem, // if >s, GR64 = [mem64]
+ (outs GR64:$dst), (ins GR64:$src1, i64mem:$src2),
+ "cmovg\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, (loadi64 addr:$src2),
+ X86_COND_G, EFLAGS))]>, TB;
+def NEW_CMOVS64rr : RI<0x48, MRMSrcReg, // if signed, GR64 = GR64
+ (outs GR64:$dst), (ins GR64:$src1, GR64:$src2),
+ "cmovs\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, GR64:$src2,
+ X86_COND_S, EFLAGS))]>, TB;
+def NEW_CMOVS64rm : RI<0x48, MRMSrcMem, // if signed, GR64 = [mem64]
+ (outs GR64:$dst), (ins GR64:$src1, i64mem:$src2),
+ "cmovs\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, (loadi64 addr:$src2),
+ X86_COND_S, EFLAGS))]>, TB;
+def NEW_CMOVNS64rr: RI<0x49, MRMSrcReg, // if !signed, GR64 = GR64
+ (outs GR64:$dst), (ins GR64:$src1, GR64:$src2),
+ "cmovns\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, GR64:$src2,
+ X86_COND_NS, EFLAGS))]>, TB;
+def NEW_CMOVNS64rm: RI<0x49, MRMSrcMem, // if !signed, GR64 = [mem64]
+ (outs GR64:$dst), (ins GR64:$src1, i64mem:$src2),
+ "cmovns\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, (loadi64 addr:$src2),
+ X86_COND_NS, EFLAGS))]>, TB;
+def NEW_CMOVP64rr : RI<0x4A, MRMSrcReg, // if parity, GR64 = GR64
+ (outs GR64:$dst), (ins GR64:$src1, GR64:$src2),
+ "cmovp\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, GR64:$src2,
+ X86_COND_P, EFLAGS))]>, TB;
+def NEW_CMOVP64rm : RI<0x4A, MRMSrcMem, // if parity, GR64 = [mem64]
+ (outs GR64:$dst), (ins GR64:$src1, i64mem:$src2),
+ "cmovp\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, (loadi64 addr:$src2),
+ X86_COND_P, EFLAGS))]>, TB;
+def NEW_CMOVNP64rr : RI<0x4B, MRMSrcReg, // if !parity, GR64 = GR64
+ (outs GR64:$dst), (ins GR64:$src1, GR64:$src2),
+ "cmovnp\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, GR64:$src2,
+ X86_COND_NP, EFLAGS))]>, TB;
+def NEW_CMOVNP64rm : RI<0x4B, MRMSrcMem, // if !parity, GR64 = [mem64]
+ (outs GR64:$dst), (ins GR64:$src1, i64mem:$src2),
+ "cmovnp\t{$src2, $dst|$dst, $src2}",
+ [(set GR64:$dst, (X86cmov_new GR64:$src1, (loadi64 addr:$src2),
+ X86_COND_NP, EFLAGS))]>, TB;
} // isTwoAddress
//===----------------------------------------------------------------------===//
@@ -1084,6 +1277,9 @@
def : Pat<(X86cmp GR64:$src1, 0),
(TEST64rr GR64:$src1, GR64:$src1)>;
+def : Pat<(parallel (X86cmp_new GR64:$src1, 0), (implicit EFLAGS)),
+ (NEW_TEST64rr GR64:$src1, GR64:$src1)>;
+
// {s|z}extload bool -> {s|z}extload byte
def : Pat<(sextloadi64i1 addr:$src), (MOVSX64rm8 addr:$src)>;
def : Pat<(zextloadi64i1 addr:$src), (MOVZX64rm8 addr:$src)>;
Modified: llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp?rev=42285&r1=42284&r2=42285&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp Mon Sep 24 20:57:46 2007
@@ -677,6 +677,30 @@
{ X86::MUL32r, X86::MUL32m },
{ X86::MUL64r, X86::MUL64m },
{ X86::MUL8r, X86::MUL8m },
+
+ // TEMPORARY
+ { X86::NEW_CMP16ri, X86::NEW_CMP16mi },
+ { X86::NEW_CMP16ri8,X86::NEW_CMP16mi8 },
+ { X86::NEW_CMP32ri, X86::NEW_CMP32mi },
+ { X86::NEW_CMP32ri8,X86::NEW_CMP32mi8 },
+ { X86::NEW_CMP64ri32,X86::NEW_CMP64mi32 },
+ { X86::NEW_CMP64ri8,X86::NEW_CMP64mi8 },
+ { X86::NEW_CMP8ri, X86::NEW_CMP8mi },
+ { X86::NEW_SETAEr, X86::NEW_SETAEm },
+ { X86::NEW_SETAr, X86::NEW_SETAm },
+ { X86::NEW_SETBEr, X86::NEW_SETBEm },
+ { X86::NEW_SETBr, X86::NEW_SETBm },
+ { X86::NEW_SETEr, X86::NEW_SETEm },
+ { X86::NEW_SETGEr, X86::NEW_SETGEm },
+ { X86::NEW_SETGr, X86::NEW_SETGm },
+ { X86::NEW_SETLEr, X86::NEW_SETLEm },
+ { X86::NEW_SETLr, X86::NEW_SETLm },
+ { X86::NEW_SETNEr, X86::NEW_SETNEm },
+ { X86::NEW_SETNPr, X86::NEW_SETNPm },
+ { X86::NEW_SETNSr, X86::NEW_SETNSm },
+ { X86::NEW_SETPr, X86::NEW_SETPm },
+ { X86::NEW_SETSr, X86::NEW_SETSm },
+
{ X86::SETAEr, X86::SETAEm },
{ X86::SETAr, X86::SETAm },
{ X86::SETBEr, X86::SETBEm },
@@ -787,6 +811,19 @@
{ X86::MOVZX32rr8, X86::MOVZX32rm8 },
{ X86::MOVZX64rr16, X86::MOVZX64rm16 },
{ X86::MOVZX64rr8, X86::MOVZX64rm8 },
+
+ // TEMPORARY
+ { X86::NEW_Int_COMISDrr, X86::NEW_Int_COMISDrm },
+ { X86::NEW_Int_COMISSrr, X86::NEW_Int_COMISSrm },
+ { X86::NEW_Int_UCOMISDrr, X86::NEW_Int_UCOMISDrm },
+ { X86::NEW_Int_UCOMISSrr, X86::NEW_Int_UCOMISSrm },
+ { X86::NEW_TEST16rr, X86::NEW_TEST16rm },
+ { X86::NEW_TEST32rr, X86::NEW_TEST32rm },
+ { X86::NEW_TEST64rr, X86::NEW_TEST64rm },
+ { X86::NEW_TEST8rr, X86::NEW_TEST8rm },
+ { X86::NEW_UCOMISDrr, X86::NEW_UCOMISDrm },
+ { X86::NEW_UCOMISSrr, X86::NEW_UCOMISSrm },
+
{ X86::PSHUFDri, X86::PSHUFDmi },
{ X86::PSHUFHWri, X86::PSHUFHWmi },
{ X86::PSHUFLWri, X86::PSHUFLWmi },
@@ -920,6 +957,51 @@
{ X86::MULPSrr, X86::MULPSrm },
{ X86::MULSDrr, X86::MULSDrm },
{ X86::MULSSrr, X86::MULSSrm },
+
+ // TEMPORARY
+ { X86::NEW_CMOVA16rr, X86::NEW_CMOVA16rm },
+ { X86::NEW_CMOVA32rr, X86::NEW_CMOVA32rm },
+ { X86::NEW_CMOVA64rr, X86::NEW_CMOVA64rm },
+ { X86::NEW_CMOVAE16rr, X86::NEW_CMOVAE16rm },
+ { X86::NEW_CMOVAE32rr, X86::NEW_CMOVAE32rm },
+ { X86::NEW_CMOVAE64rr, X86::NEW_CMOVAE64rm },
+ { X86::NEW_CMOVB16rr, X86::NEW_CMOVB16rm },
+ { X86::NEW_CMOVB32rr, X86::NEW_CMOVB32rm },
+ { X86::NEW_CMOVB64rr, X86::NEW_CMOVB64rm },
+ { X86::NEW_CMOVBE16rr, X86::NEW_CMOVBE16rm },
+ { X86::NEW_CMOVBE32rr, X86::NEW_CMOVBE32rm },
+ { X86::NEW_CMOVBE64rr, X86::NEW_CMOVBE64rm },
+ { X86::NEW_CMOVE16rr, X86::NEW_CMOVE16rm },
+ { X86::NEW_CMOVE32rr, X86::NEW_CMOVE32rm },
+ { X86::NEW_CMOVE64rr, X86::NEW_CMOVE64rm },
+ { X86::NEW_CMOVG16rr, X86::NEW_CMOVG16rm },
+ { X86::NEW_CMOVG32rr, X86::NEW_CMOVG32rm },
+ { X86::NEW_CMOVG64rr, X86::NEW_CMOVG64rm },
+ { X86::NEW_CMOVGE16rr, X86::NEW_CMOVGE16rm },
+ { X86::NEW_CMOVGE32rr, X86::NEW_CMOVGE32rm },
+ { X86::NEW_CMOVGE64rr, X86::NEW_CMOVGE64rm },
+ { X86::NEW_CMOVL16rr, X86::NEW_CMOVL16rm },
+ { X86::NEW_CMOVL32rr, X86::NEW_CMOVL32rm },
+ { X86::NEW_CMOVL64rr, X86::NEW_CMOVL64rm },
+ { X86::NEW_CMOVLE16rr, X86::NEW_CMOVLE16rm },
+ { X86::NEW_CMOVLE32rr, X86::NEW_CMOVLE32rm },
+ { X86::NEW_CMOVLE64rr, X86::NEW_CMOVLE64rm },
+ { X86::NEW_CMOVNE16rr, X86::NEW_CMOVNE16rm },
+ { X86::NEW_CMOVNE32rr, X86::NEW_CMOVNE32rm },
+ { X86::NEW_CMOVNE64rr, X86::NEW_CMOVNE64rm },
+ { X86::NEW_CMOVNP16rr, X86::NEW_CMOVNP16rm },
+ { X86::NEW_CMOVNP32rr, X86::NEW_CMOVNP32rm },
+ { X86::NEW_CMOVNP64rr, X86::NEW_CMOVNP64rm },
+ { X86::NEW_CMOVNS16rr, X86::NEW_CMOVNS16rm },
+ { X86::NEW_CMOVNS32rr, X86::NEW_CMOVNS32rm },
+ { X86::NEW_CMOVNS64rr, X86::NEW_CMOVNS64rm },
+ { X86::NEW_CMOVP16rr, X86::NEW_CMOVP16rm },
+ { X86::NEW_CMOVP32rr, X86::NEW_CMOVP32rm },
+ { X86::NEW_CMOVP64rr, X86::NEW_CMOVP64rm },
+ { X86::NEW_CMOVS16rr, X86::NEW_CMOVS16rm },
+ { X86::NEW_CMOVS32rr, X86::NEW_CMOVS32rm },
+ { X86::NEW_CMOVS64rr, X86::NEW_CMOVS64rm },
+
{ X86::OR16rr, X86::OR16rm },
{ X86::OR32rr, X86::OR32rm },
{ X86::OR64rr, X86::OR64rm },
From resistor at mac.com Mon Sep 24 19:10:58 2007
From: resistor at mac.com (Owen Anderson)
Date: Tue, 25 Sep 2007 02:10:58 -0000
Subject: [llvm-commits] [llvm] r42286 - /llvm/trunk/docs/ReleaseNotes.html
Message-ID: <200709250210.l8P2AwXN009437@zion.cs.uiuc.edu>
Author: resistor
Date: Mon Sep 24 21:10:58 2007
New Revision: 42286
URL: http://llvm.org/viewvc/llvm-project?rev=42286&view=rev
Log:
Fill in the sections about my contributions.
Modified:
llvm/trunk/docs/ReleaseNotes.html
Modified: llvm/trunk/docs/ReleaseNotes.html
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/ReleaseNotes.html?rev=42286&r1=42285&r2=42286&view=diff
==============================================================================
--- llvm/trunk/docs/ReleaseNotes.html (original)
+++ llvm/trunk/docs/ReleaseNotes.html Mon Sep 24 21:10:58 2007
@@ -98,9 +98,17 @@
-<li>Owen DSE and MemDep analysis</li>
-<li>Owen GVN</li>
-<li>Owen GVN-PRE, not in llvm-gcc</li>
+<li>Owen Anderson wrote the new MemoryDependenceAnalysis pass, which provides
+    a lazy, caching layer on top of alias analysis.  He then used it to rewrite
+    DeadStoreElimination, which resulted in significantly better compile time in
+    common cases.</li>
+<li>Owen implemented the new GVN pass, which is also based on
+    MemoryDependenceAnalysis.  This pass replaces GCSE/LoadVN in the standard
+    set of passes, providing more aggressive optimization at a somewhat
+    improved compile-time cost.</li>
+<li>Owen implemented GVN-PRE, a partial redundancy elimination algorithm that
+    shares some details with the new GVN pass.  It is still in need of compile
+    time tuning, and is not turned on by default.</li>
<li>Devang merged ETForest and DomTree into a single easier to use data
    structure.</li>
<li>Nick Lewycky improved loop trip count analysis to handle many more common
From isanbard at gmail.com Mon Sep 24 19:51:18 2007
From: isanbard at gmail.com (Bill Wendling)
Date: Tue, 25 Sep 2007 02:51:18 -0000
Subject: [llvm-commits] [llvm-gcc-4.0] r42287 - in /llvm-gcc-4.0/trunk/gcc:
c-common.h llvm-backend.cpp objc/objc-act.c stub-objc.c
Message-ID: <200709250251.l8P2pI2f010914@zion.cs.uiuc.edu>
Author: void
Date: Mon Sep 24 21:51:18 2007
New Revision: 42287
URL: http://llvm.org/viewvc/llvm-project?rev=42287&view=rev
Log:
During the processing of Objective-C "protocols", the objc frontend creates two
decls for the protocol. One for the metadata and another for when it's
referenced. However, protocols are "internal global", so when we do a lookup
for the reference, it doesn't find the first decl because it's not "external".
This will perform a second lookup for Objective-C protocols if we don't find
it among the "external globals".
Modified:
llvm-gcc-4.0/trunk/gcc/c-common.h
llvm-gcc-4.0/trunk/gcc/llvm-backend.cpp
llvm-gcc-4.0/trunk/gcc/objc/objc-act.c
llvm-gcc-4.0/trunk/gcc/stub-objc.c
Modified: llvm-gcc-4.0/trunk/gcc/c-common.h
URL: http://llvm.org/viewvc/llvm-project/llvm-gcc-4.0/trunk/gcc/c-common.h?rev=42287&r1=42286&r2=42287&view=diff
==============================================================================
--- llvm-gcc-4.0/trunk/gcc/c-common.h (original)
+++ llvm-gcc-4.0/trunk/gcc/c-common.h Mon Sep 24 21:51:18 2007
@@ -1139,6 +1139,12 @@
extern void objc_remove_weak_read (tree*);
/* APPLE LOCAL end radar 4426814 */
+/* APPLE LOCAL begin - LLVM radar 5476262 */
+#ifdef ENABLE_LLVM
+extern bool objc_is_protocol_reference (const char *name);
+#endif
+/* APPLE LOCAL end - LLVM radar 5476262 */
+
/* APPLE LOCAL begin C* language */
extern void objc_set_method_opt (int);
void objc_finish_foreach_loop (location_t, tree, tree, tree, tree);
Modified: llvm-gcc-4.0/trunk/gcc/llvm-backend.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm-gcc-4.0/trunk/gcc/llvm-backend.cpp?rev=42287&r1=42286&r2=42287&view=diff
==============================================================================
--- llvm-gcc-4.0/trunk/gcc/llvm-backend.cpp (original)
+++ llvm-gcc-4.0/trunk/gcc/llvm-backend.cpp Mon Sep 24 21:51:18 2007
@@ -66,6 +66,7 @@
#include "function.h"
#include "tree-inline.h"
#include "langhooks.h"
+#include "c-common.h"
}
// Non-zero if bytecode from PCH is successfully read.
@@ -1060,6 +1061,14 @@
// If the global has a name, prevent multiple vars with the same name from
// being created.
GlobalVariable *GVE = TheModule->getGlobalVariable(Name);
+
+ // And Objective-C "@protocol" will create a decl for the
+ // protocol metadata and then when the protocol is
+ // referenced. However, protocols have file-scope, so they
+ // aren't found in the GlobalVariable list unless we look at
+ // non-extern globals as well.
+ if (!GVE && c_dialect_objc() && objc_is_protocol_reference(Name))
+ GVE = TheModule->getGlobalVariable(Name, true);
if (GVE == 0) {
GV = new GlobalVariable(Ty, false, GlobalValue::ExternalLinkage,0,
Modified: llvm-gcc-4.0/trunk/gcc/objc/objc-act.c
URL: http://llvm.org/viewvc/llvm-project/llvm-gcc-4.0/trunk/gcc/objc/objc-act.c?rev=42287&r1=42286&r2=42287&view=diff
==============================================================================
--- llvm-gcc-4.0/trunk/gcc/objc/objc-act.c (original)
+++ llvm-gcc-4.0/trunk/gcc/objc/objc-act.c Mon Sep 24 21:51:18 2007
@@ -11669,6 +11669,19 @@
}
}
+/* APPLE LOCAL begin - LLVM radar 5476262 */
+#ifdef ENABLE_LLVM
+/* This routine returns true if the name is the same as a protocol
+ reference name. */
+
+bool
+objc_is_protocol_reference (const char *name)
+{
+ return flag_objc_abi == 2 && strstr (name, "_OBJC_PROTOCOL_$_") != 0;
+}
+#endif
+/* APPLE LOCAL end - LLVM radar 5476262 */
+
/* This routine builds the protocol_reference_chain for each protocol name used
 in a @protocol(MyProtocol) expression. IDENT is the current protocol name. */
Modified: llvm-gcc-4.0/trunk/gcc/stub-objc.c
URL: http://llvm.org/viewvc/llvm-project/llvm-gcc-4.0/trunk/gcc/stub-objc.c?rev=42287&r1=42286&r2=42287&view=diff
==============================================================================
--- llvm-gcc-4.0/trunk/gcc/stub-objc.c (original)
+++ llvm-gcc-4.0/trunk/gcc/stub-objc.c Mon Sep 24 21:51:18 2007
@@ -548,3 +548,13 @@
return false;
}
/* APPLE LOCAL end radar 4985544 */
+
+/* APPLE LOCAL begin - LLVM radar 5476262 */
+#ifdef ENABLE_LLVM
+bool
+objc_is_protocol_reference (const char * ARG_UNUSED(name))
+{
+ return false;
+}
+#endif
+/* APPLE LOCAL end - LLVM radar 5476262 */
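The shape of this fix is a general one: attempt the narrow (external-only)
lookup first, and fall back to a wider lookup only for names known to need it.
A standalone C++ sketch of that pattern follows; the names here are invented
for illustration and are not the llvm-backend.cpp code, which queries
llvm::Module::getGlobalVariable instead.

#include <map>
#include <string>

struct Global { bool IsInternal; };
typedef std::map<std::string, Global*> GlobalTable;

// Return the named global; internal (file-scope) globals are only visible
// when AllowInternal is set, mirroring getGlobalVariable(Name, true).
static Global *lookup(GlobalTable &Table, const std::string &Name,
                      bool AllowInternal) {
  GlobalTable::iterator I = Table.find(Name);
  if (I == Table.end())
    return 0;
  if (I->second->IsInternal && !AllowInternal)
    return 0;
  return I->second;
}

static Global *findGlobal(GlobalTable &Table, const std::string &Name,
                          bool IsProtocolRef) {
  Global *G = lookup(Table, Name, /*AllowInternal=*/false);
  // Protocol references have file scope, so retry with internals included.
  if (!G && IsProtocolRef)
    G = lookup(Table, Name, /*AllowInternal=*/true);
  return G;
}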
From isanbard at gmail.com Mon Sep 24 20:18:24 2007
From: isanbard at gmail.com (Bill Wendling)
Date: Tue, 25 Sep 2007 03:18:24 -0000
Subject: [llvm-commits] [llvm-gcc-4.0] r42288 -
/llvm-gcc-4.0/trunk/gcc/llvm-backend.cpp
Message-ID: <200709250318.l8P3IOCu012055@zion.cs.uiuc.edu>
Author: void
Date: Mon Sep 24 22:18:23 2007
New Revision: 42288
URL: http://llvm.org/viewvc/llvm-project?rev=42288&view=rev
Log:
Tabs are the devil's work!
Modified:
llvm-gcc-4.0/trunk/gcc/llvm-backend.cpp
Modified: llvm-gcc-4.0/trunk/gcc/llvm-backend.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm-gcc-4.0/trunk/gcc/llvm-backend.cpp?rev=42288&r1=42287&r2=42288&view=diff
==============================================================================
--- llvm-gcc-4.0/trunk/gcc/llvm-backend.cpp (original)
+++ llvm-gcc-4.0/trunk/gcc/llvm-backend.cpp Mon Sep 24 22:18:23 2007
@@ -1060,7 +1060,7 @@
} else {
// If the global has a name, prevent multiple vars with the same name from
// being created.
- GlobalVariable *GVE = TheModule->getGlobalVariable(Name);
+ GlobalVariable *GVE = TheModule->getGlobalVariable(Name, true);
// And Objective-C "@protocol" will create a decl for the
// protocol metadata and then when the protocol is
@@ -1068,7 +1068,7 @@
// aren't found in the GlobalVariable list unless we look at
// non-extern globals as well.
if (!GVE && c_dialect_objc() && objc_is_protocol_reference(Name))
- GVE = TheModule->getGlobalVariable(Name, true);
+ GVE = TheModule->getGlobalVariable(Name, true);
if (GVE == 0) {
GV = new GlobalVariable(Ty, false, GlobalValue::ExternalLinkage,0,
From isanbard at gmail.com Mon Sep 24 20:21:18 2007
From: isanbard at gmail.com (Bill Wendling)
Date: Tue, 25 Sep 2007 03:21:18 -0000
Subject: [llvm-commits] [llvm-gcc-4.0] r42289 -
/llvm-gcc-4.0/trunk/gcc/llvm-backend.cpp
Message-ID: <200709250321.l8P3LI1D012206@zion.cs.uiuc.edu>
Author: void
Date: Mon Sep 24 22:21:18 2007
New Revision: 42289
URL: http://llvm.org/viewvc/llvm-project?rev=42289&view=rev
Log:
Oops!
Modified:
llvm-gcc-4.0/trunk/gcc/llvm-backend.cpp
Modified: llvm-gcc-4.0/trunk/gcc/llvm-backend.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm-gcc-4.0/trunk/gcc/llvm-backend.cpp?rev=42289&r1=42288&r2=42289&view=diff
==============================================================================
--- llvm-gcc-4.0/trunk/gcc/llvm-backend.cpp (original)
+++ llvm-gcc-4.0/trunk/gcc/llvm-backend.cpp Mon Sep 24 22:21:18 2007
@@ -1060,7 +1060,7 @@
} else {
// If the global has a name, prevent multiple vars with the same name from
// being created.
- GlobalVariable *GVE = TheModule->getGlobalVariable(Name, true);
+ GlobalVariable *GVE = TheModule->getGlobalVariable(Name);
// And Objective-C "@protocol" will create a decl for the
// protocol metadata and then when the protocol is
From clattner at apple.com Mon Sep 24 22:32:37 2007
From: clattner at apple.com (Chris Lattner)
Date: Mon, 24 Sep 2007 22:32:37 -0700
Subject: [llvm-commits] [llvm] r42278 -
/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
In-Reply-To: <200709242154.l8OLsF56000956@zion.cs.uiuc.edu>
References: <200709242154.l8OLsF56000956@zion.cs.uiuc.edu>
Message-ID: <5F679400-DF52-4259-B01F-9448A3605697@apple.com>
> URL: http://llvm.org/viewvc/llvm-project?rev=42278&view=rev
> Log:
> Add support on x86 for having Legalize lower ISD::LOCATION to
> ISD::DEBUG_LOC
> instead of ISD::LABEL with a manual .debug_line entry when the
> assembler
> supports .file and .loc directives.
Nice!
>
> - // We don't have line number support yet.
> + // Use the default ISD::LOCATION expansion, and tell Legalize it's
> + // ok to use DEBUG_LOC if we have an assembler that supports it.
> setOperationAction(ISD::LOCATION, MVT::Other, Expand);
> - setOperationAction(ISD::DEBUG_LOC, MVT::Other, Expand);
> + if (TM.getTargetAsmInfo()->hasDotLocAndDotFile())
> + setOperationAction(ISD::DEBUG_LOC, MVT::Other, Legal);
> + else
> + setOperationAction(ISD::DEBUG_LOC, MVT::Other, Expand);
Would it make sense to move this to a superclass? It doesn't seem
like any other definition would make sense.
-chris
From baldrick at free.fr Tue Sep 25 00:10:02 2007
From: baldrick at free.fr (Duncan Sands)
Date: Tue, 25 Sep 2007 07:10:02 -0000
Subject: [llvm-commits] [llvm] r42294 - in /llvm/trunk/lib/VMCore:
DominatorCalculation.h DominatorInternals.cpp
Message-ID: <200709250710.l8P7A28b021647@zion.cs.uiuc.edu>
Author: baldrick
Date: Tue Sep 25 02:10:01 2007
New Revision: 42294
URL: http://llvm.org/viewvc/llvm-project?rev=42294&view=rev
Log:
Add missing end-of-file newlines.
Modified:
llvm/trunk/lib/VMCore/DominatorCalculation.h
llvm/trunk/lib/VMCore/DominatorInternals.cpp
Modified: llvm/trunk/lib/VMCore/DominatorCalculation.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/DominatorCalculation.h?rev=42294&r1=42293&r2=42294&view=diff
==============================================================================
--- llvm/trunk/lib/VMCore/DominatorCalculation.h (original)
+++ llvm/trunk/lib/VMCore/DominatorCalculation.h Tue Sep 25 02:10:01 2007
@@ -104,4 +104,4 @@
}
}
-#endif
\ No newline at end of file
+#endif
Modified: llvm/trunk/lib/VMCore/DominatorInternals.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/DominatorInternals.cpp?rev=42294&r1=42293&r2=42294&view=diff
==============================================================================
--- llvm/trunk/lib/VMCore/DominatorInternals.cpp (original)
+++ llvm/trunk/lib/VMCore/DominatorInternals.cpp Tue Sep 25 02:10:01 2007
@@ -142,4 +142,4 @@
}
-#endif
\ No newline at end of file
+#endif
From baldrick at free.fr Tue Sep 25 06:53:22 2007
From: baldrick at free.fr (Duncan Sands)
Date: Tue, 25 Sep 2007 13:53:22 -0000
Subject: [llvm-commits] [llvm] r42296 - /llvm/trunk/include/llvm/Type.h
Message-ID: <200709251353.l8PDrMh3010932@zion.cs.uiuc.edu>
Author: baldrick
Date: Tue Sep 25 08:53:22 2007
New Revision: 42296
URL: http://llvm.org/viewvc/llvm-project?rev=42296&view=rev
Log:
Fix spelling.
Modified:
llvm/trunk/include/llvm/Type.h
Modified: llvm/trunk/include/llvm/Type.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Type.h?rev=42296&r1=42295&r2=42296&view=diff
==============================================================================
--- llvm/trunk/include/llvm/Type.h (original)
+++ llvm/trunk/include/llvm/Type.h Tue Sep 25 08:53:22 2007
@@ -243,7 +243,7 @@
///
unsigned getPrimitiveSizeInBits() const;
- /// getForwaredType - Return the type that this type has been resolved to if
+ /// getForwardedType - Return the type that this type has been resolved to if
/// it has been resolved to anything. This is used to implement the
/// union-find algorithm for type resolution, and shouldn't be used by general
/// purpose clients.
From djg at cray.com Tue Sep 25 08:10:50 2007
From: djg at cray.com (Dan Gohman)
Date: Tue, 25 Sep 2007 15:10:50 -0000
Subject: [llvm-commits] [llvm] r42297 - in /llvm/trunk/lib:
CodeGen/SelectionDAG/TargetLowering.cpp Target/X86/X86ISelLowering.cpp
Message-ID: <200709251510.l8PFAohE013856@zion.cs.uiuc.edu>
Author: djg
Date: Tue Sep 25 10:10:49 2007
New Revision: 42297
URL: http://llvm.org/viewvc/llvm-project?rev=42297&view=rev
Log:
Move the setOperationAction(ISD::DEBUG_LOC, MVT::Other, Expand) and
the check to see if the assembler supports .loc from X86TargetLowering
into the superclass TargetLowering.
Modified:
llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp
llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
Modified: llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp?rev=42297&r1=42296&r2=42297&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp (original)
+++ llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp Tue Sep 25 10:10:49 2007
@@ -20,6 +20,7 @@
#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/Support/MathExtras.h"
+#include "llvm/Target/TargetAsmInfo.h"
using namespace llvm;
/// InitLibcallNames - Set default libcall names.
@@ -169,6 +170,10 @@
InitLibcallNames(LibcallRoutineNames);
InitCmpLibcallCCs(CmpLibcallCCs);
+
+ // Tell Legalize whether the assembler supports DEBUG_LOC.
+ if (!TM.getTargetAsmInfo()->hasDotLocAndDotFile())
+ setOperationAction(ISD::DEBUG_LOC, MVT::Other, Expand);
}
TargetLowering::~TargetLowering() {}
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=42297&r1=42296&r2=42297&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Tue Sep 25 10:10:49 2007
@@ -33,7 +33,6 @@
#include "llvm/CodeGen/SSARegMap.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/MathExtras.h"
-#include "llvm/Target/TargetAsmInfo.h"
#include "llvm/Target/TargetOptions.h"
#include "llvm/ADT/StringExtras.h"
#include "llvm/ParameterAttributes.h"
@@ -231,13 +230,8 @@
setOperationAction(ISD::MEMSET , MVT::Other, Custom);
setOperationAction(ISD::MEMCPY , MVT::Other, Custom);
- // Use the default ISD::LOCATION expansion, and tell Legalize it's
- // ok to use DEBUG_LOC if we have an assembler that supports it.
+ // Use the default ISD::LOCATION expansion.
setOperationAction(ISD::LOCATION, MVT::Other, Expand);
- if (TM.getTargetAsmInfo()->hasDotLocAndDotFile())
- setOperationAction(ISD::DEBUG_LOC, MVT::Other, Legal);
- else
- setOperationAction(ISD::DEBUG_LOC, MVT::Other, Expand);
// FIXME - use subtarget debug flags
if (!Subtarget->isTargetDarwin() &&
!Subtarget->isTargetELF() &&
From clattner at apple.com Tue Sep 25 09:35:36 2007
From: clattner at apple.com (Chris Lattner)
Date: Tue, 25 Sep 2007 09:35:36 -0700
Subject: [llvm-commits] Duncan's VIEW_CONVERT_EXPR bug
Message-ID:
Duncan, please try this patch. If it helps your Ada issue, please
apply it to 4.0 and 4.2.
-Chris
A non-text attachment was scrubbed...
Name: vce.patch
Type: application/octet-stream
Size: 1294 bytes
Desc: not available
From djg at cray.com Tue Sep 25 10:22:26 2007
From: djg at cray.com (Dan Gohman)
Date: Tue, 25 Sep 2007 12:22:26 -0500
Subject: [llvm-commits] [llvm] r42285 - in /llvm/trunk/lib/Target/X86:
X86FloatingPoint.cpp X86ISelLowering.cpp X86ISelLowering.h
X86InstrFPStack.td X86InstrInfo.cpp X86InstrInfo.td
X86InstrSSE.td X86InstrX86-64.td X86RegisterInfo.cpp
Message-ID: <20070925172226.GB21991@village.us.cray.com>
> +def X86cmp_new : SDNode<"X86ISD::CMP_NEW" , SDTX86CmpTest>;
X86ISD::CMP_NEW SelectionDAG nodes return an i32 flags value; however, the
corresponding X86cmp_new pattern-matching nodes use SDTX86CmpTest, which has
no result values. Is this intentional? If so, how is the discrepancy handled?
> +def : Pat<(parallel (X86cmp_new GR8:$src1, 0), (implicit EFLAGS)),
> + (NEW_TEST8rr GR8:$src1, GR8:$src1)>;
In the SelectionDAG IR, an SDNode can return multiple results. However, in
this GCC-RTL-like pattern language, where many things are supposed to directly
correspond to SelectionDAG counterparts, nodes can return at most one result.
They must be grouped together in a parallel to represent operations that have
multiple results. It seems like this will result in more discrepancies. Am I
misunderstanding something?
Dan
--
Dan Gohman, Cray Inc.
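
The multiple-result question above can be made concrete without SelectionDAG.
A compare conceptually produces two things at once, a value and a flags word;
an SDNode can carry both as separate typed results, while the .td pattern
language gives each node at most one result and uses (parallel ...) with
(implicit EFLAGS) to express the extra one. The toy C++ below models the
two-result operation directly; it is illustrative only and not LLVM code.

#include <stdint.h>

// A compare with two results: the explicit value and an "implicit" flags
// word, analogous to a node result plus (implicit EFLAGS) in the patterns.
struct CmpResults {
  bool Equal;
  uint32_t Flags;
};

static CmpResults cmp(uint32_t A, uint32_t B) {
  CmpResults R;
  R.Equal = (A == B);
  R.Flags = R.Equal ? 0x40u : 0u;  // ZF-like bit, purely illustrative
  return R;
}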
From dalej at apple.com Tue Sep 25 10:25:01 2007
From: dalej at apple.com (Dale Johannesen)
Date: Tue, 25 Sep 2007 17:25:01 -0000
Subject: [llvm-commits] [llvm] r42298 - /llvm/trunk/lib/Support/APFloat.cpp
Message-ID: <200709251725.l8PHP7nS019168@zion.cs.uiuc.edu>
Author: johannes
Date: Tue Sep 25 12:25:00 2007
New Revision: 42298
URL: http://llvm.org/viewvc/llvm-project?rev=42298&view=rev
Log:
Fix long double<->shorter FP type conversions
of zero, infinity, and NaNs.
Modified:
llvm/trunk/lib/Support/APFloat.cpp
Modified: llvm/trunk/lib/Support/APFloat.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Support/APFloat.cpp?rev=42298&r1=42297&r2=42298&view=diff
==============================================================================
--- llvm/trunk/lib/Support/APFloat.cpp (original)
+++ llvm/trunk/lib/Support/APFloat.cpp Tue Sep 25 12:25:00 2007
@@ -1329,26 +1329,30 @@
/* Handle storage complications. If our new form is wider,
re-allocate our bit pattern into wider storage. If it is
narrower, we ignore the excess parts, but if narrowing to a
- single part we need to free the old storage. */
+ single part we need to free the old storage.
+ Be careful not to reference significandParts for zeroes
+ and infinities, since it aborts. */
if (newPartCount > oldPartCount) {
integerPart *newParts;
-
newParts = new integerPart[newPartCount];
APInt::tcSet(newParts, 0, newPartCount);
- APInt::tcAssign(newParts, significandParts(), oldPartCount);
+ if (category==fcNormal || category==fcNaN)
+ APInt::tcAssign(newParts, significandParts(), oldPartCount);
freeSignificand();
significand.parts = newParts;
} else if (newPartCount < oldPartCount) {
/* Capture any lost fraction through truncation of parts so we get
correct rounding whilst normalizing. */
- lostFraction = lostFractionThroughTruncation
- (significandParts(), oldPartCount, toSemantics.precision);
- if (newPartCount == 1)
- {
- integerPart newPart = significandParts()[0];
- freeSignificand();
- significand.part = newPart;
- }
+ if (category==fcNormal)
+ lostFraction = lostFractionThroughTruncation
+ (significandParts(), oldPartCount, toSemantics.precision);
+ if (newPartCount == 1) {
+ integerPart newPart = 0;
+ if (category==fcNormal || category==fcNaN)
+ newPart = significandParts()[0];
+ freeSignificand();
+ significand.part = newPart;
+ }
}
if(category == fcNormal) {
@@ -1356,6 +1360,19 @@
exponent += toSemantics.precision - semantics->precision;
semantics = &toSemantics;
fs = normalize(rounding_mode, lostFraction);
+ } else if (category == fcNaN) {
+ int shift = toSemantics.precision - semantics->precision;
+ // No normalization here, just truncate
+ if (shift>0)
+ APInt::tcShiftLeft(significandParts(), newPartCount, shift);
+ else if (shift < 0)
+ APInt::tcShiftRight(significandParts(), newPartCount, -shift);
+ // gcc forces the Quiet bit on, which means (float)(double)(float_sNan)
+ // does not give you back the same bits. This is dubious, and we
+ // don't currently do it. You're really supposed to get
+ // an invalid operation signal at runtime, but nobody does that.
+ semantics = &toSemantics;
+ fs = opOK;
} else {
semantics = &toSemantics;
fs = opOK;
@@ -1818,7 +1835,7 @@
} else if (myexponent==0xff && mysignificand==0) {
// exponent, significand meaningless
category = fcInfinity;
- } else if (myexponent==0xff && (mysignificand & 0x400000)) {
+ } else if (myexponent==0xff && mysignificand!=0) {
// sign, exponent, significand meaningless
category = fcNaN;
*significandParts() = mysignificand;
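(A usage sketch of the hazard this commit removes, assuming the 2007-era APFloat API — APFloat(double), convert, x87DoubleExtended, and rmNearestTiesToEven are the era's names; the function itself is illustrative. Before this change, convert() could read significandParts() for the fcZero and fcInfinity categories, which aborts, and NaN payloads were not shifted when changing precision:)

#include "llvm/ADT/APFloat.h"
using namespace llvm;

void roundTripZero() {
  APFloat F(0.0);  // an IEEEdouble zero, category fcZero
  // Widen to x87 extended and back; depending on host part widths the
  // old code hit significandParts() on a zero here and aborted.
  F.convert(APFloat::x87DoubleExtended, APFloat::rmNearestTiesToEven);
  F.convert(APFloat::IEEEdouble, APFloat::rmNearestTiesToEven);
}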
From dpatel at apple.com Tue Sep 25 10:31:20 2007
From: dpatel at apple.com (Devang Patel)
Date: Tue, 25 Sep 2007 17:31:20 -0000
Subject: [llvm-commits] [llvm] r42299 - in /llvm/trunk:
lib/Transforms/Scalar/LoopIndexSplit.cpp
test/Transforms/LoopIndexSplit/2007-09-24-UpdateIterationSpace.ll
Message-ID: <200709251731.l8PHVMjT019539@zion.cs.uiuc.edu>
Author: dpatel
Date: Tue Sep 25 12:31:19 2007
New Revision: 42299
URL: http://llvm.org/viewvc/llvm-project?rev=42299&view=rev
Log:
Add transformation to update loop iteration space. Now,
for (i=A; i<N; ++i) {
  if (i < X && i > Y)
    do_something();
}
is transformed into
U=min(N,X); L=max(A,Y);
for (i=L; i<U; ++i)
  do_something();
Added:
    llvm/trunk/test/Transforms/LoopIndexSplit/2007-09-24-UpdateIterationSpace.ll
Modified:
    llvm/trunk/lib/Transforms/Scalar/LoopIndexSplit.cpp
Modified: llvm/trunk/lib/Transforms/Scalar/LoopIndexSplit.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopIndexSplit.cpp?rev=42299&r1=42298&r2=42299&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Scalar/LoopIndexSplit.cpp (original)
+++ llvm/trunk/lib/Transforms/Scalar/LoopIndexSplit.cpp Tue Sep 25 12:31:19 2007
+ if (ExitCondition->getPredicate() == ICmpInst::ICMP_SGT
+ || ExitCondition->getPredicate() == ICmpInst::ICMP_UGT
+ || ExitCondition->getPredicate() == ICmpInst::ICMP_SGE
+ || ExitCondition->getPredicate() == ICmpInst::ICMP_UGE) {
+ ExitCondition->swapOperands();
+ if (ExitValueNum)
+ ExitValueNum = 0;
+ else
+ ExitValueNum = 1;
+ }
+
+ Value *NUB = NULL;
+ Value *NLB = NULL;
+ Value *UB = ExitCondition->getOperand(ExitValueNum);
+ const Type *Ty = NV->getType();
+ bool Sign = ExitCondition->isSignedPredicate();
+ BasicBlock *Preheader = L->getLoopPreheader();
+ Instruction *PHTerminator = Preheader->getTerminator();
+
+ assert (NV && "Unexpected value");
+
switch (CI->getPredicate()) {
case ICmpInst::ICMP_ULE:
case ICmpInst::ICMP_SLE:
@@ -740,9 +761,15 @@
// for (i = LB; i < NUB ; ++i)
// LOOP_BODY
//
-
-
-
+ if (ExitCondition->getPredicate() == ICmpInst::ICMP_SLT
+ || ExitCondition->getPredicate() == ICmpInst::ICMP_ULT) {
+ Value *A = BinaryOperator::createAdd(NV, ConstantInt::get(Ty, 1, Sign),
+ "lsplit.add", PHTerminator);
+ Value *C = new ICmpInst(Sign ? ICmpInst::ICMP_SLT : ICmpInst::ICMP_ULT,
+ A, UB,"lsplit,c", PHTerminator);
+ NUB = new SelectInst (C, A, UB, "lsplit.nub", PHTerminator);
+ }
+
// for (i = LB; i <= UB; ++i)
// if (i <= NV && ...)
// LOOP_BODY
@@ -752,6 +779,12 @@
// for (i = LB; i <= NUB ; ++i)
// LOOP_BODY
//
+ else if (ExitCondition->getPredicate() == ICmpInst::ICMP_SLE
+ || ExitCondition->getPredicate() == ICmpInst::ICMP_ULE) {
+ Value *C = new ICmpInst(Sign ? ICmpInst::ICMP_SLT : ICmpInst::ICMP_ULT,
+ NV, UB, "lsplit.c", PHTerminator);
+ NUB = new SelectInst (C, NV, UB, "lsplit.nub", PHTerminator);
+ }
break;
case ICmpInst::ICMP_ULT:
case ICmpInst::ICMP_SLT:
@@ -764,8 +797,12 @@
// for (i = LB; i < NUB ; ++i)
// LOOP_BODY
//
-
-
+ if (ExitCondition->getPredicate() == ICmpInst::ICMP_SLT
+ || ExitCondition->getPredicate() == ICmpInst::ICMP_ULT) {
+ Value *C = new ICmpInst(Sign ? ICmpInst::ICMP_SLT : ICmpInst::ICMP_ULT,
+ NV, UB, "lsplit.c", PHTerminator);
+ NUB = new SelectInst (C, NV, UB, "lsplit.nub", PHTerminator);
+ }
// for (i = LB; i <= UB; ++i)
// if (i < NV && ...)
@@ -776,6 +813,14 @@
// for (i = LB; i <= NUB ; ++i)
// LOOP_BODY
//
+ else if (ExitCondition->getPredicate() == ICmpInst::ICMP_SLE
+ || ExitCondition->getPredicate() == ICmpInst::ICMP_ULE) {
+ Value *S = BinaryOperator::createSub(NV, ConstantInt::get(Ty, 1, Sign),
+ "lsplit.add", PHTerminator);
+ Value *C = new ICmpInst(Sign ? ICmpInst::ICMP_SLT : ICmpInst::ICMP_ULT,
+ S, UB, "lsplit.c", PHTerminator);
+ NUB = new SelectInst (C, S, UB, "lsplit.nub", PHTerminator);
+ }
break;
case ICmpInst::ICMP_UGE:
case ICmpInst::ICMP_SGE:
@@ -788,6 +833,11 @@
// for (i = NLB; i (< or <=) UB ; ++i)
// LOOP_BODY
//
+ {
+ Value *C = new ICmpInst(Sign ? ICmpInst::ICMP_SLT : ICmpInst::ICMP_ULT,
+ NV, StartValue, "lsplit.c", PHTerminator);
+ NLB = new SelectInst (C, StartValue, NV, "lsplit.nlb", PHTerminator);
+ }
break;
case ICmpInst::ICMP_UGT:
case ICmpInst::ICMP_SGT:
@@ -800,10 +850,26 @@
// for (i = NLB; i (< or <=) UB ; ++i)
// LOOP_BODY
//
+ {
+ Value *A = BinaryOperator::createAdd(NV, ConstantInt::get(Ty, 1, Sign),
+ "lsplit.add", PHTerminator);
+ Value *C = new ICmpInst(Sign ? ICmpInst::ICMP_SLT : ICmpInst::ICMP_ULT,
+ A, StartValue, "lsplit.c", PHTerminator);
+ NLB = new SelectInst (C, StartValue, A, "lsplit.nlb", PHTerminator);
+ }
break;
default:
assert ( 0 && "Unexpected split condition predicate");
}
+
+ if (NLB) {
+ unsigned i = IndVar->getBasicBlockIndex(Preheader);
+ IndVar->setIncomingValue(i, NLB);
+ }
+
+ if (NUB) {
+ ExitCondition->setOperand(ExitValueNum, NUB);
+ }
}
/// updateLoopIterationSpace - Current loop body is covered by an AND
/// instruction whose operands compares induction variables with loop
@@ -811,6 +877,9 @@
/// updating appropriate start and end values for induction variable.
bool LoopIndexSplit::updateLoopIterationSpace(SplitInfo &SD) {
BasicBlock *Header = L->getHeader();
+ BasicBlock *ExitingBlock = ExitCondition->getParent();
+ BasicBlock *SplitCondBlock = SD.SplitCondition->getParent();
+
ICmpInst *Op0 = cast<ICmpInst>(SD.SplitCondition->getOperand(0));
ICmpInst *Op1 = cast<ICmpInst>(SD.SplitCondition->getOperand(1));
@@ -865,11 +934,83 @@
// loop may not be eliminated.
if (!safeExitingBlock(SD, ExitCondition->getParent()))
return false;
-
+
+ // Verify that loop exiting block has only two predecessors, where one predecessor
+ // is split condition block. The other predecessor will become exiting block's
+ // dominator after CFG is updated. TODO : Handle CFGs where exiting block has
+ // more than two predecessors. This requires extra work in updating dominator
+ // information.
+ BasicBlock *ExitingBBPred = NULL;
+ for (pred_iterator PI = pred_begin(ExitingBlock), PE = pred_end(ExitingBlock);
+ PI != PE; ++PI) {
+ BasicBlock *BB = *PI;
+ if (SplitCondBlock == BB)
+ continue;
+ if (ExitingBBPred)
+ return false;
+ else
+ ExitingBBPred = BB;
+ }
+
+ // Update loop bounds to absorb Op0 check.
updateLoopBounds(Op0);
+ // Update loop bounds to absorb Op1 check.
updateLoopBounds(Op1);
+
// Update CFG
- return false;
+
+ // Unconditionally connect split block to its remaining successor.
+ BranchInst *SplitTerminator =
+ cast<BranchInst>(SplitCondBlock->getTerminator());
+ BasicBlock *Succ0 = SplitTerminator->getSuccessor(0);
+ BasicBlock *Succ1 = SplitTerminator->getSuccessor(1);
+ if (Succ0 == ExitCondition->getParent())
+ SplitTerminator->setUnconditionalDest(Succ1);
+ else
+ SplitTerminator->setUnconditionalDest(Succ0);
+
+ // Remove split condition.
+ SD.SplitCondition->eraseFromParent();
+ if (Op0->use_begin() == Op0->use_end())
+ Op0->eraseFromParent();
+ if (Op1->use_begin() == Op1->use_end())
+ Op1->eraseFromParent();
+
+ BranchInst *ExitInsn =
+ dyn_cast<BranchInst>(ExitingBlock->getTerminator());
+ assert (ExitInsn && "Unable to find suitable loop exit branch");
+ BasicBlock *ExitBlock = ExitInsn->getSuccessor(1);
+ if (L->contains(ExitBlock))
+ ExitBlock = ExitInsn->getSuccessor(0);
+
+ // Update dominator info. Now, ExitingBlock has only one predecessor,
+ // ExitingBBPred, and it is ExitingBlock's immediate dominator.
+ DT->changeImmediateDominator(ExitingBlock, ExitingBBPred);
+
+ // If ExitingBlock is a member of loop BB's DF list then replace it with
+ // loop header and exit block.
+ for (Loop::block_iterator I = L->block_begin(), E = L->block_end();
+ I != E; ++I) {
+ BasicBlock *BB = *I;
+ if (BB == Header || BB == ExitingBlock)
+ continue;
+ DominanceFrontier::iterator BBDF = DF->find(BB);
+ DominanceFrontier::DomSetType::iterator DomSetI = BBDF->second.begin();
+ DominanceFrontier::DomSetType::iterator DomSetE = BBDF->second.end();
+ while (DomSetI != DomSetE) {
+ DominanceFrontier::DomSetType::iterator CurrentItr = DomSetI;
+ ++DomSetI;
+ BasicBlock *DFBB = *CurrentItr;
+ if (DFBB == ExitingBlock) {
+ BBDF->second.erase(DFBB);
+ BBDF->second.insert(Header);
+ if (Header != ExitingBlock)
+ BBDF->second.insert(ExitBlock);
+ }
+ }
+ }
+
+ return return;
}
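(At the source level, the transformation this commit implements looks like the following sketch. A, N, X, Y, and do_something are the placeholders from the commit log, not names from the pass; the Y+1 adjustment corresponds to the lsplit.add instruction the SGT/UGT case emits above:)

void do_something();

// Before: the split condition guards every iteration.
void before(int A, int N, int X, int Y) {
  for (int i = A; i < N; ++i)
    if (i < X && i > Y)
      do_something();
}

// After updateLoopIterationSpace (conceptually): the checks are absorbed
// into new bounds, computed in the preheader with the ICmp+Select
// min/max idiom visible in the diff.
void after(int A, int N, int X, int Y) {
  int U = (N < X) ? N : X;          // min(N, X)   -> new upper bound (NUB)
  int L = (A > Y + 1) ? A : Y + 1;  // max(A, Y+1) -> new lower bound (NLB)
  for (int i = L; i < U; ++i)
    do_something();
}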
Added: llvm/trunk/test/Transforms/LoopIndexSplit/2007-09-24-UpdateIterationSpace.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopIndexSplit/2007-09-24-UpdateIterationSpace.ll?rev=42299&view=auto
==============================================================================
--- llvm/trunk/test/Transforms/LoopIndexSplit/2007-09-24-UpdateIterationSpace.ll (added)
+++ llvm/trunk/test/Transforms/LoopIndexSplit/2007-09-24-UpdateIterationSpace.ll Tue Sep 25 12:31:19 2007
@@ -0,0 +1,57 @@
+
+; Update loop iteration space to eliminate condition inside loop.
+; RUN: llvm-as < %s | opt -loop-index-split | llvm-dis | not grep bothcond
+define void @test(float* %x, i32 %ndat, float** %y, float %xcen, i32 %xmin, i32 %xmax, float %sigmal, float %contribution) {
+entry:
+ %tmp519 = icmp sgt i32 %xmin, %xmax ; [#uses=1]
+ br i1 %tmp519, label %return, label %bb.preheader
+
+bb.preheader: ; preds = %entry
+ %tmp3031 = fpext float %contribution to double ; [#uses=1]
+ %tmp32 = mul double %tmp3031, 5.000000e-01 ; [#uses=1]
+ %tmp3839 = fpext float %sigmal to double ; [#uses=1]
+ br label %bb
+
+bb: ; preds = %bb.preheader, %cond_next45
+ %i.01.0 = phi i32 [ %tmp47, %cond_next45 ], [ %xmin, %bb.preheader ] ; [#uses=6]
+ %tmp2 = icmp sgt i32 %i.01.0, -1 ; [#uses=1]
+ %tmp6 = icmp slt i32 %i.01.0, %ndat ; [#uses=1]
+ %bothcond = and i1 %tmp2, %tmp6 ; [#uses=1]
+ br i1 %bothcond, label %cond_true9, label %cond_next45
+
+cond_true9: ; preds = %bb
+ %tmp12 = getelementptr float* %x, i32 %i.01.0 ; [#uses=1]
+ %tmp13 = load float* %tmp12, align 4 ; [#uses=1]
+ %tmp15 = sub float %xcen, %tmp13 ; [#uses=1]
+ %tmp16 = tail call float @fabsf( float %tmp15 ) ; [#uses=1]
+ %tmp18 = fdiv float %tmp16, %sigmal ; [#uses=1]
+ %tmp21 = load float** %y, align 4 ; [#uses=2]
+ %tmp27 = getelementptr float* %tmp21, i32 %i.01.0 ; [#uses=1]
+ %tmp28 = load float* %tmp27, align 4 ; [#uses=1]
+ %tmp2829 = fpext float %tmp28 to double ; [#uses=1]
+ %tmp34 = sub float -0.000000e+00, %tmp18 ; [#uses=1]
+ %tmp3435 = fpext float %tmp34 to double ; [#uses=1]
+ %tmp36 = tail call double @exp( double %tmp3435 ) ; [#uses=1]
+ %tmp37 = mul double %tmp32, %tmp36 ; [#uses=1]
+ %tmp40 = fdiv double %tmp37, %tmp3839 ; [#uses=1]
+ %tmp41 = add double %tmp2829, %tmp40 ; [#uses=1]
+ %tmp4142 = fptrunc double %tmp41 to float ; [#uses=1]
+ %tmp44 = getelementptr float* %tmp21, i32 %i.01.0 ; [#uses=1]
+ store float %tmp4142, float* %tmp44, align 4
+ br label %cond_next45
+
+cond_next45: ; preds = %bb, %cond_true9
+ %tmp47 = add i32 %i.01.0, 1 ; [#uses=2]
+ %tmp51 = icmp sgt i32 %tmp47, %xmax ; [#uses=1]
+ br i1 %tmp51, label %return.loopexit, label %bb
+
+return.loopexit: ; preds = %cond_next45
+ br label %return
+
+return: ; preds = %return.loopexit, %entry
+ ret void
+}
+
+declare float @fabsf(float)
+
+declare double @exp(double)
From dpatel at apple.com Tue Sep 25 10:43:08 2007
From: dpatel at apple.com (Devang Patel)
Date: Tue, 25 Sep 2007 17:43:08 -0000
Subject: [llvm-commits] [llvm] r42300 -
/llvm/trunk/lib/Transforms/Scalar/LoopIndexSplit.cpp
Message-ID: <200709251743.l8PHh8FV020177@zion.cs.uiuc.edu>
Author: dpatel
Date: Tue Sep 25 12:43:08 2007
New Revision: 42300
URL: http://llvm.org/viewvc/llvm-project?rev=42300&view=rev
Log:
doh..
Modified:
llvm/trunk/lib/Transforms/Scalar/LoopIndexSplit.cpp
Modified: llvm/trunk/lib/Transforms/Scalar/LoopIndexSplit.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopIndexSplit.cpp?rev=42300&r1=42299&r2=42300&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Scalar/LoopIndexSplit.cpp (original)
+++ llvm/trunk/lib/Transforms/Scalar/LoopIndexSplit.cpp Tue Sep 25 12:43:08 2007
@@ -1010,7 +1010,7 @@
}
}
- return return;
+ return true;
}
From evan.cheng at apple.com Tue Sep 25 10:47:38 2007
From: evan.cheng at apple.com (Evan Cheng)
Date: Tue, 25 Sep 2007 17:47:38 -0000
Subject: [llvm-commits] [llvm] r42302 -
/llvm/trunk/test/CodeGen/X86/2006-11-17-IllegalMove.ll
Message-ID: <200709251747.l8PHlcYM020411@zion.cs.uiuc.edu>
Author: evancheng
Date: Tue Sep 25 12:47:38 2007
New Revision: 42302
URL: http://llvm.org/viewvc/llvm-project?rev=42302&view=rev
Log:
Forgot to check in the changes. Fix test case so it doesn't break with any scheduling changes.
Modified:
llvm/trunk/test/CodeGen/X86/2006-11-17-IllegalMove.ll
Modified: llvm/trunk/test/CodeGen/X86/2006-11-17-IllegalMove.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2006-11-17-IllegalMove.ll?rev=42302&r1=42301&r2=42302&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/2006-11-17-IllegalMove.ll (original)
+++ llvm/trunk/test/CodeGen/X86/2006-11-17-IllegalMove.ll Tue Sep 25 12:47:38 2007
@@ -1,7 +1,7 @@
; RUN: llvm-upgrade < %s | llvm-as | llc -march=x86-64 | \
-; RUN: not grep {movb %sil, %ah}
+; RUN: grep movb | count 2
; RUN: llvm-upgrade < %s | llvm-as | llc -march=x86-64 | \
-; RUN: grep {movzbw %al, %ax}
+; RUN: grep movzbw
void %handle_vector_size_attribute() {
entry:
From dalej at apple.com Tue Sep 25 10:50:55 2007
From: dalej at apple.com (Dale Johannesen)
Date: Tue, 25 Sep 2007 17:50:55 -0000
Subject: [llvm-commits] [llvm] r42303 - in /llvm/trunk/test/CodeGen/X86:
ldzero.ll nancvt.ll
Message-ID: <200709251750.l8PHotYr020560@zion.cs.uiuc.edu>
Author: johannes
Date: Tue Sep 25 12:50:55 2007
New Revision: 42303
URL: http://llvm.org/viewvc/llvm-project?rev=42303&view=rev
Log:
Some tests for APFloat conversions.
Added:
llvm/trunk/test/CodeGen/X86/ldzero.ll
llvm/trunk/test/CodeGen/X86/nancvt.ll
Added: llvm/trunk/test/CodeGen/X86/ldzero.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/ldzero.ll?rev=42303&view=auto
==============================================================================
--- llvm/trunk/test/CodeGen/X86/ldzero.ll (added)
+++ llvm/trunk/test/CodeGen/X86/ldzero.ll Tue Sep 25 12:50:55 2007
@@ -0,0 +1,43 @@
+; RUN: llvm-as < %s | llc
+; verify PR 1700 is still fixed
+; ModuleID = 'hh.c'
+target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128"
+target triple = "i686-apple-darwin8"
+
+define x86_fp80 @x() {
+entry:
+ %retval = alloca x86_fp80, align 16 ; [#uses=2]
+ %tmp = alloca x86_fp80, align 16 ; [#uses=2]
+ %d = alloca double, align 8 ; [#uses=2]
+ %"alloca point" = bitcast i32 0 to i32 ; [#uses=0]
+ store double 0.000000e+00, double* %d, align 8
+ %tmp1 = load double* %d, align 8 ; [#uses=1]
+ %tmp12 = fpext double %tmp1 to x86_fp80 ; [#uses=1]
+ store x86_fp80 %tmp12, x86_fp80* %tmp, align 16
+ %tmp3 = load x86_fp80* %tmp, align 16 ; [#uses=1]
+ store x86_fp80 %tmp3, x86_fp80* %retval, align 16
+ br label %return
+
+return: ; preds = %entry
+ %retval4 = load x86_fp80* %retval ; [#uses=1]
+ ret x86_fp80 %retval4
+}
+
+define double @y() {
+entry:
+ %retval = alloca double, align 8 ; [#uses=2]
+ %tmp = alloca double, align 8 ; [#uses=2]
+ %ld = alloca x86_fp80, align 16 ; [#uses=2]
+ %"alloca point" = bitcast i32 0 to i32 ; [#uses=0]
+ store x86_fp80 0xK00000000000000000000, x86_fp80* %ld, align 16
+ %tmp1 = load x86_fp80* %ld, align 16 ; [#uses=1]
+ %tmp12 = fptrunc x86_fp80 %tmp1 to double ; [#uses=1]
+ store double %tmp12, double* %tmp, align 8
+ %tmp3 = load double* %tmp, align 8 ; [#uses=1]
+ store double %tmp3, double* %retval, align 8
+ br label %return
+
+return: ; preds = %entry
+ %retval4 = load double* %retval ; [#uses=1]
+ ret double %retval4
+}
Added: llvm/trunk/test/CodeGen/X86/nancvt.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/nancvt.ll?rev=42303&view=auto
==============================================================================
--- llvm/trunk/test/CodeGen/X86/nancvt.ll (added)
+++ llvm/trunk/test/CodeGen/X86/nancvt.ll Tue Sep 25 12:50:55 2007
@@ -0,0 +1,180 @@
+; RUN: llvm-as < %s | opt -std-compile-opts | llc | grep 2147027116 | count 3
+; RUN: llvm-as < %s | opt -std-compile-opts | llc | grep 2147228864 | count 3
+; RUN: llvm-as < %s | opt -std-compile-opts | llc | grep 2146502828 | count 3
+; RUN: llvm-as < %s | opt -std-compile-opts | llc | grep 2143034560 | count 3
+; Compile time conversions of NaNs.
+; ModuleID = 'nan2.c'
+target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128"
+target triple = "i686-apple-darwin8"
+ %struct..0anon = type { float }
+ %struct..1anon = type { double }
@fnan = constant [3 x i32] [ i32 2143831397, i32 2143831396, i32 2143831398 ] ; <[3 x i32]*> [#uses=1]
@dnan = constant [3 x i64] [ i64 9223235251041752696, i64 9223235251041752697, i64 9223235250773317239 ], align 8 ; <[3 x i64]*> [#uses=1]
@fsnan = constant [3 x i32] [ i32 2139637093, i32 2139637092, i32 2139637094 ] ; <[3 x i32]*> [#uses=1]
@dsnan = constant [3 x i64] [ i64 9220983451228067448, i64 9220983451228067449, i64 9220983450959631991 ], align 8 ; <[3 x i64]*> [#uses=1]
@.str = internal constant [10 x i8] c"%08x%08x\0A\00" ; <[10 x i8]*> [#uses=2]
@.str1 = internal constant [6 x i8] c"%08x\0A\00" ; <[6 x i8]*> [#uses=2]
+
+define i32 @main() {
+entry:
+ %retval = alloca i32, align 4 ; [#uses=1]
+ %i = alloca i32, align 4 ; [#uses=20]
+ %uf = alloca %struct..0anon, align 4 ; <%struct..0anon*> [#uses=8]
+ %ud = alloca %struct..1anon, align 8 ; <%struct..1anon*> [#uses=10]
+ %"alloca point" = bitcast i32 0 to i32 ; [#uses=0]
+ store i32 0, i32* %i, align 4
+ br label %bb23
+
+bb: ; preds = %bb23
+ %tmp = load i32* %i, align 4 ; [#uses=1]
+ %tmp1 = getelementptr [3 x i32]* @fnan, i32 0, i32 %tmp ; [#uses=1]
+ %tmp2 = load i32* %tmp1, align 4 ; [#uses=1]
+ %tmp3 = getelementptr %struct..0anon* %uf, i32 0, i32 0 ; [#uses=1]
+ %tmp34 = bitcast float* %tmp3 to i32* ; [#uses=1]
+ store i32 %tmp2, i32* %tmp34, align 4
+ %tmp5 = getelementptr %struct..0anon* %uf, i32 0, i32 0 ; [#uses=1]
+ %tmp6 = load float* %tmp5, align 4 ; [#uses=1]
+ %tmp67 = fpext float %tmp6 to double ; [#uses=1]
+ %tmp8 = getelementptr %struct..1anon* %ud, i32 0, i32 0 ; [#uses=1]
+ store double %tmp67, double* %tmp8, align 8
+ %tmp9 = getelementptr %struct..1anon* %ud, i32 0, i32 0 ; [#uses=1]
+ %tmp910 = bitcast double* %tmp9 to i64* ; [#uses=1]
+ %tmp11 = load i64* %tmp910, align 8 ; [#uses=1]
+ %tmp1112 = trunc i64 %tmp11 to i32 ; [#uses=1]
+ %tmp13 = and i32 %tmp1112, -1 ; [#uses=1]
+ %tmp14 = getelementptr %struct..1anon* %ud, i32 0, i32 0 ; [#uses=1]
+ %tmp1415 = bitcast double* %tmp14 to i64* ; [#uses=1]
+ %tmp16 = load i64* %tmp1415, align 8 ; [#uses=1]
+ %.cast = zext i32 32 to i64 ; [#uses=1]
+ %tmp17 = ashr i64 %tmp16, %.cast ; [#uses=1]
+ %tmp1718 = trunc i64 %tmp17 to i32 ; [#uses=1]
+ %tmp19 = getelementptr [10 x i8]* @.str, i32 0, i32 0 ; [#uses=1]
+ %tmp20 = call i32 (i8*, ...)* @printf( i8* %tmp19, i32 %tmp1718, i32 %tmp13 ) ; [#uses=0]
+ %tmp21 = load i32* %i, align 4 ; [#uses=1]
+ %tmp22 = add i32 %tmp21, 1 ; [#uses=1]
+ store i32 %tmp22, i32* %i, align 4
+ br label %bb23
+
+bb23: ; preds = %bb, %entry
+ %tmp24 = load i32* %i, align 4 ; [#uses=1]
+ %tmp25 = icmp sle i32 %tmp24, 2 ; [#uses=1]
+ %tmp2526 = zext i1 %tmp25 to i8 ; [#uses=1]
+ %toBool = icmp ne i8 %tmp2526, 0 ; [#uses=1]
+ br i1 %toBool, label %bb, label %bb27
+
+bb27: ; preds = %bb23
+ store i32 0, i32* %i, align 4
+ br label %bb46
+
+bb28: ; preds = %bb46
+ %tmp29 = load i32* %i, align 4 ; [#uses=1]
+ %tmp30 = getelementptr [3 x i64]* @dnan, i32 0, i32 %tmp29 ; [#uses=1]
+ %tmp31 = load i64* %tmp30, align 8 ; [#uses=1]
+ %tmp32 = getelementptr %struct..1anon* %ud, i32 0, i32 0 ; [#uses=1]
+ %tmp3233 = bitcast double* %tmp32 to i64* ;