[llvm] r251800 - [SCEV][LV] Add SCEV Predicates and use them to re-implement stride versioning
Adam Nemet via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 5 10:07:39 PST 2015
Thanks!
> On Nov 5, 2015, at 7:59 AM, Silviu Baranga <Silviu.Baranga at arm.com> wrote:
>
> Hi Adam,
>
> I've added a comment in r252171.
>
> Thanks,
> Silviu
>
>>> -----Original Message-----
>>> From: anemet at apple.com <mailto:anemet at apple.com> [mailto:anemet at apple.com <mailto:anemet at apple.com>]
>>> Sent: 05 November 2015 05:40
>>> To: Silviu Baranga
>>> Cc: llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>
>>> Subject: Re: [llvm] r251800 - [SCEV][LV] Add SCEV Predicates and use them
>>> to re-implement stride versioning
>>>
>>>
>>>> On Nov 2, 2015, at 6:41 AM, Silviu Baranga via llvm-commits <llvm-
>>> commits at lists.llvm.org> wrote:
>>>>
>>>> Author: sbaranga
>>>> Date: Mon Nov 2 08:41:02 2015
>>>> New Revision: 251800
>>>>
>>>> URL: http://llvm.org/viewvc/llvm-project?rev=251800&view=rev
>>>> Log:
>>>> [SCEV][LV] Add SCEV Predicates and use them to re-implement stride
>>> versioning
>>>>
>>>> Summary:
>>>> SCEV Predicates represent conditions that typically cannot be derived
>>> from
>>>> static analysis, but can be used to reduce SCEV expressions to forms which
>>> are
>>>> usable for different optimizers.
>>>>
>>>> ScalarEvolution now has the rewriteUsingPredicate method which can
>>> simplify a
>>>> SCEV expression using a SCEVPredicateSet. The normal workflow of a pass
>>> using
>>>> SCEVPredicates would be to hold a SCEVPredicateSet and every time
>>> assumptions
>>>> need to be made a new SCEV Predicate would be created and added to
>>> the set.
>>>> Each time after calling getSCEV, the user will call the rewriteUsingPredicate
>>>> method.
>>>>
>>>> We add two types of predicates
>>>> SCEVPredicateSet - implements a set of predicates
>>>> SCEVEqualPredicate - tests for equality between two SCEV expressions
>>>>
>>>> We use the SCEVEqualPredicate to re-implement stride versioning. Every
>>> time we
>>>> version a stride, we will add a SCEVEqualPredicate to the context.
>>>> Instead of adding specific stride checks, LoopVectorize now adds a more
>>>> generic SCEV check.
>>>>
>>>> We only need to add support for this in the LoopVectorizer since this is the
>>>> only pass that will do stride versioning.
>>>>
>>>> Reviewers: mzolotukhin, anemet, hfinkel, sanjoy
>>>>
>>>> Subscribers: sanjoy, hfinkel, rengolin, jmolloy, llvm-commits
>>>>
>>>> Differential Revision: http://reviews.llvm.org/D13595
>>>>
>>>> Modified:
>>>> llvm/trunk/include/llvm/Analysis/LoopAccessAnalysis.h
>>>> llvm/trunk/include/llvm/Analysis/ScalarEvolution.h
>>>> llvm/trunk/include/llvm/Analysis/ScalarEvolutionExpander.h
>>>> llvm/trunk/lib/Analysis/LoopAccessAnalysis.cpp
>>>> llvm/trunk/lib/Analysis/ScalarEvolution.cpp
>>>> llvm/trunk/lib/Analysis/ScalarEvolutionExpander.cpp
>>>> llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp
>>>>
>>>> Modified: llvm/trunk/include/llvm/Analysis/LoopAccessAnalysis.h
>>>> URL: http://llvm.org/viewvc/llvm-
>>> project/llvm/trunk/include/llvm/Analysis/LoopAccessAnalysis.h?rev=25180
>>> 0&r1=251799&r2=251800&view=diff
>>>>
>>> ==========================================================
>>> ====================
>>>> --- llvm/trunk/include/llvm/Analysis/LoopAccessAnalysis.h (original)
>>>> +++ llvm/trunk/include/llvm/Analysis/LoopAccessAnalysis.h Mon Nov 2
>>> 08:41:02 2015
>>>> @@ -32,6 +32,7 @@ class DataLayout;
>>>> class ScalarEvolution;
>>>> class Loop;
>>>> class SCEV;
>>>> +class SCEVUnionPredicate;
>>>>
>>>> /// Optimization analysis message produced during vectorization.
>>> Messages inform
>>>> /// the user why vectorization did not occur.
>>>> @@ -176,10 +177,11 @@ public:
>>>> const SmallVectorImpl<Instruction *> &Instrs) const;
>>>> };
>>>>
>>>> - MemoryDepChecker(ScalarEvolution *Se, const Loop *L)
>>>> + MemoryDepChecker(ScalarEvolution *Se, const Loop *L,
>>>> + SCEVUnionPredicate &Preds)
>>>> : SE(Se), InnermostLoop(L), AccessIdx(0),
>>>> ShouldRetryWithRuntimeCheck(false), SafeForVectorization(true),
>>>> - RecordInterestingDependences(true) {}
>>>> + RecordInterestingDependences(true), Preds(Preds) {}
>>>>
>>>> /// \brief Register the location (instructions are given increasing
>>> numbers)
>>>> /// of a write access.
>>>> @@ -289,6 +291,15 @@ private:
>>>> /// \brief Check whether the data dependence could prevent store-load
>>>> /// forwarding.
>>>> bool couldPreventStoreLoadForward(unsigned Distance, unsigned
>>> TypeByteSize);
>>>> +
>>>> + /// The SCEV predicate containing all the SCEV-related assumptions.
>>>> + /// The dependence checker needs this in order to convert SCEVs of
>>> pointers
>>>> + /// to more accurate expressions in the context of existing assumptions.
>>>> + /// We also need this in case assumptions about SCEV expressions need
>>> to
>>>> + /// be made in order to avoid unknown dependences. For example we
>>> might
>>>> + /// assume a unit stride for a pointer in order to prove that a memory
>>> access
>>>> + /// is strided and doesn't wrap.
>>>> + SCEVUnionPredicate &Preds;
>>>> };
>>>>
>>>> /// \brief Holds information about the memory runtime legality checks to
>>> verify
>>>> @@ -330,8 +341,13 @@ public:
>>>> }
>>>>
>>>> /// Insert a pointer and calculate the start and end SCEVs.
>>>> + /// \p We need Preds in order to compute the SCEV expression of the
>>> pointer
>>>> + /// according to the assumptions that we've made during the analysis.
>>>> + /// The method might also version the pointer stride according to \p
>>> Strides,
>>>> + /// and change \p Preds.
>>>> void insert(Loop *Lp, Value *Ptr, bool WritePtr, unsigned DepSetId,
>>>> - unsigned ASId, const ValueToValueMap &Strides);
>>>> + unsigned ASId, const ValueToValueMap &Strides,
>>>> + SCEVUnionPredicate &Preds);
>>>>
>>>> /// \brief No run-time memory checking is necessary.
>>>> bool empty() const { return Pointers.empty(); }
>>>> @@ -537,6 +553,15 @@ public:
>>>> return StoreToLoopInvariantAddress;
>>>> }
>>>>
>>>> + /// The SCEV predicate contains all the SCEV-related assumptions.
>>>> + /// The is used to keep track of the minimal set of assumptions on SCEV
>>>> + /// expressions that the analysis needs to make in order to return a
>>>> + /// meaningful result. All SCEV expressions during the analysis should be
>>>> + /// re-written (and therefore simplified) according to Preds.
>>>> + /// A user of LoopAccessAnalysis will need to emit the runtime checks
>>>> + /// associated with this predicate.
>>>> + SCEVUnionPredicate Preds;
>>>
>>> Can you please also add a comment probably right before LoopAccessInfo
>>> that both sets of informations it provides (run-time alias checks and
>>> dependence information) are only correct if the predicates contained here
>>> return are true (at run time).
>>>
>>> Thanks,
>>> Adam
>>>
>>>> +
>>>> private:
>>>> /// \brief Analyze the loop. Substitute symbolic strides using Strides.
>>>> void analyzeLoop(const ValueToValueMap &Strides);
>>>> @@ -583,19 +608,26 @@ private:
>>>> Value *stripIntegerCast(Value *V);
>>>>
>>>> ///\brief Return the SCEV corresponding to a pointer with the symbolic
>>> stride
>>>> -///replaced with constant one.
>>>> +/// replaced with constant one, assuming \p Preds is true.
>>>> +///
>>>> +/// If necessary this method will version the stride of the pointer
>>> according
>>>> +/// to \p PtrToStride and therefore add a new predicate to \p Preds.
>>>> ///
>>>> /// If \p OrigPtr is not null, use it to look up the stride value instead of \p
>>>> /// Ptr. \p PtrToStride provides the mapping between the pointer value
>>> and its
>>>> /// stride as collected by LoopVectorizationLegality::collectStridedAccess.
>>>> const SCEV *replaceSymbolicStrideSCEV(ScalarEvolution *SE,
>>>> const ValueToValueMap &PtrToStride,
>>>> - Value *Ptr, Value *OrigPtr = nullptr);
>>>> + SCEVUnionPredicate &Preds, Value *Ptr,
>>>> + Value *OrigPtr = nullptr);
>>>>
>>>> /// \brief Check the stride of the pointer and ensure that it does not wrap
>>> in
>>>> -/// the address space.
>>>> +/// the address space, assuming \p Preds is true.
>>>> +///
>>>> +/// If necessary this method will version the stride of the pointer
>>> according
>>>> +/// to \p PtrToStride and therefore add a new predicate to \p Preds.
>>>> int isStridedPtr(ScalarEvolution *SE, Value *Ptr, const Loop *Lp,
>>>> - const ValueToValueMap &StridesMap);
>>>> + const ValueToValueMap &StridesMap, SCEVUnionPredicate
>>> &Preds);
>>>>
>>>> /// \brief This analysis provides dependence information for the memory
>>> accesses
>>>> /// of a loop.
>>>>
>>>> Modified: llvm/trunk/include/llvm/Analysis/ScalarEvolution.h
>>>> URL: http://llvm.org/viewvc/llvm-
>>> project/llvm/trunk/include/llvm/Analysis/ScalarEvolution.h?rev=251800&r1
>>> =251799&r2=251800&view=diff
>>>>
>>> ==========================================================
>>> ====================
>>>> --- llvm/trunk/include/llvm/Analysis/ScalarEvolution.h (original)
>>>> +++ llvm/trunk/include/llvm/Analysis/ScalarEvolution.h Mon Nov 2
>>> 08:41:02 2015
>>>> @@ -48,10 +48,15 @@ namespace llvm {
>>>> class Loop;
>>>> class LoopInfo;
>>>> class Operator;
>>>> - class SCEVUnknown;
>>>> - class SCEVAddRecExpr;
>>>> class SCEV;
>>>> - template<> struct FoldingSetTrait<SCEV>;
>>>> + class SCEVAddRecExpr;
>>>> + class SCEVConstant;
>>>> + class SCEVExpander;
>>>> + class SCEVPredicate;
>>>> + class SCEVUnknown;
>>>> +
>>>> + template <> struct FoldingSetTrait<SCEV>;
>>>> + template <> struct FoldingSetTrait<SCEVPredicate>;
>>>>
>>>> /// This class represents an analyzed expression in the program. These
>>> are
>>>> /// opaque objects that the client is not allowed to do much with directly.
>>>> @@ -164,6 +169,148 @@ namespace llvm {
>>>> static bool classof(const SCEV *S);
>>>> };
>>>>
>>>> + /// SCEVPredicate - This class represents an assumption made using
>>> SCEV
>>>> + /// expressions which can be checked at run-time.
>>>> + class SCEVPredicate : public FoldingSetNode {
>>>> + friend struct FoldingSetTrait<SCEVPredicate>;
>>>> +
>>>> + /// A reference to an Interned FoldingSetNodeID for this node. The
>>>> + /// ScalarEvolution's BumpPtrAllocator holds the data.
>>>> + FoldingSetNodeIDRef FastID;
>>>> +
>>>> + public:
>>>> + enum SCEVPredicateKind { P_Union, P_Equal };
>>>> +
>>>> + protected:
>>>> + SCEVPredicateKind Kind;
>>>> +
>>>> + public:
>>>> + SCEVPredicate(const FoldingSetNodeIDRef ID, SCEVPredicateKind
>>> Kind);
>>>> +
>>>> + virtual ~SCEVPredicate() {}
>>>> +
>>>> + SCEVPredicateKind getKind() const { return Kind; }
>>>> +
>>>> + /// \brief Returns the estimated complexity of this predicate.
>>>> + /// This is roughly measured in the number of run-time checks
>>> required.
>>>> + virtual unsigned getComplexity() { return 1; }
>>>> +
>>>> + /// \brief Returns true if the predicate is always true. This means that
>>> no
>>>> + /// assumptions were made and nothing needs to be checked at run-
>>> time.
>>>> + virtual bool isAlwaysTrue() const = 0;
>>>> +
>>>> + /// \brief Returns true if this predicate implies \p N.
>>>> + virtual bool implies(const SCEVPredicate *N) const = 0;
>>>> +
>>>> + /// \brief Prints a textual representation of this predicate with an
>>>> + /// indentation of \p Depth.
>>>> + virtual void print(raw_ostream &OS, unsigned Depth = 0) const = 0;
>>>> +
>>>> + /// \brief Returns the SCEV to which this predicate applies, or nullptr
>>>> + /// if this is a SCEVUnionPredicate.
>>>> + virtual const SCEV *getExpr() const = 0;
>>>> + };
>>>> +
>>>> + inline raw_ostream &operator<<(raw_ostream &OS, const
>>> SCEVPredicate &P) {
>>>> + P.print(OS);
>>>> + return OS;
>>>> + }
>>>> +
>>>> + // Specialize FoldingSetTrait for SCEVPredicate to avoid needing to
>>> compute
>>>> + // temporary FoldingSetNodeID values.
>>>> + template <>
>>>> + struct FoldingSetTrait<SCEVPredicate>
>>>> + : DefaultFoldingSetTrait<SCEVPredicate> {
>>>> +
>>>> + static void Profile(const SCEVPredicate &X, FoldingSetNodeID &ID) {
>>>> + ID = X.FastID;
>>>> + }
>>>> +
>>>> + static bool Equals(const SCEVPredicate &X, const FoldingSetNodeID
>>> &ID,
>>>> + unsigned IDHash, FoldingSetNodeID &TempID) {
>>>> + return ID == X.FastID;
>>>> + }
>>>> + static unsigned ComputeHash(const SCEVPredicate &X,
>>>> + FoldingSetNodeID &TempID) {
>>>> + return X.FastID.ComputeHash();
>>>> + }
>>>> + };
>>>> +
>>>> + /// SCEVEqualPredicate - This class represents an assumption that two
>>> SCEV
>>>> + /// expressions are equal, and this can be checked at run-time. We
>>> assume
>>>> + /// that the left hand side is a SCEVUnknown and the right hand side a
>>>> + /// constant.
>>>> + class SCEVEqualPredicate : public SCEVPredicate {
>>>> + /// We assume that LHS == RHS, where LHS is a SCEVUnknown and RHS
>>> a
>>>> + /// constant.
>>>> + const SCEVUnknown *LHS;
>>>> + const SCEVConstant *RHS;
>>>> +
>>>> + public:
>>>> + SCEVEqualPredicate(const FoldingSetNodeIDRef ID, const
>>> SCEVUnknown *LHS,
>>>> + const SCEVConstant *RHS);
>>>> +
>>>> + /// Implementation of the SCEVPredicate interface
>>>> + bool implies(const SCEVPredicate *N) const override;
>>>> + void print(raw_ostream &OS, unsigned Depth = 0) const override;
>>>> + bool isAlwaysTrue() const override;
>>>> + const SCEV *getExpr() const;
>>>> +
>>>> + /// \brief Returns the left hand side of the equality.
>>>> + const SCEVUnknown *getLHS() const { return LHS; }
>>>> +
>>>> + /// \brief Returns the right hand side of the equality.
>>>> + const SCEVConstant *getRHS() const { return RHS; }
>>>> +
>>>> + /// Methods for support type inquiry through isa, cast, and dyn_cast:
>>>> + static inline bool classof(const SCEVPredicate *P) {
>>>> + return P->getKind() == P_Equal;
>>>> + }
>>>> + };
>>>> +
>>>> + /// SCEVUnionPredicate - This class represents a composition of other
>>>> + /// SCEV predicates, and is the class that most clients will interact with.
>>>> + /// This is equivalent to a logical "AND" of all the predicates in the union.
>>>> + class SCEVUnionPredicate : public SCEVPredicate {
>>>> + private:
>>>> + typedef DenseMap<const SCEV *, SmallVector<const SCEVPredicate *,
>>> 4>>
>>>> + PredicateMap;
>>>> +
>>>> + /// Vector with references to all predicates in this union.
>>>> + SmallVector<const SCEVPredicate *, 16> Preds;
>>>> + /// Maps SCEVs to predicates for quick look-ups.
>>>> + PredicateMap SCEVToPreds;
>>>> +
>>>> + public:
>>>> + SCEVUnionPredicate();
>>>> +
>>>> + const SmallVectorImpl<const SCEVPredicate *> &getPredicates() const
>>> {
>>>> + return Preds;
>>>> + }
>>>> +
>>>> + /// \brief Adds a predicate to this union.
>>>> + void add(const SCEVPredicate *N);
>>>> +
>>>> + /// \brief Returns a reference to a vector containing all predicates
>>>> + /// which apply to \p Expr.
>>>> + ArrayRef<const SCEVPredicate *> getPredicatesForExpr(const SCEV
>>> *Expr);
>>>> +
>>>> + /// Implementation of the SCEVPredicate interface
>>>> + bool isAlwaysTrue() const override;
>>>> + bool implies(const SCEVPredicate *N) const override;
>>>> + void print(raw_ostream &OS, unsigned Depth) const;
>>>> + const SCEV *getExpr() const override;
>>>> +
>>>> + /// \brief We estimate the complexity of a union predicate as the size
>>>> + /// number of predicates in the union.
>>>> + unsigned getComplexity() override { return Preds.size(); }
>>>> +
>>>> + /// Methods for support type inquiry through isa, cast, and dyn_cast:
>>>> + static inline bool classof(const SCEVPredicate *P) {
>>>> + return P->getKind() == P_Union;
>>>> + }
>>>> + };
>>>> +
>>>> /// The main scalar evolution driver. Because client code (intentionally)
>>>> /// can't do much with the SCEV objects directly, they must ask this class
>>>> /// for services.
>>>> @@ -1108,6 +1255,12 @@ namespace llvm {
>>>> return F.getParent()->getDataLayout();
>>>> }
>>>>
>>>> + const SCEVPredicate *getEqualPredicate(const SCEVUnknown *LHS,
>>>> + const SCEVConstant *RHS);
>>>> +
>>>> + /// Re-writes the SCEV according to the Predicates in \p Preds.
>>>> + const SCEV *rewriteUsingPredicate(const SCEV *Scev,
>>> SCEVUnionPredicate &A);
>>>> +
>>>> private:
>>>> /// Compute the backedge taken count knowing the interval difference,
>>> the
>>>> /// stride and presence of the equality in the comparison.
>>>> @@ -1128,6 +1281,7 @@ namespace llvm {
>>>>
>>>> private:
>>>> FoldingSet<SCEV> UniqueSCEVs;
>>>> + FoldingSet<SCEVPredicate> UniquePreds;
>>>> BumpPtrAllocator SCEVAllocator;
>>>>
>>>> /// The head of a linked list of all SCEVUnknown values that have been
>>>>
>>>> Modified: llvm/trunk/include/llvm/Analysis/ScalarEvolutionExpander.h
>>>> URL: http://llvm.org/viewvc/llvm-
>>> project/llvm/trunk/include/llvm/Analysis/ScalarEvolutionExpander.h?rev=2
>>> 51800&r1=251799&r2=251800&view=diff
>>>>
>>> ==========================================================
>>> ====================
>>>> --- llvm/trunk/include/llvm/Analysis/ScalarEvolutionExpander.h (original)
>>>> +++ llvm/trunk/include/llvm/Analysis/ScalarEvolutionExpander.h Mon
>>> Nov 2 08:41:02 2015
>>>> @@ -151,6 +151,22 @@ namespace llvm {
>>>> /// block.
>>>> Value *expandCodeFor(const SCEV *SH, Type *Ty, Instruction *I);
>>>>
>>>> + /// \brief Generates a code sequence that evaluates this predicate.
>>>> + /// The inserted instructions will be at position \p Loc.
>>>> + /// The result will be of type i1 and will have a value of 0 when the
>>>> + /// predicate is false and 1 otherwise.
>>>> + Value *expandCodeForPredicate(const SCEVPredicate *Pred,
>>> Instruction *Loc);
>>>> +
>>>> + /// \brief A specialized variant of expandCodeForPredicate, handling
>>> the
>>>> + /// case when we are expanding code for a SCEVEqualPredicate.
>>>> + Value *expandEqualPredicate(const SCEVEqualPredicate *Pred,
>>>> + Instruction *Loc);
>>>> +
>>>> + /// \brief A specialized variant of expandCodeForPredicate, handling
>>> the
>>>> + /// case when we are expanding code for a SCEVUnionPredicate.
>>>> + Value *expandUnionPredicate(const SCEVUnionPredicate *Pred,
>>>> + Instruction *Loc);
>>>> +
>>>> /// \brief Set the current IV increment loop and position.
>>>> void setIVIncInsertPos(const Loop *L, Instruction *Pos) {
>>>> assert(!CanonicalMode &&
>>>>
>>>> Modified: llvm/trunk/lib/Analysis/LoopAccessAnalysis.cpp
>>>> URL: http://llvm.org/viewvc/llvm-
>>> project/llvm/trunk/lib/Analysis/LoopAccessAnalysis.cpp?rev=251800&r1=25
>>> 1799&r2=251800&view=diff
>>>>
>>> ==========================================================
>>> ====================
>>>> --- llvm/trunk/lib/Analysis/LoopAccessAnalysis.cpp (original)
>>>> +++ llvm/trunk/lib/Analysis/LoopAccessAnalysis.cpp Mon Nov 2 08:41:02
>>> 2015
>>>> @@ -89,8 +89,8 @@ Value *llvm::stripIntegerCast(Value *V)
>>>>
>>>> const SCEV *llvm::replaceSymbolicStrideSCEV(ScalarEvolution *SE,
>>>> const ValueToValueMap &PtrToStride,
>>>> + SCEVUnionPredicate &Preds,
>>>> Value *Ptr, Value *OrigPtr) {
>>>> -
>>>> const SCEV *OrigSCEV = SE->getSCEV(Ptr);
>>>>
>>>> // If there is an entry in the map return the SCEV of the pointer with the
>>>> @@ -108,22 +108,28 @@ const SCEV *llvm::replaceSymbolicStrideS
>>>> ValueToValueMap RewriteMap;
>>>> RewriteMap[StrideVal] = One;
>>>>
>>>> - const SCEV *ByOne =
>>>> - SCEVParameterRewriter::rewrite(OrigSCEV, *SE, RewriteMap, true);
>>>> + const auto *U = cast<SCEVUnknown>(SE->getSCEV(StrideVal));
>>>> + const auto *CT =
>>>> + static_cast<const SCEVConstant *>(SE->getOne(StrideVal-
>>>> getType()));
>>>> +
>>>> + Preds.add(SE->getEqualPredicate(U, CT));
>>>> +
>>>> + const SCEV *ByOne = SE->rewriteUsingPredicate(OrigSCEV, Preds);
>>>> DEBUG(dbgs() << "LAA: Replacing SCEV: " << *OrigSCEV << " by: " <<
>>> *ByOne
>>>> << "\n");
>>>> return ByOne;
>>>> }
>>>>
>>>> // Otherwise, just return the SCEV of the original pointer.
>>>> - return SE->getSCEV(Ptr);
>>>> + return OrigSCEV;
>>>> }
>>>>
>>>> void RuntimePointerChecking::insert(Loop *Lp, Value *Ptr, bool WritePtr,
>>>> unsigned DepSetId, unsigned ASId,
>>>> - const ValueToValueMap &Strides) {
>>>> + const ValueToValueMap &Strides,
>>>> + SCEVUnionPredicate &Preds) {
>>>> // Get the stride replaced scev.
>>>> - const SCEV *Sc = replaceSymbolicStrideSCEV(SE, Strides, Ptr);
>>>> + const SCEV *Sc = replaceSymbolicStrideSCEV(SE, Strides, Preds, Ptr);
>>>> const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(Sc);
>>>> assert(AR && "Invalid addrec expression");
>>>> const SCEV *Ex = SE->getBackedgeTakenCount(Lp);
>>>> @@ -417,9 +423,9 @@ public:
>>>> typedef SmallPtrSet<MemAccessInfo, 8> MemAccessInfoSet;
>>>>
>>>> AccessAnalysis(const DataLayout &Dl, AliasAnalysis *AA, LoopInfo *LI,
>>>> - MemoryDepChecker::DepCandidates &DA)
>>>> - : DL(Dl), AST(*AA), LI(LI), DepCands(DA),
>>>> - IsRTCheckAnalysisNeeded(false) {}
>>>> + MemoryDepChecker::DepCandidates &DA, SCEVUnionPredicate
>>> &Preds)
>>>> + : DL(Dl), AST(*AA), LI(LI), DepCands(DA),
>>> IsRTCheckAnalysisNeeded(false),
>>>> + Preds(Preds) {}
>>>>
>>>> /// \brief Register a load and whether it is only read from.
>>>> void addLoad(MemoryLocation &Loc, bool IsReadOnly) {
>>>> @@ -504,14 +510,18 @@ private:
>>>> /// (i.e. ShouldRetryWithRuntimeCheck), isDependencyCheckNeeded is
>>> cleared
>>>> /// while this remains set if we have potentially dependent accesses.
>>>> bool IsRTCheckAnalysisNeeded;
>>>> +
>>>> + /// The SCEV predicate containing all the SCEV-related assumptions.
>>>> + SCEVUnionPredicate &Preds;
>>>> };
>>>>
>>>> } // end anonymous namespace
>>>>
>>>> /// \brief Check whether a pointer can participate in a runtime bounds
>>> check.
>>>> static bool hasComputableBounds(ScalarEvolution *SE,
>>>> - const ValueToValueMap &Strides, Value *Ptr) {
>>>> - const SCEV *PtrScev = replaceSymbolicStrideSCEV(SE, Strides, Ptr);
>>>> + const ValueToValueMap &Strides, Value *Ptr,
>>>> + Loop *L, SCEVUnionPredicate &Preds) {
>>>> + const SCEV *PtrScev = replaceSymbolicStrideSCEV(SE, Strides, Preds,
>>> Ptr);
>>>> const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PtrScev);
>>>> if (!AR)
>>>> return false;
>>>> @@ -554,11 +564,11 @@ bool AccessAnalysis::canCheckPtrAtRT(Run
>>>> else
>>>> ++NumReadPtrChecks;
>>>>
>>>> - if (hasComputableBounds(SE, StridesMap, Ptr) &&
>>>> + if (hasComputableBounds(SE, StridesMap, Ptr, TheLoop, Preds) &&
>>>> // When we run after a failing dependency check we have to make
>>> sure
>>>> // we don't have wrapping pointers.
>>>> (!ShouldCheckStride ||
>>>> - isStridedPtr(SE, Ptr, TheLoop, StridesMap) == 1)) {
>>>> + isStridedPtr(SE, Ptr, TheLoop, StridesMap, Preds) == 1)) {
>>>> // The id of the dependence set.
>>>> unsigned DepId;
>>>>
>>>> @@ -572,7 +582,7 @@ bool AccessAnalysis::canCheckPtrAtRT(Run
>>>> // Each access has its own dependence set.
>>>> DepId = RunningDepId++;
>>>>
>>>> - RtCheck.insert(TheLoop, Ptr, IsWrite, DepId, ASId, StridesMap);
>>>> + RtCheck.insert(TheLoop, Ptr, IsWrite, DepId, ASId, StridesMap,
>>> Preds);
>>>>
>>>> DEBUG(dbgs() << "LAA: Found a runtime check ptr:" << *Ptr << '\n');
>>>> } else {
>>>> @@ -803,7 +813,8 @@ static bool isNoWrapAddRec(Value *Ptr, c
>>>>
>>>> /// \brief Check whether the access through \p Ptr has a constant stride.
>>>> int llvm::isStridedPtr(ScalarEvolution *SE, Value *Ptr, const Loop *Lp,
>>>> - const ValueToValueMap &StridesMap) {
>>>> + const ValueToValueMap &StridesMap,
>>>> + SCEVUnionPredicate &Preds) {
>>>> Type *Ty = Ptr->getType();
>>>> assert(Ty->isPointerTy() && "Unexpected non-ptr");
>>>>
>>>> @@ -815,7 +826,7 @@ int llvm::isStridedPtr(ScalarEvolution *
>>>> return 0;
>>>> }
>>>>
>>>> - const SCEV *PtrScev = replaceSymbolicStrideSCEV(SE, StridesMap, Ptr);
>>>> + const SCEV *PtrScev = replaceSymbolicStrideSCEV(SE, StridesMap,
>>> Preds, Ptr);
>>>>
>>>> const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PtrScev);
>>>> if (!AR) {
>>>> @@ -1026,11 +1037,11 @@ MemoryDepChecker::isDependent(const
>>> MemA
>>>> BPtr->getType()->getPointerAddressSpace())
>>>> return Dependence::Unknown;
>>>>
>>>> - const SCEV *AScev = replaceSymbolicStrideSCEV(SE, Strides, APtr);
>>>> - const SCEV *BScev = replaceSymbolicStrideSCEV(SE, Strides, BPtr);
>>>> + const SCEV *AScev = replaceSymbolicStrideSCEV(SE, Strides, Preds,
>>> APtr);
>>>> + const SCEV *BScev = replaceSymbolicStrideSCEV(SE, Strides, Preds,
>>> BPtr);
>>>>
>>>> - int StrideAPtr = isStridedPtr(SE, APtr, InnermostLoop, Strides);
>>>> - int StrideBPtr = isStridedPtr(SE, BPtr, InnermostLoop, Strides);
>>>> + int StrideAPtr = isStridedPtr(SE, APtr, InnermostLoop, Strides, Preds);
>>>> + int StrideBPtr = isStridedPtr(SE, BPtr, InnermostLoop, Strides, Preds);
>>>>
>>>> const SCEV *Src = AScev;
>>>> const SCEV *Sink = BScev;
>>>> @@ -1429,7 +1440,7 @@ void LoopAccessInfo::analyzeLoop(const V
>>>>
>>>> MemoryDepChecker::DepCandidates DependentAccesses;
>>>> AccessAnalysis Accesses(TheLoop->getHeader()->getModule()-
>>>> getDataLayout(),
>>>> - AA, LI, DependentAccesses);
>>>> + AA, LI, DependentAccesses, Preds);
>>>>
>>>> // Holds the analyzed pointers. We don't want to call
>>> GetUnderlyingObjects
>>>> // multiple times on the same object. If the ptr is accessed twice, once
>>>> @@ -1480,7 +1491,8 @@ void LoopAccessInfo::analyzeLoop(const V
>>>> // read a few words, modify, and write a few words, and some of the
>>>> // words may be written to the same address.
>>>> bool IsReadOnlyPtr = false;
>>>> - if (Seen.insert(Ptr).second || !isStridedPtr(SE, Ptr, TheLoop, Strides)) {
>>>> + if (Seen.insert(Ptr).second ||
>>>> + !isStridedPtr(SE, Ptr, TheLoop, Strides, Preds)) {
>>>> ++NumReads;
>>>> IsReadOnlyPtr = true;
>>>> }
>>>> @@ -1730,7 +1742,7 @@ LoopAccessInfo::LoopAccessInfo(Loop *L,
>>>> const TargetLibraryInfo *TLI, AliasAnalysis *AA,
>>>> DominatorTree *DT, LoopInfo *LI,
>>>> const ValueToValueMap &Strides)
>>>> - : PtrRtChecking(SE), DepChecker(SE, L), TheLoop(L), SE(SE), DL(DL),
>>>> + : PtrRtChecking(SE), DepChecker(SE, L, Preds), TheLoop(L), SE(SE),
>>> DL(DL),
>>>> TLI(TLI), AA(AA), DT(DT), LI(LI), NumLoads(0), NumStores(0),
>>>> MaxSafeDepDistBytes(-1U), CanVecMem(false),
>>>> StoreToLoopInvariantAddress(false) {
>>>> @@ -1765,6 +1777,9 @@ void LoopAccessInfo::print(raw_ostream &
>>>> OS.indent(Depth) << "Store to invariant address was "
>>>> << (StoreToLoopInvariantAddress ? "" : "not ")
>>>> << "found in loop.\n";
>>>> +
>>>> + OS.indent(Depth) << "SCEV assumptions:\n";
>>>> + Preds.print(OS, Depth);
>>>> }
>>>>
>>>> const LoopAccessInfo &
>>>> @@ -1778,8 +1793,8 @@ LoopAccessAnalysis::getInfo(Loop *L, con
>>>>
>>>> if (!LAI) {
>>>> const DataLayout &DL = L->getHeader()->getModule()-
>>>> getDataLayout();
>>>> - LAI = llvm::make_unique<LoopAccessInfo>(L, SE, DL, TLI, AA, DT, LI,
>>>> - Strides);
>>>> + LAI =
>>>> + llvm::make_unique<LoopAccessInfo>(L, SE, DL, TLI, AA, DT, LI,
>>> Strides);
>>>> #ifndef NDEBUG
>>>> LAI->NumSymbolicStrides = Strides.size();
>>>> #endif
>>>>
>>>> Modified: llvm/trunk/lib/Analysis/ScalarEvolution.cpp
>>>> URL: http://llvm.org/viewvc/llvm-
>>> project/llvm/trunk/lib/Analysis/ScalarEvolution.cpp?rev=251800&r1=251799
>>> &r2=251800&view=diff
>>>>
>>> ==========================================================
>>> ====================
>>>> --- llvm/trunk/lib/Analysis/ScalarEvolution.cpp (original)
>>>> +++ llvm/trunk/lib/Analysis/ScalarEvolution.cpp Mon Nov 2 08:41:02 2015
>>>> @@ -9093,6 +9093,7 @@ ScalarEvolution::ScalarEvolution(ScalarE
>>>> UnsignedRanges(std::move(Arg.UnsignedRanges)),
>>>> SignedRanges(std::move(Arg.SignedRanges)),
>>>> UniqueSCEVs(std::move(Arg.UniqueSCEVs)),
>>>> + UniquePreds(std::move(Arg.UniquePreds)),
>>>> SCEVAllocator(std::move(Arg.SCEVAllocator)),
>>>> FirstUnknown(Arg.FirstUnknown) {
>>>> Arg.FirstUnknown = nullptr;
>>>> @@ -9596,3 +9597,134 @@ void ScalarEvolutionWrapperPass::getAnal
>>>> AU.addRequiredTransitive<DominatorTreeWrapperPass>();
>>>> AU.addRequiredTransitive<TargetLibraryInfoWrapperPass>();
>>>> }
>>>> +
>>>> +const SCEVPredicate *
>>>> +ScalarEvolution::getEqualPredicate(const SCEVUnknown *LHS,
>>>> + const SCEVConstant *RHS) {
>>>> + FoldingSetNodeID ID;
>>>> + // Unique this node based on the arguments
>>>> + ID.AddInteger(SCEVPredicate::P_Equal);
>>>> + ID.AddPointer(LHS);
>>>> + ID.AddPointer(RHS);
>>>> + void *IP = nullptr;
>>>> + if (const auto *S = UniquePreds.FindNodeOrInsertPos(ID, IP))
>>>> + return S;
>>>> + SCEVEqualPredicate *Eq = new (SCEVAllocator)
>>>> + SCEVEqualPredicate(ID.Intern(SCEVAllocator), LHS, RHS);
>>>> + UniquePreds.InsertNode(Eq, IP);
>>>> + return Eq;
>>>> +}
>>>> +
>>>> +class SCEVPredicateRewriter : public
>>> SCEVRewriteVisitor<SCEVPredicateRewriter> {
>>>> +public:
>>>> + static const SCEV *rewrite(const SCEV *Scev, ScalarEvolution &SE,
>>>> + SCEVUnionPredicate &A) {
>>>> + SCEVPredicateRewriter Rewriter(SE, A);
>>>> + return Rewriter.visit(Scev);
>>>> + }
>>>> +
>>>> + SCEVPredicateRewriter(ScalarEvolution &SE, SCEVUnionPredicate &P)
>>>> + : SCEVRewriteVisitor(SE), P(P) {}
>>>> +
>>>> + const SCEV *visitUnknown(const SCEVUnknown *Expr) {
>>>> + auto ExprPreds = P.getPredicatesForExpr(Expr);
>>>> + for (auto *Pred : ExprPreds)
>>>> + if (const auto *IPred = dyn_cast<const SCEVEqualPredicate>(Pred))
>>>> + if (IPred->getLHS() == Expr)
>>>> + return IPred->getRHS();
>>>> +
>>>> + return Expr;
>>>> + }
>>>> +
>>>> +private:
>>>> + SCEVUnionPredicate &P;
>>>> +};
>>>> +
>>>> +const SCEV *ScalarEvolution::rewriteUsingPredicate(const SCEV *Scev,
>>>> + SCEVUnionPredicate &Preds) {
>>>> + return SCEVPredicateRewriter::rewrite(Scev, *this, Preds);
>>>> +}
>>>> +
>>>> +/// SCEV predicates
>>>> +SCEVPredicate::SCEVPredicate(const FoldingSetNodeIDRef ID,
>>>> + SCEVPredicateKind Kind)
>>>> + : FastID(ID), Kind(Kind) {}
>>>> +
>>>> +SCEVEqualPredicate::SCEVEqualPredicate(const FoldingSetNodeIDRef ID,
>>>> + const SCEVUnknown *LHS,
>>>> + const SCEVConstant *RHS)
>>>> + : SCEVPredicate(ID, P_Equal), LHS(LHS), RHS(RHS) {}
>>>> +
>>>> +bool SCEVEqualPredicate::implies(const SCEVPredicate *N) const {
>>>> + const auto *Op = dyn_cast<const SCEVEqualPredicate>(N);
>>>> +
>>>> + if (!Op)
>>>> + return false;
>>>> +
>>>> + return Op->LHS == LHS && Op->RHS == RHS;
>>>> +}
>>>> +
>>>> +bool SCEVEqualPredicate::isAlwaysTrue() const { return false; }
>>>> +
>>>> +const SCEV *SCEVEqualPredicate::getExpr() const { return LHS; }
>>>> +
>>>> +void SCEVEqualPredicate::print(raw_ostream &OS, unsigned Depth)
>>> const {
>>>> + OS.indent(Depth) << "Equal predicate: " << *LHS << " == " << *RHS <<
>>> "\n";
>>>> +}
>>>> +
>>>> +/// Union predicates don't get cached so create a dummy set ID for it.
>>>> +SCEVUnionPredicate::SCEVUnionPredicate()
>>>> + : SCEVPredicate(FoldingSetNodeIDRef(nullptr, 0), P_Union) {}
>>>> +
>>>> +bool SCEVUnionPredicate::isAlwaysTrue() const {
>>>> + return std::all_of(Preds.begin(), Preds.end(),
>>>> + [](const SCEVPredicate *I) { return I->isAlwaysTrue(); });
>>>> +}
>>>> +
>>>> +ArrayRef<const SCEVPredicate *>
>>>> +SCEVUnionPredicate::getPredicatesForExpr(const SCEV *Expr) {
>>>> + auto I = SCEVToPreds.find(Expr);
>>>> + if (I == SCEVToPreds.end())
>>>> + return ArrayRef<const SCEVPredicate *>();
>>>> + return I->second;
>>>> +}
>>>> +
>>>> +bool SCEVUnionPredicate::implies(const SCEVPredicate *N) const {
>>>> + if (const auto *Set = dyn_cast<const SCEVUnionPredicate>(N))
>>>> + return std::all_of(
>>>> + Set->Preds.begin(), Set->Preds.end(),
>>>> + [this](const SCEVPredicate *I) { return this->implies(I); });
>>>> +
>>>> + auto ScevPredsIt = SCEVToPreds.find(N->getExpr());
>>>> + if (ScevPredsIt == SCEVToPreds.end())
>>>> + return false;
>>>> + auto &SCEVPreds = ScevPredsIt->second;
>>>> +
>>>> + return std::any_of(SCEVPreds.begin(), SCEVPreds.end(),
>>>> + [N](const SCEVPredicate *I) { return I->implies(N); });
>>>> +}
>>>> +
>>>> +const SCEV *SCEVUnionPredicate::getExpr() const { return nullptr; }
>>>> +
>>>> +void SCEVUnionPredicate::print(raw_ostream &OS, unsigned Depth)
>>> const {
>>>> + for (auto Pred : Preds)
>>>> + Pred->print(OS, Depth);
>>>> +}
>>>> +
>>>> +void SCEVUnionPredicate::add(const SCEVPredicate *N) {
>>>> + if (const auto *Set = dyn_cast<const SCEVUnionPredicate>(N)) {
>>>> + for (auto Pred : Set->Preds)
>>>> + add(Pred);
>>>> + return;
>>>> + }
>>>> +
>>>> + if (implies(N))
>>>> + return;
>>>> +
>>>> + const SCEV *Key = N->getExpr();
>>>> + assert(Key && "Only SCEVUnionPredicate doesn't have an "
>>>> + " associated expression!");
>>>> +
>>>> + SCEVToPreds[Key].push_back(N);
>>>> + Preds.push_back(N);
>>>> +}
>>>>
>>>> Modified: llvm/trunk/lib/Analysis/ScalarEvolutionExpander.cpp
>>>> URL: http://llvm.org/viewvc/llvm-
>>> project/llvm/trunk/lib/Analysis/ScalarEvolutionExpander.cpp?rev=251800&r
>>> 1=251799&r2=251800&view=diff
>>>>
>>> ==========================================================
>>> ====================
>>>> --- llvm/trunk/lib/Analysis/ScalarEvolutionExpander.cpp (original)
>>>> +++ llvm/trunk/lib/Analysis/ScalarEvolutionExpander.cpp Mon Nov 2
>>> 08:41:02 2015
>>>> @@ -1944,6 +1944,43 @@ bool SCEVExpander::isHighCostExpansionHe
>>>> return false;
>>>> }
>>>>
>>>> +Value *SCEVExpander::expandCodeForPredicate(const SCEVPredicate
>>> *Pred,
>>>> + Instruction *IP) {
>>>> + assert(IP);
>>>> + switch (Pred->getKind()) {
>>>> + case SCEVPredicate::P_Union:
>>>> + return expandUnionPredicate(cast<SCEVUnionPredicate>(Pred), IP);
>>>> + case SCEVPredicate::P_Equal:
>>>> + return expandEqualPredicate(cast<SCEVEqualPredicate>(Pred), IP);
>>>> + }
>>>> + llvm_unreachable("Unknown SCEV predicate type");
>>>> +}
>>>> +
>>>> +Value *SCEVExpander::expandEqualPredicate(const SCEVEqualPredicate
>>> *Pred,
>>>> + Instruction *IP) {
>>>> + Value *Expr0 = expandCodeFor(Pred->getLHS(), Pred->getLHS()-
>>>> getType(), IP);
>>>> + Value *Expr1 = expandCodeFor(Pred->getRHS(), Pred->getRHS()-
>>>> getType(), IP);
>>>> +
>>>> + Builder.SetInsertPoint(IP);
>>>> + auto *I = Builder.CreateICmpNE(Expr0, Expr1, "ident.check");
>>>> + return I;
>>>> +}
>>>> +
>>>> +Value *SCEVExpander::expandUnionPredicate(const
>>> SCEVUnionPredicate *Union,
>>>> + Instruction *IP) {
>>>> + auto *BoolType = IntegerType::get(IP->getContext(), 1);
>>>> + Value *Check = ConstantInt::getNullValue(BoolType);
>>>> +
>>>> + // Loop over all checks in this set.
>>>> + for (auto Pred : Union->getPredicates()) {
>>>> + auto *NextCheck = expandCodeForPredicate(Pred, IP);
>>>> + Builder.SetInsertPoint(IP);
>>>> + Check = Builder.CreateOr(Check, NextCheck);
>>>> + }
>>>> +
>>>> + return Check;
>>>> +}
>>>> +
>>>> namespace {
>>>> // Search for a SCEV subexpression that is not safe to expand. Any
>>> expression
>>>> // that may expand to a !isSafeToSpeculativelyExecute value is unsafe,
>>> namely
>>>>
>>>> Modified: llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp
>>>> URL: http://llvm.org/viewvc/llvm-
>>> project/llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp?rev=25180
>>> 0&r1=251799&r2=251800&view=diff
>>>>
>>> ==========================================================
>>> ====================
>>>> --- llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp (original)
>>>> +++ llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp Mon Nov 2
>>> 08:41:02 2015
>>>> @@ -222,6 +222,15 @@ static cl::opt<unsigned> PragmaVectorize
>>>> cl::desc("The maximum allowed number of runtime memory checks
>>> with a "
>>>> "vectorize(enable) pragma."));
>>>>
>>>> +static cl::opt<unsigned> VectorizeSCEVCheckThreshold(
>>>> + "vectorize-scev-check-threshold", cl::init(16), cl::Hidden,
>>>> + cl::desc("The maximum number of SCEV checks allowed."));
>>>> +
>>>> +static cl::opt<unsigned> PragmaVectorizeSCEVCheckThreshold(
>>>> + "pragma-vectorize-scev-check-threshold", cl::init(128), cl::Hidden,
>>>> + cl::desc("The maximum number of SCEV checks allowed with a "
>>>> + "vectorize(enable) pragma"));
>>>> +
>>>> namespace {
>>>>
>>>> // Forward declarations.
>>>> @@ -273,12 +282,12 @@ public:
>>>> InnerLoopVectorizer(Loop *OrigLoop, ScalarEvolution *SE, LoopInfo *LI,
>>>> DominatorTree *DT, const TargetLibraryInfo *TLI,
>>>> const TargetTransformInfo *TTI, unsigned VecWidth,
>>>> - unsigned UnrollFactor)
>>>> + unsigned UnrollFactor, SCEVUnionPredicate &Preds)
>>>> : OrigLoop(OrigLoop), SE(SE), LI(LI), DT(DT), TLI(TLI), TTI(TTI),
>>>> VF(VecWidth), UF(UnrollFactor), Builder(SE->getContext()),
>>>> Induction(nullptr), OldInduction(nullptr), WidenMap(UnrollFactor),
>>>> TripCount(nullptr), VectorTripCount(nullptr), Legal(nullptr),
>>>> - AddedSafetyChecks(false) {}
>>>> + AddedSafetyChecks(false), Preds(Preds) {}
>>>>
>>>> // Perform the actual loop widening (vectorization).
>>>> // MinimumBitWidths maps scalar integer values to the smallest bitwidth
>>> they
>>>> @@ -315,12 +324,6 @@ protected:
>>>> typedef DenseMap<std::pair<BasicBlock*, BasicBlock*>,
>>>> VectorParts> EdgeMaskCache;
>>>>
>>>> - /// \brief Add checks for strides that were assumed to be 1.
>>>> - ///
>>>> - /// Returns the last check instruction and the first check instruction in
>>> the
>>>> - /// pair as (first, last).
>>>> - std::pair<Instruction *, Instruction *> addStrideCheck(Instruction *Loc);
>>>> -
>>>> /// Create an empty loop, based on the loop ranges of the old loop.
>>>> void createEmptyLoop();
>>>> /// Create a new induction variable inside L.
>>>> @@ -404,11 +407,12 @@ protected:
>>>> void emitMinimumIterationCountCheck(Loop *L, BasicBlock *Bypass);
>>>> /// Emit a bypass check to see if the vector trip count is nonzero.
>>>> void emitVectorLoopEnteredCheck(Loop *L, BasicBlock *Bypass);
>>>> - /// Emit bypass checks to check if strides we've assumed to be one really
>>> are.
>>>> - void emitStrideChecks(Loop *L, BasicBlock *Bypass);
>>>> + /// Emit a bypass check to see if all of the SCEV assumptions we've
>>>> + /// had to make are correct.
>>>> + void emitSCEVChecks(Loop *L, BasicBlock *Bypass);
>>>> /// Emit bypass checks to check any memory assumptions we may have
>>> made.
>>>> void emitMemRuntimeChecks(Loop *L, BasicBlock *Bypass);
>>>> -
>>>> +
>>>> /// This is a helper class that holds the vectorizer state. It maps scalar
>>>> /// instructions to vector instructions. When the code is 'unrolled' then
>>>> /// then a single scalar value is mapped to multiple vector parts. The parts
>>>> @@ -516,14 +520,23 @@ protected:
>>>>
>>>> // Record whether runtime check is added.
>>>> bool AddedSafetyChecks;
>>>> +
>>>> + /// The SCEV predicate containing all the SCEV-related assumptions.
>>>> + /// The predicate is used to simplify existing expressions in the
>>>> + /// context of existing SCEV assumptions. Since legality checking is
>>>> + /// not done here, we don't need to use this predicate to record
>>>> + /// further assumptions.
>>>> + SCEVUnionPredicate &Preds;
>>>> };
>>>>
>>>> class InnerLoopUnroller : public InnerLoopVectorizer {
>>>> public:
>>>> InnerLoopUnroller(Loop *OrigLoop, ScalarEvolution *SE, LoopInfo *LI,
>>>> DominatorTree *DT, const TargetLibraryInfo *TLI,
>>>> - const TargetTransformInfo *TTI, unsigned UnrollFactor)
>>>> - : InnerLoopVectorizer(OrigLoop, SE, LI, DT, TLI, TTI, 1, UnrollFactor) {}
>>>> + const TargetTransformInfo *TTI, unsigned UnrollFactor,
>>>> + SCEVUnionPredicate &Preds)
>>>> + : InnerLoopVectorizer(OrigLoop, SE, LI, DT, TLI, TTI, 1, UnrollFactor,
>>>> + Preds) {}
>>>>
>>>> private:
>>>> void scalarizeInstruction(Instruction *Instr,
>>>> @@ -744,8 +757,9 @@ private:
>>>> /// between the member and the group in a map.
>>>> class InterleavedAccessInfo {
>>>> public:
>>>> - InterleavedAccessInfo(ScalarEvolution *SE, Loop *L, DominatorTree
>>> *DT)
>>>> - : SE(SE), TheLoop(L), DT(DT) {}
>>>> + InterleavedAccessInfo(ScalarEvolution *SE, Loop *L, DominatorTree
>>> *DT,
>>>> + SCEVUnionPredicate &Preds)
>>>> + : SE(SE), TheLoop(L), DT(DT), Preds(Preds) {}
>>>>
>>>> ~InterleavedAccessInfo() {
>>>> SmallSet<InterleaveGroup *, 4> DelSet;
>>>> @@ -779,6 +793,13 @@ private:
>>>> Loop *TheLoop;
>>>> DominatorTree *DT;
>>>>
>>>> + /// The SCEV predicate containing all the SCEV-related assumptions.
>>>> + /// The predicate is used to simplify SCEV expressions in the
>>>> + /// context of existing SCEV assumptions. The interleaved access
>>>> + /// analysis can also add new predicates (for example by versioning
>>>> + /// strides of pointers).
>>>> + SCEVUnionPredicate &Preds;
>>>> +
>>>> /// Holds the relationships between the members and the interleave
>>> group.
>>>> DenseMap<Instruction *, InterleaveGroup *> InterleaveGroupMap;
>>>>
>>>> @@ -1141,11 +1162,13 @@ public:
>>>> Function *F, const TargetTransformInfo *TTI,
>>>> LoopAccessAnalysis *LAA,
>>>> LoopVectorizationRequirements *R,
>>>> - const LoopVectorizeHints *H)
>>>> + const LoopVectorizeHints *H,
>>>> + SCEVUnionPredicate &Preds)
>>>> : NumPredStores(0), TheLoop(L), SE(SE), TLI(TLI), TheFunction(F),
>>>> - TTI(TTI), DT(DT), LAA(LAA), LAI(nullptr), InterleaveInfo(SE, L, DT),
>>>> - Induction(nullptr), WidestIndTy(nullptr), HasFunNoNaNAttr(false),
>>>> - Requirements(R), Hints(H) {}
>>>> + TTI(TTI), DT(DT), LAA(LAA), LAI(nullptr),
>>>> + InterleaveInfo(SE, L, DT, Preds), Induction(nullptr),
>>>> + WidestIndTy(nullptr), HasFunNoNaNAttr(false), Requirements(R),
>>> Hints(H),
>>>> + Preds(Preds) {}
>>>>
>>>> /// ReductionList contains the reduction descriptors for all
>>>> /// of the reductions that were found in the loop.
>>>> @@ -1344,7 +1367,14 @@ private:
>>>>
>>>> /// While vectorizing these instructions we have to generate a
>>>> /// call to the appropriate masked intrinsic
>>>> - SmallPtrSet<const Instruction*, 8> MaskedOp;
>>>> + SmallPtrSet<const Instruction *, 8> MaskedOp;
>>>> +
>>>> + /// The SCEV predicate containing all the SCEV-related assumptions.
>>>> + /// The predicate is used to simplify SCEV expressions in the
>>>> + /// context of existing SCEV assumptions. The analysis will also
>>>> + /// add a minimal set of new predicates if this is required to
>>>> + /// enable vectorization/unrolling.
>>>> + SCEVUnionPredicate &Preds;
>>>> };
>>>>
>>>> /// LoopVectorizationCostModel - estimates the expected speedups due
>>> to
>>>> @@ -1360,9 +1390,10 @@ public:
>>>> LoopVectorizationLegality *Legal,
>>>> const TargetTransformInfo &TTI,
>>>> const TargetLibraryInfo *TLI, DemandedBits *DB,
>>>> - AssumptionCache *AC,
>>>> - const Function *F, const LoopVectorizeHints *Hints,
>>>> - SmallPtrSetImpl<const Value *> &ValuesToIgnore)
>>>> + AssumptionCache *AC, const Function *F,
>>>> + const LoopVectorizeHints *Hints,
>>>> + SmallPtrSetImpl<const Value *> &ValuesToIgnore,
>>>> + SCEVUnionPredicate &Preds)
>>>> : TheLoop(L), SE(SE), LI(LI), Legal(Legal), TTI(TTI), TLI(TLI), DB(DB),
>>>> TheFunction(F), Hints(Hints), ValuesToIgnore(ValuesToIgnore) {}
>>>>
>>>> @@ -1690,10 +1721,12 @@ struct LoopVectorize : public FunctionPa
>>>> }
>>>> }
>>>>
>>>> + SCEVUnionPredicate Preds;
>>>> +
>>>> // Check if it is legal to vectorize the loop.
>>>> LoopVectorizationRequirements Requirements;
>>>> LoopVectorizationLegality LVL(L, SE, DT, TLI, AA, F, TTI, LAA,
>>>> - &Requirements, &Hints);
>>>> + &Requirements, &Hints, Preds);
>>>> if (!LVL.canVectorize()) {
>>>> DEBUG(dbgs() << "LV: Not vectorizing: Cannot prove legality.\n");
>>>> emitMissedWarning(F, L, Hints);
>>>> @@ -1712,7 +1745,7 @@ struct LoopVectorize : public FunctionPa
>>>>
>>>> // Use the cost model.
>>>> LoopVectorizationCostModel CM(L, SE, LI, &LVL, *TTI, TLI, DB, AC, F,
>>> &Hints,
>>>> - ValuesToIgnore);
>>>> + ValuesToIgnore, Preds);
>>>>
>>>> // Check the function attributes to find out if this function should be
>>>> // optimized for size.
>>>> @@ -1823,7 +1856,7 @@ struct LoopVectorize : public FunctionPa
>>>> assert(IC > 1 && "interleave count should not be 1 or 0");
>>>> // If we decided that it is not legal to vectorize the loop then
>>>> // interleave it.
>>>> - InnerLoopUnroller Unroller(L, SE, LI, DT, TLI, TTI, IC);
>>>> + InnerLoopUnroller Unroller(L, SE, LI, DT, TLI, TTI, IC, Preds);
>>>> Unroller.vectorize(&LVL, CM.MinBWs);
>>>>
>>>> emitOptimizationRemark(F->getContext(), LV_NAME, *F, L-
>>>> getStartLoc(),
>>>> @@ -1831,7 +1864,7 @@ struct LoopVectorize : public FunctionPa
>>>> Twine(IC) + ")");
>>>> } else {
>>>> // If we decided that it is *legal* to vectorize the loop then do it.
>>>> - InnerLoopVectorizer LB(L, SE, LI, DT, TLI, TTI, VF.Width, IC);
>>>> + InnerLoopVectorizer LB(L, SE, LI, DT, TLI, TTI, VF.Width, IC, Preds);
>>>> LB.vectorize(&LVL, CM.MinBWs);
>>>> ++LoopsVectorized;
>>>>
>>>> @@ -1992,7 +2025,7 @@ int LoopVectorizationLegality::isConsecu
>>>> // %idxprom = zext i32 %mul to i64 << Safe cast.
>>>> // %arrayidx = getelementptr inbounds i32* %B, i64 %idxprom
>>>> //
>>>> - Last = replaceSymbolicStrideSCEV(SE, Strides,
>>>> + Last = replaceSymbolicStrideSCEV(SE, Strides, Preds,
>>>> Gep->getOperand(InductionOperand), Gep);
>>>> if (const SCEVCastExpr *C = dyn_cast<SCEVCastExpr>(Last))
>>>> Last =
>>>> @@ -2551,56 +2584,8 @@ void InnerLoopVectorizer::scalarizeInstr
>>>> }
>>>> }
>>>>
>>>> -static Instruction *getFirstInst(Instruction *FirstInst, Value *V,
>>>> - Instruction *Loc) {
>>>> - if (FirstInst)
>>>> - return FirstInst;
>>>> - if (Instruction *I = dyn_cast<Instruction>(V))
>>>> - return I->getParent() == Loc->getParent() ? I : nullptr;
>>>> - return nullptr;
>>>> -}
>>>> -
>>>> -std::pair<Instruction *, Instruction *>
>>>> -InnerLoopVectorizer::addStrideCheck(Instruction *Loc) {
>>>> - Instruction *tnullptr = nullptr;
>>>> - if (!Legal->mustCheckStrides())
>>>> - return std::pair<Instruction *, Instruction *>(tnullptr, tnullptr);
>>>> -
>>>> - IRBuilder<> ChkBuilder(Loc);
>>>> -
>>>> - // Emit checks.
>>>> - Value *Check = nullptr;
>>>> - Instruction *FirstInst = nullptr;
>>>> - for (SmallPtrSet<Value *, 8>::iterator SI = Legal->strides_begin(),
>>>> - SE = Legal->strides_end();
>>>> - SI != SE; ++SI) {
>>>> - Value *Ptr = stripIntegerCast(*SI);
>>>> - Value *C = ChkBuilder.CreateICmpNE(Ptr, ConstantInt::get(Ptr-
>>>> getType(), 1),
>>>> - "stride.chk");
>>>> - // Store the first instruction we create.
>>>> - FirstInst = getFirstInst(FirstInst, C, Loc);
>>>> - if (Check)
>>>> - Check = ChkBuilder.CreateOr(Check, C);
>>>> - else
>>>> - Check = C;
>>>> - }
>>>> -
>>>> - // We have to do this trickery because the IRBuilder might fold the check
>>> to a
>>>> - // constant expression in which case there is no Instruction anchored in
>>> a
>>>> - // the block.
>>>> - LLVMContext &Ctx = Loc->getContext();
>>>> - Instruction *TheCheck =
>>>> - BinaryOperator::CreateAnd(Check, ConstantInt::getTrue(Ctx));
>>>> - ChkBuilder.Insert(TheCheck, "stride.not.one");
>>>> - FirstInst = getFirstInst(FirstInst, TheCheck, Loc);
>>>> -
>>>> - return std::make_pair(FirstInst, TheCheck);
>>>> -}
>>>> -
>>>> -PHINode *InnerLoopVectorizer::createInductionVariable(Loop *L,
>>>> - Value *Start,
>>>> - Value *End,
>>>> - Value *Step,
>>>> +PHINode *InnerLoopVectorizer::createInductionVariable(Loop *L, Value
>>> *Start,
>>>> + Value *End, Value *Step,
>>>> Instruction *DL) {
>>>> BasicBlock *Header = L->getHeader();
>>>> BasicBlock *Latch = L->getLoopLatch();
>>>> @@ -2735,26 +2720,26 @@ void InnerLoopVectorizer::emitVectorLoop
>>>> LoopBypassBlocks.push_back(BB);
>>>> }
>>>>
>>>> -void InnerLoopVectorizer::emitStrideChecks(Loop *L,
>>>> - BasicBlock *Bypass) {
>>>> +void InnerLoopVectorizer::emitSCEVChecks(Loop *L, BasicBlock *Bypass)
>>> {
>>>> BasicBlock *BB = L->getLoopPreheader();
>>>> -
>>>> - // Generate the code to check that the strides we assumed to be one
>>> are really
>>>> - // one. We want the new basic block to start at the first instruction in a
>>>> +
>>>> + // Generate the code to check that the SCEV assumptions that we
>>> made.
>>>> + // We want the new basic block to start at the first instruction in a
>>>> // sequence of instructions that form a check.
>>>> - Instruction *StrideCheck;
>>>> - Instruction *FirstCheckInst;
>>>> - std::tie(FirstCheckInst, StrideCheck) = addStrideCheck(BB-
>>>> getTerminator());
>>>> - if (!StrideCheck)
>>>> - return;
>>>> + SCEVExpander Exp(*SE, Bypass->getModule()->getDataLayout(),
>>> "scev.check");
>>>> + Value *SCEVCheck = Exp.expandCodeForPredicate(&Preds, BB-
>>>> getTerminator());
>>>> +
>>>> + if (auto *C = dyn_cast<ConstantInt>(SCEVCheck))
>>>> + if (C->isZero())
>>>> + return;
>>>>
>>>> // Create a new block containing the stride check.
>>>> - BB->setName("vector.stridecheck");
>>>> + BB->setName("vector.scevcheck");
>>>> auto *NewBB = BB->splitBasicBlock(BB->getTerminator(), "vector.ph");
>>>> if (L->getParentLoop())
>>>> L->getParentLoop()->addBasicBlockToLoop(NewBB, *LI);
>>>> ReplaceInstWithInst(BB->getTerminator(),
>>>> - BranchInst::Create(Bypass, NewBB, StrideCheck));
>>>> + BranchInst::Create(Bypass, NewBB, SCEVCheck));
>>>> LoopBypassBlocks.push_back(BB);
>>>> AddedSafetyChecks = true;
>>>> }
>>>> @@ -2874,10 +2859,10 @@ void InnerLoopVectorizer::createEmptyLoo
>>>> // Now, compare the new count to zero. If it is zero skip the vector loop
>>> and
>>>> // jump to the scalar loop.
>>>> emitVectorLoopEnteredCheck(Lp, ScalarPH);
>>>> - // Generate the code to check that the strides we assumed to be one
>>> are really
>>>> - // one. We want the new basic block to start at the first instruction in a
>>>> - // sequence of instructions that form a check.
>>>> - emitStrideChecks(Lp, ScalarPH);
>>>> + // Generate the code to check any assumptions that we've made for
>>> SCEV
>>>> + // expressions.
>>>> + emitSCEVChecks(Lp, ScalarPH);
>>>> +
>>>> // Generate the code that checks in runtime if arrays overlap. We put the
>>>> // checks into a separate block to make the more common case of few
>>> elements
>>>> // faster.
>>>> @@ -4130,7 +4115,19 @@ bool LoopVectorizationLegality::canVecto
>>>>
>>>> // Analyze interleaved memory accesses.
>>>> if (UseInterleaved)
>>>> - InterleaveInfo.analyzeInterleaving(Strides);
>>>> + InterleaveInfo.analyzeInterleaving(Strides);
>>>> +
>>>> + unsigned SCEVThreshold = VectorizeSCEVCheckThreshold;
>>>> + if (Hints->getForce() == LoopVectorizeHints::FK_Enabled)
>>>> + SCEVThreshold = PragmaVectorizeSCEVCheckThreshold;
>>>> +
>>>> + if (Preds.getComplexity() > SCEVThreshold) {
>>>> + emitAnalysis(VectorizationReport()
>>>> + << "Too many SCEV assumptions need to be made and checked
>>> "
>>>> + << "at runtime");
>>>> + DEBUG(dbgs() << "LV: Too many SCEV checks needed.\n");
>>>> + return false;
>>>> + }
>>>>
>>>> // Okay! We can vectorize. At this point we don't have any other mem
>>> analysis
>>>> // which may limit our maximum vectorization factor, so just return true
>>> with
>>>> @@ -4436,6 +4433,7 @@ bool LoopVectorizationLegality::canVecto
>>>> }
>>>>
>>>> Requirements->addRuntimePointerChecks(LAI-
>>>> getNumRuntimePointerChecks());
>>>> + Preds.add(&LAI->Preds);
>>>>
>>>> return true;
>>>> }
>>>> @@ -4550,7 +4548,7 @@ void InterleavedAccessInfo::collectConst
>>>> StoreInst *SI = dyn_cast<StoreInst>(I);
>>>>
>>>> Value *Ptr = LI ? LI->getPointerOperand() : SI->getPointerOperand();
>>>> - int Stride = isStridedPtr(SE, Ptr, TheLoop, Strides);
>>>> + int Stride = isStridedPtr(SE, Ptr, TheLoop, Strides, Preds);
>>>>
>>>> // The factor of the corresponding interleave group.
>>>> unsigned Factor = std::abs(Stride);
>>>> @@ -4559,7 +4557,7 @@ void InterleavedAccessInfo::collectConst
>>>> if (Factor < 2 || Factor > MaxInterleaveGroupFactor)
>>>> continue;
>>>>
>>>> - const SCEV *Scev = replaceSymbolicStrideSCEV(SE, Strides, Ptr);
>>>> + const SCEV *Scev = replaceSymbolicStrideSCEV(SE, Strides, Preds, Ptr);
>>>> PointerType *PtrTy = dyn_cast<PointerType>(Ptr->getType());
>>>> unsigned Size = DL.getTypeAllocSize(PtrTy->getElementType());
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
>
> ________________________________
>
> -- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20151105/3af0fa1b/attachment-0001.html>
More information about the llvm-commits
mailing list