[llvm] r254805 - [EarlyCSE] IsSimple vs IsVolatile naming clarification (NFC)

Philip Reames via llvm-commits llvm-commits at lists.llvm.org
Fri Dec 4 18:38:50 PST 2015


As I was using them informally, yes.  My original text wasn't intended to
be overly precise.  :)

Philip

On 12/04/2015 06:37 PM, Sean Silva wrote:
> So what is the distinction between "split" and "sheared"? Is "split" 
> basically a special case of "sheared"?
>
> On Fri, Dec 4, 2015 at 6:32 PM, Philip Reames
> <listmail at philipreames.com> wrote:
>
>
>
>     On 12/04/2015 06:16 PM, Sean Silva wrote:
>>
>>
>>     On Fri, Dec 4, 2015 at 4:18 PM, Philip Reames via llvm-commits
>>     <llvm-commits at lists.llvm.org> wrote:
>>
>>         Author: reames
>>         Date: Fri Dec  4 18:18:33 2015
>>         New Revision: 254805
>>
>>         URL: http://llvm.org/viewvc/llvm-project?rev=254805&view=rev
>>         Log:
>>         [EarlyCSE] IsSimple vs IsVolatile naming clarification (NFC)
>>
>>         When the notion of target specific memory intrinsics was
>>         introduced to EarlyCSE, the commit confused the notions of
>>         volatile and simple memory access.  Since I'm about to start
>>         working on this area, clean up the naming so that patches
>>         aren't horribly confusing. Note that the actual
>>         implementation was always bailing if the load or store wasn't
>>         simple.
>>
>>         Reminder:
>>         - "volatile" - C++ volatile, can't remove any memory
>>         operations, but in principle unordered
>>         - "ordered" - imposes ordering constraints on other nearby
>>         memory operations
>>         - "atomic" - can't be split or sheared.  In LLVM terms, all
>>         "ordered" operations are also atomic so the predicate
>>         "isAtomic" is often used.
>>
>>
>>     What does "sheared" mean in this context? Widened?
>     Any operation which can result in less than the original bit-width
>     being issued all the way through the memory system.  Though,
>     looking around, I can't find a precise quotable definition
>     either.  I'll have to dig a bit further.  In practice, the
>     following types of things tend to be problematic:
>
>     Splitting is a problem - so, you can't value forward half a load
>     and reload the other half.  Nor can you do two i32 loads to build
>     an i64.  On some platforms, this implies a major cost.
>
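
To make the splitting problem concrete, here is a minimal C++ sketch of the
kind of rewrite being ruled out (hypothetical code, not from the patch):
reading a 64-bit atomic in two 32-bit halves can observe a "sheared" value
assembled from two different stores.

    #include <atomic>
    #include <cstdint>

    std::atomic<uint64_t> X;

    // Legal: a single 64-bit atomic load observes one whole value of X.
    uint64_t loadWhole() { return X.load(std::memory_order_relaxed); }

    // Not a legal rewrite of loadWhole(): reading the halves separately can
    // observe pieces of two different stores, i.e. a "sheared" value.  (The
    // reinterpret_cast is itself not portable C++, and the half ordering
    // assumes little-endian; this is purely illustrative.)
    uint64_t loadSheared() {
      const volatile uint32_t *Halves =
          reinterpret_cast<const volatile uint32_t *>(&X);
      uint64_t Lo = Halves[0];
      uint64_t Hi = Halves[1];
      return (Hi << 32) | Lo;
    }
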
>     Widening would be a problem if the widened load was then split
>     without restoring the atomic marker.  Widening or merging is not
>     inherently problematic assuming all the individual constraints are
>     obeyed.  (Treating widening as canonicalization becomes quite a
>     bit harder though due to profitability questions.)
>
>     RMW operations have to be performed at the original bitwidth.
>     For example, an OR to set the low bit can't be reduced from i64
>     to i8.  This is essentially just a special application of
>     splitting.  It also implies that narrowing the bitwidth of any
>     chain of operations has to stop at the memory access.
>
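
The same idea applied to the RMW point, again as a hypothetical C++ sketch
rather than anything from the patch: narrowing an atomic OR from the full
width to a single byte removes the other bits from the atomic access, which
is the "special application of splitting" described above.

    #include <atomic>
    #include <cstdint>

    std::atomic<uint64_t> Flags;

    // Legal: one atomic read-modify-write at the full 64-bit width.
    void setLowBit() { Flags.fetch_or(1, std::memory_order_relaxed); }

    // Not a legal rewrite of setLowBit(): the RMW no longer covers the other
    // 56 bits, so it is not equivalent to the original full-width operation.
    // (The cast is, once more, illustrative rather than portable C++.)
    void setLowBitNarrowed() {
      auto *LowByte = reinterpret_cast<std::atomic<uint8_t> *>(&Flags);
      LowByte->fetch_or(uint8_t(1), std::memory_order_relaxed);
    }
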
>     Depending on your hardware, you might need to worry about
>     splitting done within the hardware itself.  For instance, some
>     processors split vector loads into two cycles and don't ensure
>     atomicity of the two reads.  So, even if you issue the
>     appropriate load type, you might get a sheared value.  x86 is
>     fairly lenient on this.  Other architectures, not so much.
>
>
>     (Fair warning - I wrote the above off the cuff without thinking it
>     through in detail - that pretty much guarantees I'm wrong about
>     _something_ in the above.)
>
>
>
>>
>>     -- Sean Silva
>>
>>         - "simple" - a load which is none of the above.  These are
>>         normal loads and what most of the optimizer works with.
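
These four categories map directly onto LoadInst's existing predicates; as a
small sketch (classifyLoad is a hypothetical helper, not part of this patch):

    #include "llvm/IR/Instructions.h"

    using namespace llvm;

    // Classify a load according to the categories in the log message above.
    static const char *classifyLoad(const LoadInst &LI) {
      if (LI.isSimple())
        return "simple";      // neither volatile nor atomic
      if (LI.isVolatile())
        return "volatile";    // may additionally be atomic/ordered
      // Not simple and not volatile, so it must be atomic.  isUnordered()
      // separates plain atomicity from orderings that constrain neighbors.
      return LI.isUnordered() ? "atomic (unordered)" : "ordered";
    }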
>>
>>
>>         Modified:
>>         llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
>>         llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
>>         llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp
>>
>>         Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
>>         URL:
>>         http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=254805&r1=254804&r2=254805&view=diff
>>         ==============================================================================
>>         --- llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
>>         (original)
>>         +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
>>         Fri Dec  4 18:18:33 2015
>>         @@ -42,11 +42,13 @@ class Value;
>>          /// \brief Information about a load/store intrinsic defined
>>         by the target.
>>          struct MemIntrinsicInfo {
>>            MemIntrinsicInfo()
>>         -      : ReadMem(false), WriteMem(false), Vol(false),
>>         MatchingId(0),
>>         +      : ReadMem(false), WriteMem(false), IsSimple(false),
>>         MatchingId(0),
>>                  NumMemRefs(0), PtrVal(nullptr) {}
>>            bool ReadMem;
>>            bool WriteMem;
>>         -  bool Vol;
>>         +  /// True only if this memory operation is non-volatile,
>>         non-atomic, and
>>         +  /// unordered.  (See LoadInst/StoreInst for details on each)
>>         +  bool IsSimple;
>>            // Same Id is set by the target for corresponding
>>         load/store intrinsics.
>>            unsigned short MatchingId;
>>            int NumMemRefs;
>>
>>         Modified:
>>         llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
>>         URL:
>>         http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp?rev=254805&r1=254804&r2=254805&view=diff
>>         ==============================================================================
>>         ---
>>         llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
>>         (original)
>>         +++
>>         llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
>>         Fri Dec  4 18:18:33 2015
>>         @@ -538,7 +538,7 @@ bool AArch64TTIImpl::getTgtMemIntrinsic(
>>            case Intrinsic::aarch64_neon_ld4:
>>              Info.ReadMem = true;
>>              Info.WriteMem = false;
>>         -    Info.Vol = false;
>>         +    Info.IsSimple = true;
>>              Info.NumMemRefs = 1;
>>              Info.PtrVal = Inst->getArgOperand(0);
>>              break;
>>         @@ -547,7 +547,7 @@ bool AArch64TTIImpl::getTgtMemIntrinsic(
>>            case Intrinsic::aarch64_neon_st4:
>>              Info.ReadMem = false;
>>              Info.WriteMem = true;
>>         -    Info.Vol = false;
>>         +    Info.IsSimple = true;
>>              Info.NumMemRefs = 1;
>>              Info.PtrVal =
>>         Inst->getArgOperand(Inst->getNumArgOperands() - 1);
>>              break;
>>
>>         Modified: llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp
>>         URL:
>>         http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp?rev=254805&r1=254804&r2=254805&view=diff
>>         ==============================================================================
>>         --- llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp (original)
>>         +++ llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp Fri Dec  4
>>         18:18:33 2015
>>         @@ -388,8 +388,8 @@ private:
>>            class ParseMemoryInst {
>>            public:
>>              ParseMemoryInst(Instruction *Inst, const
>>         TargetTransformInfo &TTI)
>>         -        : Load(false), Store(false), Vol(false),
>>         MayReadFromMemory(false),
>>         -          MayWriteToMemory(false), MatchingId(-1),
>>         Ptr(nullptr) {
>>         +      : Load(false), Store(false), IsSimple(true),
>>         MayReadFromMemory(false),
>>         +        MayWriteToMemory(false), MatchingId(-1), Ptr(nullptr) {
>>                MayReadFromMemory = Inst->mayReadFromMemory();
>>                MayWriteToMemory = Inst->mayWriteToMemory();
>>                if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(Inst)) {
>>         @@ -402,22 +402,22 @@ private:
>>                    MatchingId = Info.MatchingId;
>>                    MayReadFromMemory = Info.ReadMem;
>>                    MayWriteToMemory = Info.WriteMem;
>>         -          Vol = Info.Vol;
>>         +          IsSimple = Info.IsSimple;
>>                    Ptr = Info.PtrVal;
>>                  }
>>                } else if (LoadInst *LI = dyn_cast<LoadInst>(Inst)) {
>>                  Load = true;
>>         -        Vol = !LI->isSimple();
>>         +        IsSimple = LI->isSimple();
>>                  Ptr = LI->getPointerOperand();
>>                } else if (StoreInst *SI = dyn_cast<StoreInst>(Inst)) {
>>                  Store = true;
>>         -        Vol = !SI->isSimple();
>>         +        IsSimple = SI->isSimple();
>>                  Ptr = SI->getPointerOperand();
>>                }
>>              }
>>              bool isLoad() const { return Load; }
>>              bool isStore() const { return Store; }
>>         -    bool isVolatile() const { return Vol; }
>>         +    bool isSimple() const { return IsSimple; }
>>              bool isMatchingMemLoc(const ParseMemoryInst &Inst) const {
>>                return Ptr == Inst.Ptr && MatchingId == Inst.MatchingId;
>>              }
>>         @@ -430,7 +430,7 @@ private:
>>            private:
>>              bool Load;
>>              bool Store;
>>         -    bool Vol;
>>         +    bool IsSimple;
>>              bool MayReadFromMemory;
>>              bool MayWriteToMemory;
>>              // For regular (non-intrinsic) loads/stores, this is set
>>         to -1. For
>>         @@ -554,8 +554,8 @@ bool EarlyCSE::processNode(DomTreeNode *
>>              ParseMemoryInst MemInst(Inst, TTI);
>>              // If this is a non-volatile load, process it.
>>              if (MemInst.isValid() && MemInst.isLoad()) {
>>         -      // Ignore volatile loads.
>>         -      if (MemInst.isVolatile()) {
>>         +      // Ignore volatile or ordered loads.
>>         +      if (!MemInst.isSimple()) {
>>                  LastStore = nullptr;
>>                  // Don't CSE across synchronization boundaries.
>>                  if (Inst->mayWriteToMemory())
>>         @@ -662,8 +662,8 @@ bool EarlyCSE::processNode(DomTreeNode *
>>                      MemInst.getPtr(),
>>                      LoadValue(Inst, CurrentGeneration,
>>         MemInst.getMatchingId()));
>>
>>         -        // Remember that this was the last store we saw for DSE.
>>         -        if (!MemInst.isVolatile())
>>         +        // Remember that this was the last normal store we
>>         saw for DSE.
>>         +        if (MemInst.isSimple())
>>                    LastStore = Inst;
>>                }
>>              }
>>
>>
>>
>>
>
>
