[llvm] r254805 - [EarlyCSE] IsSimple vs IsVolatile naming clarification (NFC)

Fri Dec 4 18:32:34 PST 2015

On 12/04/2015 06:16 PM, Sean Silva wrote:
>
>
> On Fri, Dec 4, 2015 at 4:18 PM, Philip Reames via llvm-commits 
> <llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>> wrote:
>
>     Author: reames
>     Date: Fri Dec  4 18:18:33 2015
>     New Revision: 254805
>
>     URL: http://llvm.org/viewvc/llvm-project?rev=254805&view=rev
>     Log:
>     [EarlyCSE] IsSimple vs IsVolatile naming clarification (NFC)
>
>     When the notion of target specific memory intrinsics was
>     introduced to EarlyCSE, the commit confused the notions of
>     volatile and simple memory access.  Since I'm about to start
>     working on this area, cleanup the naming so that patches aren't
>     horribly confusing.  Note that the actual implementation was
>     always bailing if the load or store wasn't simple.
>
>     Reminder:
>     - "volatile" - C++ volatile, can't remove any memory operations,
>     but in principal unordered
>     - "ordered" - imposes ordering constraints on other nearby memory
>     operations
>     - "atomic" - can't be split or sheared.  In LLVM terms, all
>     "ordered" operations are also atomic so the predicate "isAtomic"
>     is often used.
>
>
> What does "sheared" mean in this context? Widened?
Any operation which can result in less than the original bit-width being 
issued all the way through the memory system.  Though, looking around, I 
can't find a precise quotable definition either.  I'll have to dig a bit 
further.  In practice, the following types of things tend to be problematic:

Splitting is a problem - so, you can't value forward half a load and 
reload the other half.  Nor can you do two i32 loads to build an i64.  
On some platforms, this implies a major cost.

Widening would be a problem if the widened load was then split without 
restoring the atomic marker.  Widening or merging is not inherently 
problematic assuming all the individual constraints are obeyed.  
(Treating widening as canonicalization becomes quite a bit harder though 
due to profitability questions.)

RMW operations have to be performed at the original bitwidth.  i.e. an 
OR to set the low bit can't be reduced from i64 to i8.  This is 
essentially just a special application of splitting.  It also implies 
that narrow the bitwidth of any chain of operations has to stop at the 
memory access.

Depending on your hardware, you might need to worry about splitting done 
within the hardware itself.  For instance, some split vector loads into 
two cycles and don't ensure atomicity of the two reads. So, even if you 
issue the appropriate load type, you might get a sheared value.  x86 is 
fairly lenient on this.  Other architectures, not so much.

(Fair warning - I wrote the above off the cuff without thinking it 
through in detail - that pretty much guarantees I'm wrong about 
_something_ in the above.)

>
> -- Sean Silva
>
>     - "simple" - a load which is none of the above.  These are normal
>     loads and what most of the optimizer works with.
>
>
>     Modified:
>         llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
>     llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
>         llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp
>
>     Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
>     URL:
>     http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=254805&r1=254804&r2=254805&view=diff
>     ==============================================================================
>     --- llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h (original)
>     +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h Fri
>     Dec  4 18:18:33 2015
>     @@ -42,11 +42,13 @@ class Value;
>      /// \brief Information about a load/store intrinsic defined by
>     the target.
>      struct MemIntrinsicInfo {
>        MemIntrinsicInfo()
>     -      : ReadMem(false), WriteMem(false), Vol(false), MatchingId(0),
>     +      : ReadMem(false), WriteMem(false), IsSimple(false),
>     MatchingId(0),
>              NumMemRefs(0), PtrVal(nullptr) {}
>        bool ReadMem;
>        bool WriteMem;
>     -  bool Vol;
>     +  /// True only if this memory operation is non-volatile,
>     non-atomic, and
>     +  /// unordered.  (See LoadInst/StoreInst for details on each)
>     +  bool IsSimple;
>        // Same Id is set by the target for corresponding load/store
>     intrinsics.
>        unsigned short MatchingId;
>        int NumMemRefs;
>
>     Modified: llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
>     URL:
>     http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp?rev=254805&r1=254804&r2=254805&view=diff
>     ==============================================================================
>     --- llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
>     (original)
>     +++ llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
>     Fri Dec  4 18:18:33 2015
>     @@ -538,7 +538,7 @@ bool AArch64TTIImpl::getTgtMemIntrinsic(
>        case Intrinsic::aarch64_neon_ld4:
>          Info.ReadMem = true;
>          Info.WriteMem = false;
>     -    Info.Vol = false;
>     +    Info.IsSimple = true;
>          Info.NumMemRefs = 1;
>          Info.PtrVal = Inst->getArgOperand(0);
>          break;
>     @@ -547,7 +547,7 @@ bool AArch64TTIImpl::getTgtMemIntrinsic(
>        case Intrinsic::aarch64_neon_st4:
>          Info.ReadMem = false;
>          Info.WriteMem = true;
>     -    Info.Vol = false;
>     +    Info.IsSimple = true;
>          Info.NumMemRefs = 1;
>          Info.PtrVal = Inst->getArgOperand(Inst->getNumArgOperands() - 1);
>          break;
>
>     Modified: llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp
>     URL:
>     http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp?rev=254805&r1=254804&r2=254805&view=diff
>     ==============================================================================
>     --- llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp (original)
>     +++ llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp Fri Dec 4
>     18:18:33 2015
>     @@ -388,8 +388,8 @@ private:
>        class ParseMemoryInst {
>        public:
>          ParseMemoryInst(Instruction *Inst, const TargetTransformInfo
>     &TTI)
>     -        : Load(false), Store(false), Vol(false),
>     MayReadFromMemory(false),
>     -          MayWriteToMemory(false), MatchingId(-1), Ptr(nullptr) {
>     +      : Load(false), Store(false), IsSimple(true),
>     MayReadFromMemory(false),
>     +        MayWriteToMemory(false), MatchingId(-1), Ptr(nullptr) {
>            MayReadFromMemory = Inst->mayReadFromMemory();
>            MayWriteToMemory = Inst->mayWriteToMemory();
>            if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(Inst)) {
>     @@ -402,22 +402,22 @@ private:
>                MatchingId = Info.MatchingId;
>                MayReadFromMemory = Info.ReadMem;
>                MayWriteToMemory = Info.WriteMem;
>     -          Vol = Info.Vol;
>     +          IsSimple = Info.IsSimple;
>                Ptr = Info.PtrVal;
>              }
>            } else if (LoadInst *LI = dyn_cast<LoadInst>(Inst)) {
>              Load = true;
>     -        Vol = !LI->isSimple();
>     +        IsSimple = LI->isSimple();
>              Ptr = LI->getPointerOperand();
>            } else if (StoreInst *SI = dyn_cast<StoreInst>(Inst)) {
>              Store = true;
>     -        Vol = !SI->isSimple();
>     +        IsSimple = SI->isSimple();
>              Ptr = SI->getPointerOperand();
>            }
>          }
>          bool isLoad() const { return Load; }
>          bool isStore() const { return Store; }
>     -    bool isVolatile() const { return Vol; }
>     +    bool isSimple() const { return IsSimple; }
>          bool isMatchingMemLoc(const ParseMemoryInst &Inst) const {
>            return Ptr == Inst.Ptr && MatchingId == Inst.MatchingId;
>          }
>     @@ -430,7 +430,7 @@ private:
>        private:
>          bool Load;
>          bool Store;
>     -    bool Vol;
>     +    bool IsSimple;
>          bool MayReadFromMemory;
>          bool MayWriteToMemory;
>          // For regular (non-intrinsic) loads/stores, this is set to
>     -1. For
>     @@ -554,8 +554,8 @@ bool EarlyCSE::processNode(DomTreeNode *
>          ParseMemoryInst MemInst(Inst, TTI);
>          // If this is a non-volatile load, process it.
>          if (MemInst.isValid() && MemInst.isLoad()) {
>     -      // Ignore volatile loads.
>     -      if (MemInst.isVolatile()) {
>     +      // Ignore volatile or ordered loads.
>     +      if (!MemInst.isSimple()) {
>              LastStore = nullptr;
>              // Don't CSE across synchronization boundaries.
>              if (Inst->mayWriteToMemory())
>     @@ -662,8 +662,8 @@ bool EarlyCSE::processNode(DomTreeNode *
>                  MemInst.getPtr(),
>                  LoadValue(Inst, CurrentGeneration,
>     MemInst.getMatchingId()));
>
>     -        // Remember that this was the last store we saw for DSE.
>     -        if (!MemInst.isVolatile())
>     +        // Remember that this was the last normal store we saw
>     for DSE.
>     +        if (MemInst.isSimple())
>                LastStore = Inst;
>            }
>          }
>
>
>     _______________________________________________
>     llvm-commits mailing list
>     llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20151204/0df6cbee/attachment.html>