[llvm] r254805 - [EarlyCSE] IsSimple vs IsVolatile naming clarification (NFC)
Sean Silva via llvm-commits
llvm-commits at lists.llvm.org
Fri Dec 4 18:37:29 PST 2015
So what is the distinction between "split" and "sheared"? Is "split"
basically a special case of "sheared"?
On Fri, Dec 4, 2015 at 6:32 PM, Philip Reames <listmail at philipreames.com>
wrote:
>
>
> On 12/04/2015 06:16 PM, Sean Silva wrote:
>
>
>
> On Fri, Dec 4, 2015 at 4:18 PM, Philip Reames via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
>
>> Author: reames
>> Date: Fri Dec 4 18:18:33 2015
>> New Revision: 254805
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=254805&view=rev
>> Log:
>> [EarlyCSE] IsSimple vs IsVolatile naming clarification (NFC)
>>
>> When the notion of target specific memory intrinsics was introduced to
>> EarlyCSE, the commit confused the notions of volatile and simple memory
>> access. Since I'm about to start working on this area, clean up the naming
>> so that patches aren't horribly confusing. Note that the actual
>> implementation was always bailing if the load or store wasn't simple.
>>
>> Reminder:
>> - "volatile" - C++ volatile, can't remove any memory operations, but in
>> principle unordered
>> - "ordered" - imposes ordering constraints on other nearby memory
>> operations
>> - "atomic" - can't be split or sheared. In LLVM terms, all "ordered"
>> operations are also atomic so the predicate "isAtomic" is often used.
>>
>
> What does "sheared" mean in this context? Widened?
>
> Any transformation which can result in the access being performed at less
> than the original bit-width somewhere in the memory system. Though, looking
> around, I can't find a precise quotable definition either. I'll have to dig
> a bit further. In practice, the following types of things tend to be
> problematic:
>
> Splitting is a problem - so, you can't value forward half a load and
> reload the other half. Nor can you do two i32 loads to build an i64. On
> some platforms, this implies a major cost.
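[Editor's note: the distinction above can be sketched in source-level C++ with std::atomic; this is an illustrative example, not code from the patch. Splitting the loaded *value* in registers is fine, but performing the access itself as two narrower loads is the forbidden transformation, since a concurrent store could land between them.]

```cpp
#include <atomic>
#include <cstdint>

std::atomic<uint64_t> g{0x1122334455667788ULL};

// Legal: one atomic 64-bit access; splitting the loaded *value* into
// halves afterwards happens in registers, not in the memory system.
uint64_t load_then_split() {
  uint64_t v = g.load(std::memory_order_relaxed);
  uint32_t lo = static_cast<uint32_t>(v);
  uint32_t hi = static_cast<uint32_t>(v >> 32);
  return (static_cast<uint64_t>(hi) << 32) | lo;
}

// Illegal as a *transformation* of the above: performing two separate
// 32-bit loads of the halves could observe pieces of two different
// stores (a torn value), which a single atomic i64 load never can.
```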
>
> Widening would be a problem if the widened load was then split without
> restoring the atomic marker. Widening or merging is not inherently
> problematic assuming all the individual constraints are obeyed. (Treating
> widening as canonicalization becomes quite a bit harder though due to
> profitability questions.)
>
> RMW operations have to be performed at the original bitwidth, i.e. an OR
> to set the low bit can't be reduced from i64 to i8. This is essentially
> just a special application of splitting. It also implies that narrowing the
> bitwidth of any chain of operations has to stop at the memory access.
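[Editor's note: a hedged C++ sketch of the RMW point, not from the patch. The OR must be issued as a full-width atomic RMW; rewriting it as a byte-sized fetch_or on the low byte would be exactly the splitting described above.]

```cpp
#include <atomic>
#include <cstdint>

std::atomic<uint64_t> flags{0};

// The whole 64-bit location participates in the atomic RMW; narrowing
// this to an 8-bit fetch_or on the low byte would split the access.
uint64_t set_low_bit() {
  return flags.fetch_or(1, std::memory_order_relaxed); // returns the old value
}
```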
>
> Depending on your hardware, you might need to worry about splitting done
> within the hardware itself. For instance, some split vector loads into two
> cycles and don't ensure atomicity of the two reads. So, even if you issue
> the appropriate load type, you might get a sheared value. x86 is fairly
> lenient on this. Other architectures, not so much.
>
>
> (Fair warning - I wrote the above off the cuff without thinking it through
> in detail - that pretty much guarantees I'm wrong about _something_ in the
> above.)
>
>
>
>
> -- Sean Silva
>
>
>> - "simple" - a load which is none of the above. These are normal loads
>> and what most of the optimizer works with.
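[Editor's note: for readers mapping the four categories above to source-level C++, a rough and hedged correspondence; the names here are illustrative, not from the patch. C++ relaxed ordering corresponds to LLVM's "monotonic"; LLVM's weaker "unordered" has no direct C++ spelling.]

```cpp
#include <atomic>

volatile int v = 0;     // "volatile": accesses may not be removed
std::atomic<int> a{0};  // atomic<int> covers the "atomic"/"ordered" cases
int plain = 0;          // "simple": an ordinary load/store

int demo() {
  v = 1;                                 // volatile store: must be emitted
  a.store(1, std::memory_order_relaxed); // atomic, no ordering (LLVM: monotonic)
  a.store(2, std::memory_order_release); // "ordered": constrains nearby accesses
  plain = 3;                             // simple store: fair game for the optimizer
  return plain + a.load(std::memory_order_acquire);
}
```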
>>
>>
>> Modified:
>> llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
>> llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
>> llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp
>>
>> Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=254805&r1=254804&r2=254805&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h (original)
>> +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h Fri Dec 4
>> 18:18:33 2015
>> @@ -42,11 +42,13 @@ class Value;
>> /// \brief Information about a load/store intrinsic defined by the
>> target.
>> struct MemIntrinsicInfo {
>> MemIntrinsicInfo()
>> - : ReadMem(false), WriteMem(false), Vol(false), MatchingId(0),
>> + : ReadMem(false), WriteMem(false), IsSimple(false), MatchingId(0),
>> NumMemRefs(0), PtrVal(nullptr) {}
>> bool ReadMem;
>> bool WriteMem;
>> - bool Vol;
>> + /// True only if this memory operation is non-volatile, non-atomic, and
>> + /// unordered. (See LoadInst/StoreInst for details on each)
>> + bool IsSimple;
>> // Same Id is set by the target for corresponding load/store
>> intrinsics.
>> unsigned short MatchingId;
>> int NumMemRefs;
>>
>> Modified: llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp?rev=254805&r1=254804&r2=254805&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
>> (original)
>> +++ llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp Fri Dec
>> 4 18:18:33 2015
>> @@ -538,7 +538,7 @@ bool AArch64TTIImpl::getTgtMemIntrinsic(
>> case Intrinsic::aarch64_neon_ld4:
>> Info.ReadMem = true;
>> Info.WriteMem = false;
>> - Info.Vol = false;
>> + Info.IsSimple = true;
>> Info.NumMemRefs = 1;
>> Info.PtrVal = Inst->getArgOperand(0);
>> break;
>> @@ -547,7 +547,7 @@ bool AArch64TTIImpl::getTgtMemIntrinsic(
>> case Intrinsic::aarch64_neon_st4:
>> Info.ReadMem = false;
>> Info.WriteMem = true;
>> - Info.Vol = false;
>> + Info.IsSimple = true;
>> Info.NumMemRefs = 1;
>> Info.PtrVal = Inst->getArgOperand(Inst->getNumArgOperands() - 1);
>> break;
>>
>> Modified: llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp?rev=254805&r1=254804&r2=254805&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp (original)
>> +++ llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp Fri Dec 4 18:18:33 2015
>> @@ -388,8 +388,8 @@ private:
>> class ParseMemoryInst {
>> public:
>> ParseMemoryInst(Instruction *Inst, const TargetTransformInfo &TTI)
>> - : Load(false), Store(false), Vol(false),
>> MayReadFromMemory(false),
>> - MayWriteToMemory(false), MatchingId(-1), Ptr(nullptr) {
>> + : Load(false), Store(false), IsSimple(true),
>> MayReadFromMemory(false),
>> + MayWriteToMemory(false), MatchingId(-1), Ptr(nullptr) {
>> MayReadFromMemory = Inst->mayReadFromMemory();
>> MayWriteToMemory = Inst->mayWriteToMemory();
>> if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(Inst)) {
>> @@ -402,22 +402,22 @@ private:
>> MatchingId = Info.MatchingId;
>> MayReadFromMemory = Info.ReadMem;
>> MayWriteToMemory = Info.WriteMem;
>> - Vol = Info.Vol;
>> + IsSimple = Info.IsSimple;
>> Ptr = Info.PtrVal;
>> }
>> } else if (LoadInst *LI = dyn_cast<LoadInst>(Inst)) {
>> Load = true;
>> - Vol = !LI->isSimple();
>> + IsSimple = LI->isSimple();
>> Ptr = LI->getPointerOperand();
>> } else if (StoreInst *SI = dyn_cast<StoreInst>(Inst)) {
>> Store = true;
>> - Vol = !SI->isSimple();
>> + IsSimple = SI->isSimple();
>> Ptr = SI->getPointerOperand();
>> }
>> }
>> bool isLoad() const { return Load; }
>> bool isStore() const { return Store; }
>> - bool isVolatile() const { return Vol; }
>> + bool isSimple() const { return IsSimple; }
>> bool isMatchingMemLoc(const ParseMemoryInst &Inst) const {
>> return Ptr == Inst.Ptr && MatchingId == Inst.MatchingId;
>> }
>> @@ -430,7 +430,7 @@ private:
>> private:
>> bool Load;
>> bool Store;
>> - bool Vol;
>> + bool IsSimple;
>> bool MayReadFromMemory;
>> bool MayWriteToMemory;
>> // For regular (non-intrinsic) loads/stores, this is set to -1. For
>> @@ -554,8 +554,8 @@ bool EarlyCSE::processNode(DomTreeNode *
>> ParseMemoryInst MemInst(Inst, TTI);
>> // If this is a non-volatile load, process it.
>> if (MemInst.isValid() && MemInst.isLoad()) {
>> - // Ignore volatile loads.
>> - if (MemInst.isVolatile()) {
>> + // Ignore volatile or ordered loads.
>> + if (!MemInst.isSimple()) {
>> LastStore = nullptr;
>> // Don't CSE across synchronization boundaries.
>> if (Inst->mayWriteToMemory())
>> @@ -662,8 +662,8 @@ bool EarlyCSE::processNode(DomTreeNode *
>> MemInst.getPtr(),
>> LoadValue(Inst, CurrentGeneration, MemInst.getMatchingId()));
>>
>> - // Remember that this was the last store we saw for DSE.
>> - if (!MemInst.isVolatile())
>> + // Remember that this was the last normal store we saw for DSE.
>> + if (MemInst.isSimple())
>> LastStore = Inst;
>> }
>> }
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>
>
>
>