[llvm] r254805 - [EarlyCSE] IsSimple vs IsVolatile naming clarification (NFC)
Sean Silva via llvm-commits
llvm-commits at lists.llvm.org
Fri Dec 4 18:37:29 PST 2015
So what is the distinction between "split" and "sheared"? Is "split"
basically a special case of "sheared"?
On Fri, Dec 4, 2015 at 6:32 PM, Philip Reames <listmail at philipreames.com>
wrote:
>
>
> On 12/04/2015 06:16 PM, Sean Silva wrote:
>
>
>
> On Fri, Dec 4, 2015 at 4:18 PM, Philip Reames via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
>
>> Author: reames
>> Date: Fri Dec 4 18:18:33 2015
>> New Revision: 254805
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=254805&view=rev
>> Log:
>> [EarlyCSE] IsSimple vs IsVolatile naming clarification (NFC)
>>
>> When the notion of target specific memory intrinsics was introduced to
>> EarlyCSE, the commit confused the notions of volatile and simple memory
>> access. Since I'm about to start working on this area, clean up the naming
>> so that patches aren't horribly confusing. Note that the actual
>> implementation was always bailing if the load or store wasn't simple.
>>
>> Reminder:
>> - "volatile" - C++ volatile, can't remove any memory operations, but in
>> principle unordered
>> - "ordered" - imposes ordering constraints on other nearby memory
>> operations
>> - "atomic" - can't be split or sheared. In LLVM terms, all "ordered"
>> operations are also atomic so the predicate "isAtomic" is often used.
>>
>
> What does "sheared" mean in this context? Widened?
>
> Any transformation which can result in the access being performed at less
> than the original bit-width somewhere in the memory system. Though, looking
> around, I can't find a precise quotable definition either. I'll have to dig
> a bit further. In practice, the following types of things tend to be
> problematic:
>
> Splitting is a problem - so, you can't value forward half a load and
> reload the other half. Nor can you do two i32 loads to build an i64. On
> some platforms, this implies a major cost.
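[Editor's note: the distinction above can be sketched in source-level C++ with std::atomic; this is an illustrative example, not code from the patch. Splitting the loaded *value* in registers is fine, but performing the access itself as two narrower loads is the forbidden transformation, since a concurrent store could land between them.]

```cpp
#include <atomic>
#include <cstdint>

std::atomic<uint64_t> g{0x1122334455667788ULL};

// Legal: one atomic 64-bit access; splitting the loaded *value* into
// halves afterwards happens in registers, not in the memory system.
uint64_t load_then_split() {
  uint64_t v = g.load(std::memory_order_relaxed);
  uint32_t lo = static_cast<uint32_t>(v);
  uint32_t hi = static_cast<uint32_t>(v >> 32);
  return (static_cast<uint64_t>(hi) << 32) | lo;
}

// Illegal as a *transformation* of the above: performing two separate
// 32-bit loads of the halves could observe pieces of two different
// stores (a torn value), which a single atomic i64 load never can.
```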
>
> Widening would be a problem if the widened load was then split without
> restoring the atomic marker. Widening or merging is not inherently
> problematic assuming all the individual constraints are obeyed. (Treating
> widening as canonicalization becomes quite a bit harder though due to
> profitability questions.)
>
> RMW operations have to be performed at the original bitwidth, i.e. an OR
> to set the low bit can't be reduced from i64 to i8. This is essentially
> just a special application of splitting. It also implies that narrowing the
> bitwidth of any chain of operations has to stop at the memory access.
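[Editor's note: a hedged C++ sketch of the RMW point, not from the patch. The OR must be issued as a full-width atomic RMW; rewriting it as a byte-sized fetch_or on the low byte would be exactly the splitting described above.]

```cpp
#include <atomic>
#include <cstdint>

std::atomic<uint64_t> flags{0};

// The whole 64-bit location participates in the atomic RMW; narrowing
// this to an 8-bit fetch_or on the low byte would split the access.
uint64_t set_low_bit() {
  return flags.fetch_or(1, std::memory_order_relaxed); // returns the old value
}
```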
>
> Depending on your hardware, you might need to worry about splitting done
> within the hardware itself. For instance, some split vector loads into two
> cycles and don't ensure atomicity of the two reads. So, even if you issue
> the appropriate load type, you might get a sheared value. x86 is fairly
> lenient on this. Other architectures, not so much.
>
>
> (Fair warning - I wrote the above off the cuff without thinking it through
> in detail - that pretty much guarantees I'm wrong about _something_ in the
> above.)
>
>
>
>
> -- Sean Silva
>
>
>> - "simple" - a load which is none of the above. These are normal loads
>> and what most of the optimizer works with.
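[Editor's note: for readers mapping the four categories above to source-level C++, a rough and hedged correspondence; the names here are illustrative, not from the patch. C++ relaxed ordering corresponds to LLVM's "monotonic"; LLVM's weaker "unordered" has no direct C++ spelling.]

```cpp
#include <atomic>

volatile int v = 0;     // "volatile": accesses may not be removed
std::atomic<int> a{0};  // atomic<int> covers the "atomic"/"ordered" cases
int plain = 0;          // "simple": an ordinary load/store

int demo() {
  v = 1;                                 // volatile store: must be emitted
  a.store(1, std::memory_order_relaxed); // atomic, no ordering (LLVM: monotonic)
  a.store(2, std::memory_order_release); // "ordered": constrains nearby accesses
  plain = 3;                             // simple store: fair game for the optimizer
  return plain + a.load(std::memory_order_acquire);
}
```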
>>
>>
>> Modified:
>> llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
>> llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
>> llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp
>>
>> Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=254805&r1=254804&r2=254805&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h (original)
>> +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h Fri Dec 4
>> 18:18:33 2015
>> @@ -42,11 +42,13 @@ class Value;
>> /// \brief Information about a load/store intrinsic defined by the
>> target.
>> struct MemIntrinsicInfo {
>> MemIntrinsicInfo()
>> - : ReadMem(false), WriteMem(false), Vol(false), MatchingId(0),
>> + : ReadMem(false), WriteMem(false), IsSimple(false), MatchingId(0),
>> NumMemRefs(0), PtrVal(nullptr) {}
>> bool ReadMem;
>> bool WriteMem;
>> - bool Vol;
>> + /// True only if this memory operation is non-volatile, non-atomic, and
>> + /// unordered. (See LoadInst/StoreInst for details on each)
>> + bool IsSimple;
>> // Same Id is set by the target for corresponding load/store
>> intrinsics.
>> unsigned short MatchingId;
>> int NumMemRefs;
>>
>> Modified: llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp?rev=254805&r1=254804&r2=254805&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
>> (original)
>> +++ llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp Fri Dec
>> 4 18:18:33 2015
>> @@ -538,7 +538,7 @@ bool AArch64TTIImpl::getTgtMemIntrinsic(
>> case Intrinsic::aarch64_neon_ld4:
>> Info.ReadMem = true;
>> Info.WriteMem = false;
>> - Info.Vol = false;
>> + Info.IsSimple = true;
>> Info.NumMemRefs = 1;
>> Info.PtrVal = Inst->getArgOperand(0);
>> break;
>> @@ -547,7 +547,7 @@ bool AArch64TTIImpl::getTgtMemIntrinsic(
>> case Intrinsic::aarch64_neon_st4:
>> Info.ReadMem = false;
>> Info.WriteMem = true;
>> - Info.Vol = false;
>> + Info.IsSimple = true;
>> Info.NumMemRefs = 1;
>> Info.PtrVal = Inst->getArgOperand(Inst->getNumArgOperands() - 1);
>> break;
>>
>> Modified: llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp?rev=254805&r1=254804&r2=254805&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp (original)
>> +++ llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp Fri Dec 4 18:18:33 2015
>> @@ -388,8 +388,8 @@ private:
>> class ParseMemoryInst {
>> public:
>> ParseMemoryInst(Instruction *Inst, const TargetTransformInfo &TTI)
>> - : Load(false), Store(false), Vol(false),
>> MayReadFromMemory(false),
>> - MayWriteToMemory(false), MatchingId(-1), Ptr(nullptr) {
>> + : Load(false), Store(false), IsSimple(true),
>> MayReadFromMemory(false),
>> + MayWriteToMemory(false), MatchingId(-1), Ptr(nullptr) {
>> MayReadFromMemory = Inst->mayReadFromMemory();
>> MayWriteToMemory = Inst->mayWriteToMemory();
>> if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(Inst)) {
>> @@ -402,22 +402,22 @@ private:
>> MatchingId = Info.MatchingId;
>> MayReadFromMemory = Info.ReadMem;
>> MayWriteToMemory = Info.WriteMem;
>> - Vol = Info.Vol;
>> + IsSimple = Info.IsSimple;
>> Ptr = Info.PtrVal;
>> }
>> } else if (LoadInst *LI = dyn_cast<LoadInst>(Inst)) {
>> Load = true;
>> - Vol = !LI->isSimple();
>> + IsSimple = LI->isSimple();
>> Ptr = LI->getPointerOperand();
>> } else if (StoreInst *SI = dyn_cast<StoreInst>(Inst)) {
>> Store = true;
>> - Vol = !SI->isSimple();
>> + IsSimple = SI->isSimple();
>> Ptr = SI->getPointerOperand();
>> }
>> }
>> bool isLoad() const { return Load; }
>> bool isStore() const { return Store; }
>> - bool isVolatile() const { return Vol; }
>> + bool isSimple() const { return IsSimple; }
>> bool isMatchingMemLoc(const ParseMemoryInst &Inst) const {
>> return Ptr == Inst.Ptr && MatchingId == Inst.MatchingId;
>> }
>> @@ -430,7 +430,7 @@ private:
>> private:
>> bool Load;
>> bool Store;
>> - bool Vol;
>> + bool IsSimple;
>> bool MayReadFromMemory;
>> bool MayWriteToMemory;
>> // For regular (non-intrinsic) loads/stores, this is set to -1. For
>> @@ -554,8 +554,8 @@ bool EarlyCSE::processNode(DomTreeNode *
>> ParseMemoryInst MemInst(Inst, TTI);
>> // If this is a non-volatile load, process it.
>> if (MemInst.isValid() && MemInst.isLoad()) {
>> - // Ignore volatile loads.
>> - if (MemInst.isVolatile()) {
>> + // Ignore volatile or ordered loads.
>> + if (!MemInst.isSimple()) {
>> LastStore = nullptr;
>> // Don't CSE across synchronization boundaries.
>> if (Inst->mayWriteToMemory())
>> @@ -662,8 +662,8 @@ bool EarlyCSE::processNode(DomTreeNode *
>> MemInst.getPtr(),
>> LoadValue(Inst, CurrentGeneration, MemInst.getMatchingId()));
>>
>> - // Remember that this was the last store we saw for DSE.
>> - if (!MemInst.isVolatile())
>> + // Remember that this was the last normal store we saw for DSE.
>> + if (MemInst.isSimple())
>> LastStore = Inst;
>> }
>> }
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>
>
>
>