[llvm] r263072 - [LICM] Store promotion when memory is thread local
Philip Reames via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 9 15:12:58 PST 2016
Fixed in 263073.
Philip
On 03/09/2016 03:09 PM, Philip Reames via llvm-commits wrote:
> FYI, it looks like I screwed up when rebasing this and broke the
> build. Testing a fix now. I'll either have it fixed or reverted
> within a couple of minutes.
>
> On 03/09/2016 02:59 PM, Philip Reames via llvm-commits wrote:
>> Author: reames
>> Date: Wed Mar 9 16:59:30 2016
>> New Revision: 263072
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=263072&view=rev
>> Log:
>> [LICM] Store promotion when memory is thread local
>>
>> This patch teaches LICM's implementation of store promotion to
>> exploit the fact that the memory location being accessed might be
>> provably thread local. The fact that it's thread local weakens the
>> requirements for where we can insert stores, since no other thread can
>> observe the write. This allows us to perform store promotion even in
>> cases where the store is not guaranteed to execute in the loop.
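>>
>> As a rough illustration (my sketch, not code from the patch; 'cond',
>> 'before', and 'after' are hypothetical names), consider the C++ loops
>> below. In 'before' the store is conditional, so the old
>> guaranteed-to-execute rule blocks promotion; 'after' shows what
>> promotion conceptually produces, which is only legal because 'mem' is
>> fresh, never-captured memory that no other thread can observe:
>>
>>   #include <cstdlib>
>>
>>   extern bool cond(int i); // stands in for an unanalyzable branch
>>
>>   int *before(int n) {
>>     // Freshly allocated and never captured, so provably thread local.
>>     // (Null check omitted for brevity, as in the test below.)
>>     int *mem = static_cast<int *>(std::malloc(sizeof(int)));
>>     *mem = 0;
>>     for (int i = 0; i < n; i++)
>>       if (cond(i))
>>         *mem += 1; // store is NOT guaranteed to execute
>>     return mem;
>>   }
>>
>>   int *after(int n) {
>>     int *mem = static_cast<int *>(std::malloc(sizeof(int)));
>>     *mem = 0;
>>     int tmp = *mem; // load speculated into the preheader; safe since
>>                     // the malloc'ed block is known dereferenceable
>>     for (int i = 0; i < n; i++)
>>       if (cond(i))
>>         tmp += 1;
>>     *mem = tmp; // store sunk to the exit; it now runs even on paths
>>                 // that never stored, which only thread locality allows
>>     return mem;
>>   }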
>>
>> Two key assumptions worth drawing out are that a) no-capture is
>> strong enough to imply no-escape, and b) standard allocation
>> functions like malloc, calloc, and operator new return values which
>> can be assumed not to have previously escaped.
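>>
>> For intuition (again my sketch, not part of the patch; 'unknown' and
>> 'g' are hypothetical), a pointer is captured once a copy of it may be
>> held elsewhere, e.g. in a global or by an unknown callee; only while
>> the allocation stays uncaptured can we be sure no other thread can
>> observe stores through it:
>>
>>   #include <cstdlib>
>>
>>   extern void unknown(int *p); // opaque external function
>>   int *g;                      // global another thread might read
>>
>>   void examples() {
>>     int *a = static_cast<int *>(std::malloc(sizeof(int)));
>>     int *b = static_cast<int *>(std::malloc(sizeof(int)));
>>     int *c = static_cast<int *>(std::malloc(sizeof(int)));
>>
>>     *a = 1;     // 'a' never escapes: extra stores are unobservable
>>
>>     g = b;      // 'b' is captured via 'g'; another thread may now
>>     *b = 1;     // watch *b, so inserted stores could be observed
>>
>>     unknown(c); // 'c' is captured by the call; same problem
>>     *c = 1;
>>   }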
>>
>> In future work, it would be nice to generalize this so that it works
>> without directly seeing the allocation site. I believe that the
>> nocapture return attribute should be suitable for this purpose, but I
>> haven't investigated carefully. It's also likely that we could
>> support unescaped allocas with similar reasoning, but since SROA and
>> Mem2Reg should destroy those, they're less interesting than they
>> might first seem.
>>
>> Differential Revision: http://reviews.llvm.org/D16783
>>
>>
>> Added:
>>     llvm/trunk/test/Transforms/LICM/promote-tls.ll
>> Modified:
>>     llvm/trunk/include/llvm/Transforms/Utils/LoopUtils.h
>>     llvm/trunk/lib/Transforms/Scalar/LICM.cpp
>>
>> Modified: llvm/trunk/include/llvm/Transforms/Utils/LoopUtils.h
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/Utils/LoopUtils.h?rev=263072&r1=263071&r2=263072&view=diff
>> ==============================================================================
>>
>> --- llvm/trunk/include/llvm/Transforms/Utils/LoopUtils.h (original)
>> +++ llvm/trunk/include/llvm/Transforms/Utils/LoopUtils.h Wed Mar 9 16:59:30 2016
>> @@ -373,8 +373,8 @@ bool hoistRegion(DomTreeNode *, AliasAna
>>  bool promoteLoopAccessesToScalars(AliasSet &, SmallVectorImpl<BasicBlock*> &,
>>                                    SmallVectorImpl<Instruction*> &,
>>                                    PredIteratorCache &, LoopInfo *,
>> -                                  DominatorTree *, Loop *, AliasSetTracker *,
>> -                                  LICMSafetyInfo *);
>> +                                  DominatorTree *, const TargetLibraryInfo *,
>> +                                  Loop *, AliasSetTracker *, LICMSafetyInfo *);
>> /// \brief Computes safety information for a loop
>> /// checks loop body & header for the possibility of may throw
>>
>> Modified: llvm/trunk/lib/Transforms/Scalar/LICM.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LICM.cpp?rev=263072&r1=263071&r2=263072&view=diff
>> ==============================================================================
>>
>> --- llvm/trunk/lib/Transforms/Scalar/LICM.cpp (original)
>> +++ llvm/trunk/lib/Transforms/Scalar/LICM.cpp Wed Mar 9 16:59:30 2016
>> @@ -35,10 +35,12 @@
>> #include "llvm/Analysis/AliasAnalysis.h"
>> #include "llvm/Analysis/AliasSetTracker.h"
>> #include "llvm/Analysis/BasicAliasAnalysis.h"
>> +#include "llvm/Analysis/CaptureTracking.h"
>> #include "llvm/Analysis/ConstantFolding.h"
>> #include "llvm/Analysis/GlobalsModRef.h"
>> #include "llvm/Analysis/LoopInfo.h"
>> #include "llvm/Analysis/LoopPass.h"
>> +#include "llvm/Analysis/MemoryBuiltins.h"
>> #include "llvm/Analysis/ScalarEvolution.h"
>> #include "llvm/Analysis/ScalarEvolutionAliasAnalysis.h"
>> #include "llvm/Analysis/TargetLibraryInfo.h"
>> @@ -223,7 +225,7 @@ bool LICM::runOnLoop(Loop *L, LPPassMana
>>      // Loop over all of the alias sets in the tracker object.
>>      for (AliasSet &AS : *CurAST)
>>        Changed |= promoteLoopAccessesToScalars(AS, ExitBlocks, InsertPts,
>> -                                              PIC, LI, DT, CurLoop,
>> +                                              PIC, LI, DT, TLI, CurLoop,
>>                                                CurAST, &SafetyInfo);
>>
>>      // Once we have promoted values across the loop body we have to recursively
>> @@ -846,7 +848,9 @@ bool llvm::promoteLoopAccessesToScalars(
>>                                          SmallVectorImpl<BasicBlock*>&ExitBlocks,
>>                                          SmallVectorImpl<Instruction*>&InsertPts,
>>                                          PredIteratorCache &PIC, LoopInfo *LI,
>> -                                        DominatorTree *DT, Loop *CurLoop,
>> +                                        DominatorTree *DT,
>> +                                        const TargetLibraryInfo *TLI,
>> +                                        Loop *CurLoop,
>>                                          AliasSetTracker *CurAST,
>>                                          LICMSafetyInfo * SafetyInfo) {
>> // Verify inputs.
>> @@ -879,10 +883,25 @@ bool llvm::promoteLoopAccessesToScalars(
>>    //
>>    // is not safe, because *P may only be valid to access if 'c' is true.
>>    //
>> +  // The safety property divides into two parts:
>> +  // 1) The memory may not be dereferenceable on entry to the loop.  In this
>> +  //    case, we can't insert the required load in the preheader.
>> +  // 2) The memory model does not allow us to insert a store along any dynamic
>> +  //    path which did not originally have one.
>> +  //
>>    // It is safe to promote P if all uses are direct load/stores and if at
>>    // least one is guaranteed to be executed.
>>    bool GuaranteedToExecute = false;
>> +  // It is also safe to promote P if we can prove that speculating a load into
>> +  // the preheader is safe (i.e. proving dereferenceability on all
>> +  // paths through the loop), and that the memory can be proven thread local
>> +  // (so that the memory model requirement doesn't apply.)  We first establish
>> +  // the former, and then run a capture analysis below to establish the latter.
>> +  // We can use any access within the alias set to prove dereferenceability
>> +  // since they're all must alias.
>> +  bool CanSpeculateLoad = false;
>> +
>>    SmallVector<Instruction*, 64> LoopUses;
>>    SmallPtrSet<Value*, 4> PointerMustAliases;
>> @@ -892,6 +911,16 @@ bool llvm::promoteLoopAccessesToScalars(
>>    AAMDNodes AATags;
>>    bool HasDedicatedExits = CurLoop->hasDedicatedExits();
>> +  // Don't sink stores from loops without dedicated block exits.  Exits
>> +  // containing indirect branches are not transformed by loop simplify,
>> +  // make sure we catch that.  An additional load may be generated in the
>> +  // preheader for SSA updater, so also avoid sinking when no preheader
>> +  // is available.
>> +  if (!HasDedicatedExits || !Preheader)
>> +    return false;
>> +
>> +  const DataLayout &MDL = Preheader->getModule()->getDataLayout();
>> +
>>    // Check that all of the pointers in the alias set have the same type.  We
>>    // cannot (yet) promote a memory location that is loaded and stored in
>>    // different sizes.  While we are at it, collect alignment and AA info.
>> @@ -918,6 +947,12 @@ bool llvm::promoteLoopAccessesToScalars(
>>          assert(!Load->isVolatile() && "AST broken");
>>          if (!Load->isSimple())
>>            return Changed;
>> +
>> +        if (!GuaranteedToExecute && !CanSpeculateLoad)
>> +          CanSpeculateLoad =
>> +            isSafeToExecuteUnconditionally(*Load, DT, TLI, CurLoop,
>> +                                           SafetyInfo,
>> +                                           Preheader->getTerminator());
>>        } else if (const StoreInst *Store = dyn_cast<StoreInst>(UI)) {
>>          // Stores *of* the pointer are not interesting, only stores *to* the
>>          // pointer.
>> @@ -926,13 +961,6 @@ bool llvm::promoteLoopAccessesToScalars(
>>          assert(!Store->isVolatile() && "AST broken");
>>          if (!Store->isSimple())
>>            return Changed;
>> -        // Don't sink stores from loops without dedicated block exits.  Exits
>> -        // containing indirect branches are not transformed by loop simplify,
>> -        // make sure we catch that.  An additional load may be generated in the
>> -        // preheader for SSA updater, so also avoid sinking when no preheader
>> -        // is available.
>> -        if (!HasDedicatedExits || !Preheader)
>> -          return Changed;
>>
>>          // Note that we only check GuaranteedToExecute inside the store case
>>          // so that we do not introduce stores where they did not exist before
>> @@ -953,6 +981,14 @@ bool llvm::promoteLoopAccessesToScalars(
>>            GuaranteedToExecute = isGuaranteedToExecute(*UI, DT, CurLoop,
>>                                                        SafetyInfo);
>> +
>> +        if (!GuaranteedToExecute && !CanSpeculateLoad) {
>> +          CanSpeculateLoad =
>> +            isDereferenceableAndAlignedPointer(Store->getPointerOperand(),
>> +                                               Store->getAlignment(), MDL,
>> +                                               Preheader->getTerminator(),
>> +                                               DT, TLI);
>> +        }
>>        } else
>>          return Changed; // Not a load or store.
>> @@ -968,8 +1004,17 @@ bool llvm::promoteLoopAccessesToScalars(
>>      }
>>    }
>>
>> -  // If there isn't a guaranteed-to-execute instruction, we can't promote.
>> -  if (!GuaranteedToExecute)
>> +  // Check legality per comment above.  Otherwise, we can't promote.
>> +  bool PromotionIsLegal = GuaranteedToExecute;
>> +  if (!PromotionIsLegal && CanSpeculateLoad) {
>> +    // If this is a thread local location, then we can insert stores along
>> +    // paths which originally didn't have them without violating the memory
>> +    // model.
>> +    Value *Object = GetUnderlyingObject(SomePtr, MDL);
>> +    PromotionIsLegal = isAllocLikeFn(Object, TLI) &&
>> +      !PointerMayBeCaptured(Object, true, true);
>> +  }
>> +  if (!PromotionIsLegal)
>>      return Changed;
>>
>>    // Figure out the loop exits and their insertion points, if this is the
>>
>> Added: llvm/trunk/test/Transforms/LICM/promote-tls.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LICM/promote-tls.ll?rev=263072&view=auto
>> ==============================================================================
>>
>> --- llvm/trunk/test/Transforms/LICM/promote-tls.ll (added)
>> +++ llvm/trunk/test/Transforms/LICM/promote-tls.ll Wed Mar 9 16:59:30 2016
>> @@ -0,0 +1,133 @@
>> +; RUN: opt -tbaa -basicaa -licm -S < %s | FileCheck %s
>> +
>> +; If we can prove a local is thread local, we can insert stores during
>> +; promotion which wouldn't be legal otherwise.
>> +
>> +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
>> +target triple = "x86_64-linux-generic"
>> +
>> +@p = external global i8*
>> +
>> +declare i8* @malloc(i64)
>> +
>> +; Exercise the TLS case
>> +define i32* @test(i32 %n) {
>> +entry:
>> + ;; ignore the required null check for simplicity
>> + %mem = call dereferenceable(16) noalias i8* @malloc(i64 16)
>> + %addr = bitcast i8* %mem to i32*
>> + br label %for.body.lr.ph
>> +
>> +for.body.lr.ph: ; preds = %entry
>> + br label %for.header
>> +
>> +for.header:
>> + %i.02 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.body ]
>> + %old = load i32, i32* %addr, align 4
>> + ; deliberate impossible to analyze branch
>> + %guard = load volatile i8*, i8** @p
>> + %exitcmp = icmp eq i8* %guard, null
>> + br i1 %exitcmp, label %for.body, label %early-exit
>> +
>> +early-exit:
>> +; CHECK-LABEL: early-exit:
>> +; CHECK: store i32 %new1.lcssa, i32* %addr, align 1
>> + ret i32* null
>> +
>> +for.body:
>> + %new = add i32 %old, 1
>> + store i32 %new, i32* %addr, align 4
>> + %inc = add nsw i32 %i.02, 1
>> + %cmp = icmp slt i32 %inc, %n
>> + br i1 %cmp, label %for.header, label %for.cond.for.end_crit_edge
>> +
>> +for.cond.for.end_crit_edge: ; preds = %for.body
>> +; CHECK-LABEL: for.cond.for.end_crit_edge:
>> +; CHECK: store i32 %new.lcssa, i32* %addr, align 1
>> + %split = phi i32* [ %addr, %for.body ]
>> + ret i32* null
>> +}
>> +
>> +declare i8* @not_malloc(i64)
>> +
>> +; Negative test - not TLS
>> +define i32* @test_neg(i32 %n) {
>> +entry:
>> + ;; ignore the required null check for simplicity
>> + %mem = call dereferenceable(16) noalias i8* @not_malloc(i64 16)
>> + %addr = bitcast i8* %mem to i32*
>> + br label %for.body.lr.ph
>> +
>> +for.body.lr.ph: ; preds = %entry
>> + br label %for.header
>> +
>> +for.header:
>> + %i.02 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.body ]
>> + %old = load i32, i32* %addr, align 4
>> + ; deliberate impossible to analyze branch
>> + %guard = load volatile i8*, i8** @p
>> + %exitcmp = icmp eq i8* %guard, null
>> + br i1 %exitcmp, label %for.body, label %early-exit
>> +
>> +early-exit:
>> +; CHECK-LABEL: early-exit:
>> +; CHECK-NOT: store
>> + ret i32* null
>> +
>> +for.body:
>> +; CHECK-LABEL: for.body:
>> +; CHECK: store i32 %new, i32* %addr, align 4
>> + %new = add i32 %old, 1
>> + store i32 %new, i32* %addr, align 4
>> + %inc = add nsw i32 %i.02, 1
>> + %cmp = icmp slt i32 %inc, %n
>> + br i1 %cmp, label %for.header, label %for.cond.for.end_crit_edge
>> +
>> +for.cond.for.end_crit_edge: ; preds = %for.body
>> +; CHECK-LABEL: for.cond.for.end_crit_edge:
>> +; CHECK-NOT: store
>> + %split = phi i32* [ %addr, %for.body ]
>> + ret i32* null
>> +}
>> +
>> +; Negative test - can't speculate load since branch
>> +; may control alignment
>> +define i32* @test_neg2(i32 %n) {
>> +entry:
>> + ;; ignore the required null check for simplicity
>> + %mem = call dereferenceable(16) noalias i8* @malloc(i64 16)
>> + %addr = bitcast i8* %mem to i32*
>> + br label %for.body.lr.ph
>> +
>> +for.body.lr.ph: ; preds = %entry
>> + br label %for.header
>> +
>> +for.header:
>> + %i.02 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.body ]
>> + ; deliberate impossible to analyze branch
>> + %guard = load volatile i8*, i8** @p
>> + %exitcmp = icmp eq i8* %guard, null
>> + br i1 %exitcmp, label %for.body, label %early-exit
>> +
>> +early-exit:
>> +; CHECK-LABEL: early-exit:
>> +; CHECK-NOT: store
>> + ret i32* null
>> +
>> +for.body:
>> +; CHECK-LABEL: for.body:
>> +; CHECK: store i32 %new, i32* %addr, align 4
>> + %old = load i32, i32* %addr, align 4
>> + %new = add i32 %old, 1
>> + store i32 %new, i32* %addr, align 4
>> + %inc = add nsw i32 %i.02, 1
>> + %cmp = icmp slt i32 %inc, %n
>> + br i1 %cmp, label %for.header, label %for.cond.for.end_crit_edge
>> +
>> +for.cond.for.end_crit_edge: ; preds = %for.body
>> +; CHECK-LABEL: for.cond.for.end_crit_edge:
>> +; CHECK-NOT: store
>> + %split = phi i32* [ %addr, %for.body ]
>> + ret i32* null
>> +}
>> +
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits