[LLVMdev] RFC: Loop versioning for LICM
Hal Finkel
hfinkel at anl.gov
Thu Mar 5 21:17:22 PST 2015
----- Original Message -----
> From: "Ashutosh Nema" <Ashutosh.Nema at amd.com>
> To: llvmdev at cs.uiuc.edu
> Sent: Thursday, February 26, 2015 4:31:18 AM
> Subject: [LLVMdev] RFC: Loop versioning for LICM
> I like to propose a new loop multi versioning optimization for LICM.
> For now I kept this for LICM only, but it can be used in multiple
> places.
> The main motivation is to allow optimizations stuck because of memory
> alias dependencies. Most of the time when alias analysis is unsure
> about
> memory access and it says may-alias. This un surety from alias
> analysis restrict
> some of the memory based optimizations to proceed further.
> We observed some cases with LICM, where things are beyond aliasing.
> In cases where alias analysis is unsure we like to use loop
> versioning as an alternative.
> Loop Versioning will creates version of the loop with aggressive
> alias and the other
> with conservative (default) alias. Aggressive alias version of loop
> will have all the
> memory access marked as no-alias. These two version of loop will be
> preceded by a
> memory runtime check. This runtime check consists of bound checks for
> all unique memory
> accessed in loop, and it ensures aliasing of memory. Based on this
> check result at runtime
> any of the loops gets executed, if memory is non aliased then
> aggressive aliasing loop
> gets executed, else when memory is aliased then non aggressive
> aliased version gets executed.
Hi Ashutosh,
I think this is a really interesting idea, and I'd like to encourage you to continue working on it. Regarding profitability, I think you'll want to check that you'll be able to hoist/sink a significant number of the memory accesses inside the new "versioned" loop. If a loop has a large number of accesses, and the new loop body only differs by the hosting/sinking of a few, then you're unlikely to see the difference.
-Hal
> By setting no-alias to memory accessed in aggressive alias version of
> loop, enable other
> optimization to continue further.
> Following are the top level steps:
> 1) Perform loop do versioning feasibility check.
> 2) If loop is a candidate for versioning then create a memory bound
> check, by considering
> all the memory access in loop body.
> 3) Clone original loop and set all memory access as no-alias in new
> loop.
> 4) Set original loop & versioned loop as a branch target of runtime
> check result.
> 5) Call LICM on aggressive alias versioned of loop(For now LICM is
> scheduled later and not directly
> called from LoopVersioning pass).
> Consider following test:
> 1 int foo(int * var1, int * var2, int * var3, unsigned itr) {
> 2 unsigned i = 0, j = 0;
> 3 for(; i < itr; i++) {
> 4 for(; j < itr; j++) {
> 5 var1[j] = itr + i;
> 6 var3[i] = var1[j] + var3[i];
> 7 }
> 8 }
> 9 }
> At line #6 store to var3 can be moved out by
> LICM(promoteLoopAccessesToScalars)
> but because of alias analysis un surety about memory access it unable
> to move it out.
> After Loop versioning IR:
> <Versioned Loop>
> for.body3.loopVersion: ; preds = %for.body3.loopVersion.preheader,
> %for.body3.loopVersion
> %indvars.iv.loopVersion = phi i64 [ %indvars.iv.next.loopVersion,
> %for.body3.loopVersion ], [ %2, %for.body3.loopVersion.preheader ]
> %arrayidx.loopVersion = getelementptr inbounds i32* %var1, i64
> %indvars.iv.loopVersion
> store i32 %add, i32* %arrayidx.loopVersion, align 4, !tbaa !1,
> !alias.scope !11, !noalias !11
> %indvars.iv.next.loopVersion = add nuw nsw i64
> %indvars.iv.loopVersion, 1
> %lftr.wideiv.loopVersion = trunc i64 %indvars.iv.loopVersion to i32
> %exitcond.loopVersion = icmp eq i32 %lftr.wideiv.loopVersion, %0
> br i1 %exitcond.loopVersion, label %for.inc11.loopexit38, label
> %for.body3.loopVersion
> <Original Loop>
> for.body3: ; preds = %for.body3.lr.ph, %for.body3
> %indvars.iv = phi i64 [ %indvars.iv.next, %for.body3 ], [ %2,
> %for.body3.lr.ph ]
> %arrayidx = getelementptr inbounds i32* %var1, i64 %indvars.iv
> store i32 %add, i32* %arrayidx, align 4, !tbaa !1
> %8 = load i32* %arrayidx7, align 4, !tbaa !1
> %add8 = add nsw i32 %8, %add
> store i32 %add8, i32* %arrayidx7, align 4, !tbaa !1
> %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
> %lftr.wideiv = trunc i64 %indvars.iv to i32
> %exitcond = icmp eq i32 %lftr.wideiv, %0
> br i1 %exitcond, label %for.inc11, label %for.body3
> In versioned loop difference is visible, 1 store has moved out.
> Following are some high level details about current implementation:
> - LoopVersioning
> LoopVersioning is main class which holds multi versioning
> functionality.
> - LoopVersioning :: isVersioningBeneficial
> Its member to ‘LoopVersioning’
> Does feasibility check for loop versioning.
> a) Checks layout of loop.
> b) Instruction level check.
> c) memory checks.
> - LoopVersioning :: versionizeLoop
> a) Clone original loo
> b) Create a runtime memory check.
> c) Add both loops under runtime check results target.
> - RuntimeMemoryCheck
> This class take cares runtime memory check.
> - RuntimeMemoryCheck ::createRuntimeCheck
> It creates runtime memory check.
> In this patch used maximum loop nest threshold as 2, and maximum
> number
> of pointers in runtime memory check as 5.
> Later I like to make this as a utility so others can use it.
> Requesting to go through patch for detailed approach.
> Patch available at http://reviews.llvm.org/D7900
> Suggestions are comments are welcome.
> Regards,
> Ashutosh
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150305/e675f705/attachment.html>
More information about the llvm-dev
mailing list