[PATCH/RFC] Pre-increment preparation pass

Wed Jan 30 22:59:11 PST 2013

Hi Hal, 

Isn't this something that needs to go into LSR?

Thanks,
Nadav

On Jan 29, 2013, at 1:31 PM, Hal Finkel <hfinkel at anl.gov> wrote:

> Hello again,
> 
> When targeting PPC A2 cores, use of pre-increment loads and stores is very important for performance. The fundamental problem with the generation of pre-increment instructions is that most loops are not naturally written to take advantage of them. For example, a loop written as:
> for (int i = 0; i < N; ++i) {
>  x[i] = y[i];
> }
> needs to be transformed to look more like this:
> T *a = &x[-1], *b = &y[-1];
> for (int i = 0; i < N; ++i) {
>  *++a = *++b;
> }
> 
> I've attached a pass that performs this transformation. For unrolled loops (or other situations with multiple accesses to the same array), the lowest-offset use is transformed into the pre-increment access and the others are updated to base their addressing off of the pre-increment access. I schedule this pass after LSR is run.
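
For illustration only (this is not taken from the attached patch), the unrolled case described above might look roughly like the following in C for an unroll factor of 2. The function names and the checker are hypothetical. The sketch also assumes each array has at least two addressable elements in front of it, because forming a pointer before the start of an array is undefined in C, whereas the pass works at the IR level.

```c
#include <assert.h>
#include <stddef.h>

/* Indexed form of a loop unrolled by 2: both accesses are addressed via i. */
void copy_indexed(int *x, const int *y, size_t n) {
    assert(n % 2 == 0);
    for (size_t i = 0; i < n; i += 2) {
        x[i]     = y[i];
        x[i + 1] = y[i + 1];
    }
}

/* Prepared form. ASSUMPTION: x and y each have at least 2 addressable
 * elements in front of them, so x - 2 and y - 2 are valid pointers. */
void copy_preinc(int *x, const int *y, size_t n) {
    assert(n % 2 == 0);
    int *a = x - 2;
    const int *b = y - 2;
    for (size_t i = 0; i < n; i += 2) {
        *(a += 2) = *(b += 2);  /* lowest offset: pre-increment by the stride */
        a[1] = b[1];            /* other access: fixed offset from the same base */
    }
}

/* Hypothetical checker: runs both forms on padded buffers and returns the
 * number of elements on which the two results differ. */
int count_mismatches(void) {
    enum { PAD = 2, LEN = 8 };
    int src[PAD + LEN], d1[PAD + LEN] = {0}, d2[PAD + LEN] = {0};
    for (int i = 0; i < PAD + LEN; ++i) src[i] = 10 * i;
    copy_indexed(d1 + PAD, src + PAD, LEN);
    copy_preinc(d2 + PAD, src + PAD, LEN);
    int bad = 0;
    for (int i = 0; i < PAD + LEN; ++i) bad += (d1[i] != d2[i]);
    return bad;
}
```

In the prepared form neither access depends on the induction variable, which is then used only for counting iterations.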
> 
> This seems to work pretty well, but I don't know if this is the best way of doing what I'd like. Should this really be part of LSR somehow?
> 
> In case you're curious, for inner loops (where this really matters), the induction variable is often moved into the special loop counter register, and so removing the dependence of the addressing on the induction variable is also a good thing.
> 
> Thanks again,
> Hal
> 
> -- 
> Hal Finkel
> Postdoctoral Appointee
> Leadership Computing Facility
> Argonne National Laboratory
> <incamprep.patch>