[llvm-commits] Global Merge Pass for ARM

Eric Christopher echristo at apple.com
Sun May 16 18:57:37 PDT 2010


> 
> Please find the patch, which can be viewed as an early approximation of
> the "section anchors" feature seen in gcc.
> 
> It tries to solve the following problem: consider code that touches
> several global variables at once, e.g.:
> 
> <=cut=>
> static int foo[N], bar[N], baz[N];
> 
> for (i = 0; i < N; ++i) {
>  foo[i] = bar[i] * baz[i];
> }
> <=cut=>
> 
> On ARM the addresses of all 3 arrays have to be kept in registers, so
> this code has quite high register pressure (loop body):
> 
>        ldr     r1, [r5], #4
>        ldr     r2, [r6], #4
>        mul     r1, r2, r1
>        str     r1, [r0], #4
> 
> The pass converts the code to something like:
> 
> <=cut=>
> static struct {
>  int foo[N];
>  int bar[N];
>  int baz[N];
> } merged;
> 
> for (i = 0; i < N; ++i) {
>  merged.foo[i] = merged.bar[i] * merged.baz[i];
> }
> <=cut=>
> 
> and in ARM code this becomes:
>        ldr     r0, [r5, #40]
>        ldr     r1, [r5, #80]
>        mul     r0, r1, r0
>        str     r0, [r5], #4
> 
> Note that we saved 2 registers here.
> 
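To make the offsets in the listing above concrete, here is a minimal C sketch of the merged layout. It assumes N = 10 and a 4-byte int, which is what the #40 and #80 offsets in the quoted ARM code suggest; the email does not state N. The names merged_globals, offset_of_bar, offset_of_baz and multiply_arrays are made up for illustration and are not from the patch.

```c
#include <stddef.h>  /* offsetof, size_t */

/* Assumption: N = 10, so that with a 4-byte int the field offsets
   come out as 40 and 80, matching the quoted ARM listing. */
enum { N = 10 };

/* Hypothetical name for the struct the pass would synthesize. */
struct merged_globals {
    int foo[N];
    int bar[N];
    int baz[N];
};

static struct merged_globals merged;

/* Byte offsets of bar and baz from the single base address. */
size_t offset_of_bar(void) { return offsetof(struct merged_globals, bar); }
size_t offset_of_baz(void) { return offsetof(struct merged_globals, baz); }

/* Same loop as in the quoted example, now reading and writing
   through one base object instead of three separate globals. */
void multiply_arrays(void) {
    for (int i = 0; i < N; ++i) {
        merged.foo[i] = merged.bar[i] * merged.baz[i];
    }
}
```

With one base pointer walking foo, the loads of bar[i] and baz[i] are just that pointer plus a constant, which is why a single register is enough in the second listing.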

This is pretty cool and simple so far.  What's the benchmarking look like?

> This way only the address of the merged structure needs to be kept in
> a register. For field accesses, ldr/str with offsets are used.
> The pass correctly distinguishes constant and non-constant globals. The
> maximum size of the struct depends on the instruction set used (it's 4095
> for ARM/Thumb2 and 127 for Thumb1).
> 
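The size cap could be pictured roughly as below. This is only a sketch of the idea, not the patch's actual algorithm: it walks a list of global sizes and counts how many fit under the limit, ignoring alignment and the constant/non-constant split. The limits 4095 and 127 are the ones quoted above; the function name count_fitting and the constant names are hypothetical.

```c
#include <stddef.h>  /* size_t */

/* Limits quoted in the email; the constant names are made up here. */
enum { MAX_MERGED_ARM = 4095, MAX_MERGED_THUMB1 = 127 };

/* Returns how many of the first n globals (given by size) can be
   placed back to back without the running total exceeding max_size.
   Assumption: no padding or alignment is modelled. */
size_t count_fitting(const size_t *sizes, size_t n, size_t max_size) {
    size_t total = 0;
    size_t count = 0;
    /* total <= max_size always holds, so the subtraction cannot wrap. */
    while (count < n && sizes[count] <= max_size - total) {
        total += sizes[count];
        ++count;
    }
    return count;
}
```

For instance, three 40-byte arrays add up to 120 and fit under either limit, while a fourth one would only fit under the ARM/Thumb2 limit.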
> Maybe PPC can benefit from this pass as well, but I'm not yet sure.
> 
> Ok to commit?

At the very least do you mind including the above writeup in a big block comment
on the pass?  I make no claims about the rest of it, but the pass is pretty
sparse on documentation and much easier to read with the example there.

Thanks.

-eric


