[llvm-commits] Global Merge Pass for ARM
Eric Christopher
echristo at apple.com
Sun May 16 18:57:37 PDT 2010
>
> Please find the patch which can be viewed as some early approximation of
> "section anchors" feature seen in gcc.
>
> It tries to solve the following problem: consider code that touches
> several global variables at once, e.g.:
>
> <=cut=>
> static int foo[N], bar[N], baz[N];
>
> for (i = 0; i < N; ++i) {
>   foo[i] = bar[i] * baz[i];
> }
> <=cut=>
>
> On ARM the addresses of all 3 arrays must be kept in registers, so
> this code has quite high register pressure (loop body):
>
> ldr r1, [r5], #4
> ldr r2, [r6], #4
> mul r1, r2, r1
> str r1, [r0], #4
>
> The pass converts the code to something like:
>
> <=cut=>
> static struct {
>   int foo[N];
>   int bar[N];
>   int baz[N];
> } merged;
>
> for (i = 0; i < N; ++i) {
>   merged.foo[i] = merged.bar[i] * merged.baz[i];
> }
> <=cut=>
>
> and in ARM code this becomes:
> ldr r0, [r5, #40]
> ldr r1, [r5, #80]
> mul r0, r1, r0
> str r0, [r5], #4
>
> Note that we saved 2 registers here.
>
This is pretty cool and simple so far. What's the benchmarking look like?

> This way only the address of the merged structure needs to be kept in
> a register. Field accesses use ldr/str with offsets.
> The pass correctly distinguishes constant and non-constant globals. The
> maximum size of the struct depends on the instruction set used (it's
> 4095 for ARM/Thumb2 and 127 for Thumb1).
>
> Maybe PPC can benefit from this pass as well, but I'm not yet sure.
>
> Ok to commit?

At the very least, do you mind including the above writeup in a big block comment
on the pass? I make no claims about the rest of it, but the pass is pretty
sparse on documentation and much easier to read with the example there.
Thanks.
-eric
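
For reference, the layout in the quoted example can be sketched in plain C. This is only an illustration, not code from the patch. The struct name `merged_globals`, the helper names, and `N = 10` are assumptions, with `N` chosen so that 4-byte elements reproduce the `#40` and `#80` offsets in the quoted assembly. `int32_t` stands in for `int` to pin the element size, and the two limits are the figures quoted in the patch description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N 10 /* assumption: 10 elements of 4 bytes give offsets 40 and 80 */

/* Hypothetical source-level equivalent of the merged globals. */
struct merged_globals {
    int32_t foo[N];
    int32_t bar[N];
    int32_t baz[N];
};

static struct merged_globals merged;

/* Size limits as quoted in the patch description. */
enum { MAX_SIZE_ARM_THUMB2 = 4095, MAX_SIZE_THUMB1 = 127 };

/* Returns 1 if a merged struct of `size` bytes stays within `limit`. */
static int fits_limit(size_t size, size_t limit) {
    return size <= limit;
}

/* The loop from the example: every access is relative to &merged,
 * so a single base address is enough. */
static void multiply_arrays(void) {
    for (int i = 0; i < N; ++i) {
        merged.foo[i] = merged.bar[i] * merged.baz[i];
    }
}
```

With these assumptions, `offsetof(struct merged_globals, bar)` is 40 and `offsetof(struct merged_globals, baz)` is 80, and the 120-byte struct fits under both quoted limits.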