[LLVMdev] Memory optimizations for LLVM JIT

王振江 cryst216 at gmail.com
Wed Aug 21 02:16:29 PDT 2013


Thank you very much for your explanations and suggestions.
I'm sorry that I provided some wrong information last time: llc is
(probably) not able to optimize such code either.

I tried a few more things based on the suggestions. Here are the results
(using the same core code shown in the last email):
1. compile to object file (clang -O3 -c test.c)
   : good code quality
2. compile to bitcode file (clang -O3 -c test.c -emit-llvm)
   : good
3. compile to bitcode file (clang -O0 -c test.c -emit-llvm)
   : bad, similar IR to what I wrote manually
4. run opt on the test.bc file from step 3 (opt -O3 test.bc)
   : bad
5. compile to assembly, from test.bc in step 3 (llc -O3 test.bc)
   : bad
6. generate IR creation source, from test.bc in step 3 (llc -O3 -march=cpp test.bc)
   : bad, similar IR to what I wrote manually
7. JIT or MCJIT the source from step 6 (modify and call jit/mcjit)
   : bad

In short, once the source is converted to bad bitcode (or the equivalent IR
creation code), I cannot optimize it back to -O3 quality.
What could be the reason? Does the bitcode file lose some high-level
information, so that certain optimizations are limited?
If so, is it possible to reconstruct some naive metadata to enable such
optimizations? (Just for this piece of code, as it is the most important
scenario in my project.)

Any help would be appreciated.


The source of test.c
---------------------------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>

struct S {
    long long a[10];
} *p;

void foo () {
        p->a[2] = p->a[1];
        p->a[3] = p->a[1];
        p->a[4] = p->a[2];
        p->a[5] = p->a[4];
}

int main() {
        p = (struct S*) malloc(sizeof(struct S));
        p->a[1] = rand();
        foo();
        printf("%lld\n", p->a[5]);
        return 0;
}
---------------------------------------------------------------------------------------
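
For reference, here is a minimal sketch of the shape one would hope the
optimizer reaches for foo(): the global pointer is read once, a[1] is read
once, and the remaining work is four stores. It is hand-written C, not
compiler output, and it is only equivalent to foo() under the assumption that
no store through the pointer can modify the pointer itself. The names
p_sketch and foo_cached are hypothetical and do not appear in test.c.

```c
#include <assert.h>
#include <stddef.h>

struct S {
    long long a[10];
};

/* Stand-in for the global `p` in test.c (hypothetical name). */
struct S *p_sketch;

/* Hypothetical variant of foo(): copy the global pointer into a local
   first, so that no later store can force the pointer to be reloaded. */
void foo_cached(void) {
    struct S *q = p_sketch;   /* single load of the global pointer */
    assert(q != NULL);
    long long v = q->a[1];    /* single load of a[1] */
    q->a[2] = v;
    q->a[3] = v;
    q->a[4] = v;              /* a[2] was just set to v */
    q->a[5] = v;              /* a[4] was just set to v */
}
```

Because q is a local whose address is never taken, the optimizer needs no
type-based alias information to see that the stores cannot change it.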


2013/8/21 Richard Osborne <richard at xmos.com>

>  On 20 Aug 2013, at 08:23, 王振江 <cryst216 at gmail.com> wrote:
>
>
>  A GlobalValue was declared and mapped to the variable p.
> Some LLVM IR instructions were created according to those generated by
> LLVM from source.
> I.e., load p, load a[1] based on p, load p again, store a[2] based on p,
> etc.
> The machine code turned out to be slightly optimized, as shown on the left.
>
>
> I suspect this is due to possible aliasing. If p somehow pointed to itself
> then the store p->a[x] might change the value of p, so p must be reloaded
> each time. Clang will emit TBAA metadata nodes (
> http://llvm.org/docs/LangRef.html#tbaa-metadata) that let the optimizers
> know the load of p can't alias the stores through p since they have
> different high-level types. Without the TBAA metadata the optimizers must
> be conservative.
>
>
>  Things got better after the GlobalVariable of p was set as a
> constant.
> Redundant loads of p (lines 5, 8 and 11) were removed, and so was line 12
> because of line 10.
>
>
> This makes sense - if p is constant no store can possibly change the value
> of p so it doesn't need to be reloaded.
>
>  However, I could not improve it any further, although optimal machine
> code needs only the lines marked with '*'.
>
> This is strange. I'm not sure what is going on here - assuming you
> are running the same passes I'd expect no difference here.
>
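
A C-level analogue of marking the GlobalVariable for p as constant, as
discussed in the quoted text, might look like the sketch below. It assumes the
pointer can be bound to fixed storage at definition time, which is not the
case in test.c, where p is assigned from malloc. The names storage, p_fixed
and foo_const are hypothetical.

```c
#include <assert.h>

struct S {
    long long a[10];
};

/* Hypothetical fixed storage standing in for the malloc'ed object. */
static struct S storage;

/* The pointer itself is const: no store anywhere may legally change it,
   so its value never has to be reloaded between the stores below. */
struct S *const p_fixed = &storage;

/* Same body as foo() in test.c, but going through the const pointer. */
void foo_const(void) {
    p_fixed->a[2] = p_fixed->a[1];
    p_fixed->a[3] = p_fixed->a[1];
    p_fixed->a[4] = p_fixed->a[2];
    p_fixed->a[5] = p_fixed->a[4];
}
```

This only removes the reloads of the pointer. Whether the loads of a[1],
a[2] and a[4] are also forwarded still depends on the passes that are run.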