[llvm-commits] [PATCH] Rotate CallInst operands -- HEADS UP

Wed Apr 21 11:09:06 PDT 2010

On Apr 15, 14:20, Gabor Greif <ggr... at gmail.com> wrote:
> On Apr 13, 6:18 pm, Daniel Dunbar <dan... at zuster.org> wrote:
>
> > Hi Gabor,
>
> > What is the measured performance impact of this change?
>
> >  - Daniel
>
> Hi Daniel,
>
> I have constructed a testcase with 26000 callsites, and the
> "opt -inline" times are unchanged (still trying to get gnuplot to
> visualize). Anyway, it gets lost in the noise.
> I did not expect much more, since the win per callsite is in the
> 50-100 instructions range, which means 26,000,000 instructions on a
> modern processor, putting it in the 10 ms range.

Okay, I finally got around to doing my homework. My previous
testcase had no callsite that opt could inline. Now I have
added a static forwarding function with 5 arguments.

This is the result:
--------------------------------------------
#include <iostream>
#include <cstdio>

template <unsigned N>
struct fib;

static void myfprintf(FILE* f, const char* forma, const char* a,
                      const char* b, int c);

template <>
struct fib<0>
{
 unsigned long is(void)
 {
   myfprintf(stderr, "In %s at %s:%d\n", __FUNCTION__, __FILE__,
             __LINE__);
   return 0;
 }
};

template <>
struct fib<1> : fib<0>
{
 unsigned long is(void)
 {
   myfprintf(stderr, "In %s at %s:%d\n", __FUNCTION__, __FILE__,
             __LINE__);
   return fib<0>::is() + 1;
 }
 static unsigned long (fib::*isP)(void);
};

template <unsigned N>
struct fib
{
 unsigned long is(void)
 {
   myfprintf(stderr, "In %s at %s:%d\n", __FUNCTION__, __FILE__,
             __LINE__);
   return fib<N - 2>().is() + fib<N - 1>().is();
 }
};

int main(void)
{
 const unsigned N(12999);
 std::cout << "fib(" << N << ") is " << fib<N>().is() << "\n";
}

static void myfprintf(FILE* f, const char* forma, const char* a,
                      const char* b, int c)
{
 std::fprintf(f, forma, a, b, c);
}
--------------------------------------------

I compiled this with clang++ -O0, then ran a release-asserts opt,
with and without the described patch, on the resulting bitcode with
the -inline option, suppressing the verifier and the output.

For each of the two builds I made 40 measurements on Linux x86-64 and
visualized the timings reported by -time-passes.

The graphs linked below (x: user time, y: wall time) cover
CFG construction and inlining.
Both show a 2-3% improvement.

http://idisk.mac.com/gabor/Public/inl.png
http://idisk.mac.com/gabor/Public/cfg.png
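
For anyone who would rather have a number than a plot, here is a minimal
sketch of how two sets of timing samples could be reduced to a single
relative-improvement figure. The helper names mean and improvementPercent
are hypothetical, and comparing plain arithmetic means is an assumption
of this sketch, not necessarily how the graphs above were produced.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Arithmetic mean of a set of timing samples (in seconds).
double mean(const std::vector<double>& samples) {
  assert(!samples.empty() && "need at least one sample");
  return std::accumulate(samples.begin(), samples.end(), 0.0) /
         static_cast<double>(samples.size());
}

// Relative improvement of `patched` over `baseline`, in percent.
// A positive value means the patched build was faster on average.
double improvementPercent(const std::vector<double>& baseline,
                          const std::vector<double>& patched) {
  const double base = mean(baseline);
  return 100.0 * (base - mean(patched)) / base;
}
```

With made-up samples (not measurements) of {2.0, 2.0} seconds for the
baseline and {1.9, 2.0} seconds for the patched build, improvementPercent
comes out at roughly 2.5.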

Cheers,

    Gabor

PS: I am not implying that this is a scientifically correct test, but
it _does_ show an effect.

>
> However, the stripped binaries get 4k smaller for opt, clang, lli, etc.
> on x86_64 Linux.
>
> Cheers,
>
>     Gabor