[LLVMdev] Tight overlapping loops and performance
Eli Friedman
eli.friedman at gmail.com
Mon Mar 2 13:41:45 PST 2009
On Mon, Mar 2, 2009 at 11:38 AM, Jonathan Turner <probata at hotmail.com> wrote:
> With gcc -O3 4.2 and 4.4 we match 1.0s. The LLVM, after running it through
> opt -std-compile-opts, is around 1.7s.
Hmm, on my computer, I get around 2.5 seconds with both gcc -O3 and
llvm-gcc -O3 (using llvm-gcc from svn). Not sure what you're doing
differently; I wouldn't be surprised if it's sensitive to the version
of LLVM.
> Should I be looking at any particular optimization passes that aren't in
> -std-compile-opts to match the gcc speeds?
First, try looking at the generated code... the code LLVM generates is
probably not what you're expecting. I'm getting the following for the
main loop:
.LBB1_1:        # loopto
        cmpl    $1, %eax
        leal    -1(%eax), %eax
        cmove   %edx, %eax
        incl    %ecx
        cmpl    $999999999, %ecx
        jne     .LBB1_1 # loopto
LLVM is optimizing your oddly nested loops into a single loop which
does some extra computation to keep track of the timeout variable.
Since you'd normally be doing something non-trivial in the timeout
portion of the loop, the results you're getting with this contrived
testcase are irrelevant to your actual issue.
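To make the assembly easier to read, here is a rough C rendering of what
that single loop computes. This is only a sketch: the mapping of registers
to variables (eax as the timeout countdown, edx as the value it is reloaded
with, ecx as the iteration counter) is my reading of the listing, and the
function name and parameters are made up for illustration.

```c
/* Sketch of the loop at .LBB1_1. Assumed mapping (not confirmed by the
 * original program): eax = timeout, edx = reload, ecx = counter. */
int fused_loop(int timeout, int reload, int counter, int limit)
{
    do {
        /* cmpl $1 / leal -1 / cmove: if the countdown is at 1, reload it;
         * otherwise decrement it. leal does not touch the flags, so the
         * cmove still sees the result of the compare. */
        timeout = (timeout == 1) ? reload : timeout - 1;

        /* incl / cmpl / jne: count iterations until the limit is hit. */
        counter++;
    } while (counter != limit);

    return timeout;
}
```

Note that there is no branch for the timeout at all here, just a conditional
move, which is why the timing of this loop says little about a program that
does real work when the timeout fires.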
In general, you'll probably get better results from LLVM with properly
nested loops; LLVM's loop optimizers don't know how to deal with
overlapping loops. I'd suggest writing it more like the
following:
    int timeout;
    int loopcond;
    do {
        timeoutwork();
        timeout = 2000;  /* restart the countdown (assumed intent) */
        do {
            timeout--;
            loopcond = computationresult();
        } while (loopcond && timeout);
    } while (loopcond);
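
If you want something you can compile and time, a self-contained variant
might look like the sketch below. The bodies of timeoutwork() and
computationresult() are hypothetical stand-ins (a call counter and a
work countdown), run() and its parameters are invented for the example,
and restarting the countdown at the top of the outer loop is an assumption
about the intended behaviour.

```c
/* Hypothetical state for the stand-in functions below. */
static int work_left;      /* units of computation remaining */
static int timeout_calls;  /* how many times the timeout work ran */

/* Stand-in for whatever should happen on each timeout. */
static void timeoutwork(void) { timeout_calls++; }

/* Stand-in for one step of computation; nonzero means "keep going". */
static int computationresult(void) { return --work_left > 0; }

/* Properly nested version: the inner loop does the computation, the
 * outer loop handles the timeout and restarts the countdown. */
int run(int total_work, int period)
{
    int timeout;
    int loopcond;

    work_left = total_work;
    timeout_calls = 0;
    do {
        timeoutwork();
        timeout = period;  /* assumed: countdown restarts each time */
        do {
            timeout--;
            loopcond = computationresult();
        } while (loopcond && timeout);
    } while (loopcond);

    return timeout_calls;
}
```

With these stand-ins, run(5000, 2000) goes through the outer loop three
times, so it returns 3.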
-Eli