[LLVMdev] Performance regression on ARM
meheff at google.com
Fri Oct 17 10:53:28 PDT 2014
I submitted r219517 (not in the suspect range) which improved SCEV by
enabling computation of trip counts if the loop has multiple exits. One of
the things this enables is loop unrolling for loops with multiple exits.
Then Chandler submitted r219550 and r219562 (in the suspect range) to fix
some latent bugs that my change exposed (thanks, btw). The regressing
benchmark *does* have a loop with multiple exits (the core loop of the
benchmark). However, it's not unrolled with -O3 because the trip count of
the loop latch is not computable at compile time. Nor is it vectorizable
(the vectorizer uses trip count computation). So, I'd speculate the reason
for the regression lies elsewhere (complex arithmetic?).
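
To make the distinction concrete, here is a minimal sketch of two
multiple-exit loops. Both functions are hypothetical illustrations; the
first is only in the spirit of an escape-time kernel and is not the actual
mandel-2 source. In the first loop the latch exit depends on a runtime
bound, so its trip count cannot be computed at compile time. In the second
the latch exit has a constant bound even though the early exit is
data-dependent. Whether a particular compiler revision unrolls either loop
is not something this sketch asserts.

```c
/* Hypothetical escape-time kernel (assumption: NOT the mandel-2 source).
 * Two exits:
 *   - the latch test n < max_iter, whose count depends on a runtime value
 *   - a data-dependent early break once |z|^2 exceeds 4
 * Returns the number of completed iterations before leaving the loop. */
static int escape_iters(double cr, double ci, int max_iter)
{
    double zr = 0.0, zi = 0.0;
    int n;
    for (n = 0; n < max_iter; ++n) {
        double t = zr * zr - zi * zi + cr;  /* real part of z*z + c */
        zi = 2.0 * zr * zi + ci;            /* imaginary part of z*z + c */
        zr = t;
        if (zr * zr + zi * zi > 4.0)
            break;                          /* early exit */
    }
    return n;
}

/* Hypothetical search over exactly 8 elements.
 * Two exits again, but the latch test i < 8 has a constant bound,
 * so the latch exit count is known at compile time.
 * Returns the index of key, or 8 if key is absent. */
static int find_first8(const int *a, int key)
{
    int i;
    for (i = 0; i < 8; ++i) {
        if (a[i] == key)
            break;                          /* early exit */
    }
    return i;
}
```

For example, escape_iters(0.0, 0.0, 50) runs all 50 iterations because z
stays at zero, while escape_iters(2.0, 2.0, 50) leaves through the early
exit during the first iteration.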
On Fri, Oct 17, 2014 at 7:51 AM, Anton Korobeynikov <anton at korobeynikov.info> wrote:
> > Chandler's complex arithmetic changes are also in the range: r219557 in
> > clang. We saw it change the code in mandel-2 significantly.
> mandel-2 is broken on hard FP ABI systems, btw. The reason is simple:
> we're emitting a call to __muldc3 with the AAPCS VFP calling convention,
> but the function expects the softfp (AAPCS) calling convention and reads
> garbage from GP registers.
> I'm working on a fix.
> With best regards, Anton Korobeynikov
> Faculty of Mathematics and Mechanics, Saint Petersburg State University