[PATCH] D15559: [SCEVExpander] Make findExistingExpansion smarter

Thu Dec 24 02:33:19 PST 2015

chatur01 added a comment.

In http://reviews.llvm.org/D15559#315149, @flyingforyou wrote:

> Current algorithm doesn't care about where does trip count's division come from. 
>  That means, if TripCount is divided by some value in loop's preheader, compiler will give up doing unrolling. (Even if IV's step is one or minus one.)
>
> If IV's step is constant likes one or minus one or multiple of 2, we don't need to generate division for computing trip count.
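
To make the quoted point concrete, here is a minimal sketch of why the step matters. It is not code from the patch or from SCEVExpander: the function names are hypothetical, and I am assuming "multiple of 2" is meant as "power of two". For a loop of the form `for (i = start; i < end; i += step)`, a general step needs a division to compute the trip count, a step of one needs only a subtraction, and a power-of-two step can use a shift.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only (not LLVM APIs).
// All of them model: for (i = start; i < end; i += step) with start <= end,
// and ignore overflow of the intermediate sum for simplicity.

// General positive step: rounding up requires an unsigned division.
uint64_t tripCountGeneral(uint64_t start, uint64_t end, uint64_t step) {
  assert(step != 0 && start <= end);
  return (end - start + step - 1) / step;
}

// Step of one: the trip count is a plain subtraction, no division.
uint64_t tripCountUnitStep(uint64_t start, uint64_t end) {
  assert(start <= end);
  return end - start;
}

// Power-of-two step (step == 1 << log2Step): the division becomes a shift.
uint64_t tripCountPow2Step(uint64_t start, uint64_t end, unsigned log2Step) {
  assert(start <= end && log2Step < 64);
  const uint64_t step = uint64_t{1} << log2Step;
  return (end - start + step - 1) >> log2Step;
}
```

Only the first form contains the kind of expensive operation that the cost check is meant to catch; the other two agree with it wherever they apply.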

I ran some benchmarks against http://reviews.llvm.org/differential/diff/43408/

SPEC regressions on a Cortex-A57 (A64):

  spec.cpu2000.ref.253_perlbmk 	4.71%
  spec.cpu2000.ref.256_bzip2 	2.07%
  spec.cpu2006.ref.433_milc 	1.24%

There's a small improvement in 254_gap:

  spec.cpu2000.ref.254_gap 	-1.76%

(I'm focussing on SPEC because it's less noisy than the benchmarks in test-suite, although I do note a 23% improvement in `lnt.MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan` and a 7.76% regression in `lnt.SingleSource/Benchmarks/Shootout-C++/ackermann`)

SPEC regressions on a Cortex-A57 (T32):

  spec.cpu2006.ref.482_sphinx3 	2.31%

and an improvement in:

  spec.cpu2000.ref.181_mcf 	-2.13%

> I tested this patch on r256132 with test-suite, spec2000,2006, commercial benchmarks. There is no regression on Cortex-A57.

Did you notice these regressions on your hardware? I find it hard to believe that the difference could be attributed entirely to different revisions of this core. I'm testing on this platform: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dto0038a/index.html

I see some small improvements in a commercial benchmark (1-2%), but the SPEC changes are worth mentioning, since I wouldn't write them off as "no regression": these are definite regressions in SPEC. Whether the other improvements in test-suite balance them out in the view of the community is debatable.

http://reviews.llvm.org/D15559