[llvm] r200213 - [vectorizer] Teach the loop vectorizer's unroller to only unroll by
Tobias Grosser
tobias at grosser.es
Mon Feb 3 07:55:54 PST 2014
On 01/27/2014 12:12 PM, Chandler Carruth wrote:
> Author: chandlerc
> Date: Mon Jan 27 05:12:24 2014
> New Revision: 200213
>
> URL: http://llvm.org/viewvc/llvm-project?rev=200213&view=rev
> Log:
> [vectorizer] Teach the loop vectorizer's unroller to only unroll by
> powers of two. This is essentially always the correct thing given the
> impact on alignment, scaling factors that can be used in addressing
> modes, etc. Also, fix the management of the unroll vs. small loop cost
> to more accurately model things with this world.
>
> Enhance a test case to actually exercise more of the unroll machinery if
> using synthetic constants rather than a specific target model. Before
> this change, with the added flags this test will unroll 3 times instead
> of either 2 or 4 (the two sensible answers).
>
> While I don't expect this to make a huge difference, if there are lots
> of loops sitting right on the edge of hitting the 'small unroll' factor,
> they might change behavior. However, I've benchmarked moving the small
> loop cost up and down in many various ways and by a huge factor (2x)
> without seeing more than 0.2% code size growth. Small adjustments such
> as the series that led up here have led to about 1% improvement on some
> benchmarks, but it is very close to the noise floor so I mostly checked
> that nothing regressed. Let me know if you see bad behavior on other
> targets but I don't expect this to be a sufficiently dramatic change to
> trigger anything.
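To make the "powers of two" point in the log concrete, here is a minimal sketch of rounding a candidate unroll factor down to a power of two. It is not the vectorizer's actual code: the function names and the `max_unroll` parameter are hypothetical and only illustrate why a candidate of 3 would become 2 under such a rule.

```python
def power_of_two_floor(n: int) -> int:
    """Return the largest power of two <= n, or 0 when n <= 0."""
    if n <= 0:
        return 0
    # bit_length() is the position of the highest set bit, counted from 1.
    return 1 << (n.bit_length() - 1)


def clamp_unroll_factor(candidate: int, max_unroll: int) -> int:
    """Hypothetical helper: cap the candidate, then round down to a power of two.

    The result is never below 1, i.e. "do not unroll".
    """
    capped = min(candidate, max_unroll)
    return max(1, power_of_two_floor(capped))


if __name__ == "__main__":
    # A candidate of 3 is rounded down to 2 rather than kept as 3.
    for candidate in (1, 2, 3, 4, 7):
        print(candidate, "->", clamp_unroll_factor(candidate, max_unroll=8))
```

Rounding down rather than up is an assumption of this sketch; the log only says that 2 and 4 are the two sensible answers.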
Just for info: this change caused the following performance changes on
x86_64. Each number is the median of 10 runs, which should filter out
the noise reliably.
Compile Time
==============
                                     Δ  Previous   Current
fftbench                         2.73%    0.4400    0.4520
stepanov_abstraction             1.86%    0.6440    0.6560
simple_types_constant_folding    1.22%    2.7942    2.8282
loop_unroll                      1.16%    3.9642    4.0103

Execution Time
==============
                                     Δ  Previous   Current
ControlFlow-dbl                  2.08%    4.5283    4.6223
fannkuch                         2.04%    3.2282    3.2942
ControlFlow-flt                  1.67%    4.0723    4.1403
pairlocalalign                   1.24%   25.8096   26.1296

                                     Δ  Previous   Current
gramschmidt                    -12.92%    2.5402    2.2121
gcc-loops                       -7.11%    4.8403    4.4963
siod                            -6.23%    3.0162    2.8282
LinearDependence-flt            -2.39%    4.3603    4.2563
http://llvm.org/perf/db_default/v4/nts/21395?num_comparison_runs=10&test_filter=&test_min_value_filter=&aggregation_fn=median&compare_to=21392&submit=Update
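For anyone reproducing the Δ column: the values are consistent with the relative change (Current - Previous) / Previous. A small sketch, assuming that definition; the function names are mine and not part of the reporting tool.

```python
from statistics import median


def relative_delta(previous: float, current: float) -> float:
    """Relative change from previous to current, in percent."""
    return (current - previous) / previous * 100.0


def median_delta(previous_runs, current_runs) -> float:
    """Compare the medians of two sets of timing samples (e.g. 10 runs each)."""
    return relative_delta(median(previous_runs), median(current_runs))


if __name__ == "__main__":
    # Two rows taken from the tables above (already medians).
    print(f"fftbench    {relative_delta(0.4400, 0.4520):+.2f}%")
    print(f"gramschmidt {relative_delta(2.5402, 2.2121):+.2f}%")
```

With this sign convention a positive Δ means the time went up (a regression) and a negative Δ means it went down (an improvement).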
Cheers,
Tobias