[PATCH] D31965: [SLP] Enable 64-bit wide vectorization for Cyclone

Adam Nemet via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 11 17:28:55 PDT 2017


anemet created this revision.
Herald added subscribers: mzolotukhin, aemerson.

ARM Neon has native support for half-sized vector registers (64 bits).  This
is beneficial, for example, for 2D and 3D graphics.  This patch adds a TTI hook
that lets a target lower MinVecRegSize from 128 in the SLP Vectorizer.
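
To make the effect of a lower MinVecRegSize concrete, here is a minimal,
standalone sketch.  It is not the code from the patch: TargetInfoSketch,
pickMinVecRegSize and minVF are hypothetical names chosen for illustration, and
the rule that the minimum vectorization factor is the register width divided by
the element width (but at least 2) is a simplifying assumption about how the
vectorizer uses the value.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the target query; not the real TTI interface.
struct TargetInfoSketch {
  unsigned MinVectorRegisterBitWidth; // 128 by default, 64 for a half-width target
  unsigned getMinVectorRegisterBitWidth() const {
    return MinVectorRegisterBitWidth;
  }
};

// Assumed policy: an explicit user override wins, otherwise ask the target.
unsigned pickMinVecRegSize(const TargetInfoSketch &TTI, bool HasOverride,
                           unsigned OverrideBits) {
  return HasOverride ? OverrideBits : TTI.getMinVectorRegisterBitWidth();
}

// Smallest group of scalars worth bundling: enough elements to fill the
// minimum register width, and never fewer than two.
unsigned minVF(unsigned MinVecRegSize, unsigned EltBits) {
  assert(EltBits > 0 && "element width must be nonzero");
  return std::max(2u, MinVecRegSize / EltBits);
}
```

Under these assumptions, a target reporting 128 needs four adjacent floats
before a bundle is considered, while a target reporting 64 can already bundle
two, which is the case that matters for 2D vectors.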

- Performance Analysis

This change was motivated by some internal benchmarks but it is also
beneficial on SPEC and the LLVM testsuite.

The results are with -O3 and PGO.  A negative percentage is an improvement.
The testsuite was run with a sample size of 4.

- SPEC

- CFP2006/482.sphinx3  -3.34%

A fairly hot loop is now SLP vectorized, which reduces the instruction count.
This used to be a +22% regression before https://reviews.llvm.org/rL299482.

- CFP2000/177.mesa     -3.34%
- CINT2000/256.bzip2   +6.97%

My current plan is to extend the fix in https://reviews.llvm.org/rL299482 to i16, which brings the
regression down to +2.5%.  There are also other problems with the codegen in
this loop, so there is further room for improvement.

- LLVM testsuite

- SingleSource/Benchmarks/Misc/ReedSolomon               -10.75%

There are multiple small SLP vectorizations outside the hot code.  It's a bit
surprising that it adds up to 10%.  Some of this may be code-layout noise.

- MultiSource/Benchmarks/VersaBench/beamformer/beamformer -8.40%

The opt-viewer screenshot can be seen at https://reviews.llvm.org/F3218284.  We start at a colder store,
but the tree leads us into the hottest loop.

- MultiSource/Applications/lambda-0.1.3/lambda            -2.68%
- MultiSource/Benchmarks/Bullet/bullet                    -2.18%

This is using 3D vectors.

- SingleSource/Benchmarks/Shootout-C++/Shootout-C++-lists +6.67%

Noise, binary is unchanged.

- MultiSource/Benchmarks/Ptrdist/anagram/anagram          +4.90%

There is an additional SLP vectorization in cold code.  The test runs for
~1 sec and prints over 2000 lines, so this is most likely noise.

- MultiSource/Applications/aha/aha                        +1.63%
- MultiSource/Applications/JM/lencod/lencod               +1.41%
- SingleSource/Benchmarks/Misc/richards_benchmark         +1.15%


https://reviews.llvm.org/D31965

Files:
  include/llvm/Analysis/TargetTransformInfo.h
  include/llvm/Analysis/TargetTransformInfoImpl.h
  lib/Analysis/TargetTransformInfo.cpp
  lib/Target/AArch64/AArch64Subtarget.cpp
  lib/Target/AArch64/AArch64Subtarget.h
  lib/Target/AArch64/AArch64TargetTransformInfo.h
  lib/Transforms/Vectorize/SLPVectorizer.cpp
  test/Transforms/SLPVectorizer/AArch64/64-bit-vector.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D31965.94912.patch
Type: text/x-patch
Size: 6187 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170412/592fe16d/attachment.bin>

