[PATCH] D81416: [LV] Interleave to expose ILP for small loops with scalar reductions.

Aaron H Liu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 17 13:48:18 PDT 2020


AaronLiu added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:253
+    "interleave-small-loop-scalar-reduction", cl::init(false), cl::Hidden,
+    cl::desc("Enable interleaving for small loops with scalar reductions "
+             "to expose ILP."));
----------------
nikic wrote:
> dmgreen wrote:
> > xbolva00 wrote:
> > > bmahjour wrote:
> > > > fhahn wrote:
> > > > > xbolva00 wrote:
> > > > > > Turn on by default? 
> > > > > > 
> > > > > > If you ran some benchmarks and no regressions, I see no reason why this should be off by default.
> > > > > It would be good to at least give some details on the benchmarks run. Ideally they would include MultiSource & various versions of SPEC on X86 and ideally also other platforms.
> > > > The current measurements are done on IBM Power. It would be good if someone with access to other types of performance machines could help measure the impact of this change on other platforms. If not, can we leave the default enablement to a future patch?
> > > > 
> > > > In general, how is performance testing on multiple platforms performed by the community prior to enabling a feature?
> > > @dmgreen arm
> > > @nikic x86?
> > This sounds like unrolling to me. But with pointer runtime checks to allow extra ILP?
> > 
> > Something like that would usually be a target decision, in the unroller controlled by getUnrollingPreferences or for the vectorizer controlled by other calls like enableAggressiveInterleaving. Targets can then opt in to the feature if they expect to find it useful.
> > 
> > If it is expected to be more universally applicable then you can try and just enable it and see if people report regressions. But some X86 benchmarks using the llvm testsuite would probably be prudent first.
> > 
> > The (sub)target I run on most (MVE) will not enable interleaving nor AggressiveInterleaving, so probably isn't very helpful for performance numbers.
> @xbolva00 I don't have any run-time numbers, I only check compile-time (there's no impact there at least).
Can someone help to test this patch on other platforms? Thanks!
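
For anyone measuring on another target, here is a minimal source-level sketch of the kind of loop the option is aimed at. It is a hand-written illustration, not the vectorizer's actual output: the function names sumScalar and sumInterleaved2 are hypothetical, and the interleave count of 2 is an assumption chosen for brevity. The first function carries a single accumulator, so each add waits on the previous one; the second splits the reduction across two independent accumulators, which is the ILP that interleaving is meant to expose.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Baseline scalar reduction: one accumulator, so every add depends on
// the result of the previous iteration (a serial dependency chain).
std::int64_t sumScalar(const std::vector<std::int32_t> &A) {
  std::int64_t Sum = 0;
  for (std::size_t I = 0; I < A.size(); ++I)
    Sum += A[I];
  return Sum;
}

// Hand-interleaved by 2 (illustrative only): two independent partial
// sums, so the two adds in each iteration do not depend on each other.
std::int64_t sumInterleaved2(const std::vector<std::int32_t> &A) {
  std::int64_t Sum0 = 0, Sum1 = 0;
  const std::size_t N = A.size();
  std::size_t I = 0;
  for (; I + 1 < N; I += 2) {
    Sum0 += A[I];
    Sum1 += A[I + 1];
  }
  // Remainder iteration when N is odd.
  for (; I < N; ++I)
    Sum0 += A[I];
  // Combine the partial sums after the loop.
  return Sum0 + Sum1;
}
```

Integer addition is used so that both versions return identical results. Since the option in this patch is a hidden cl::opt that defaults to false, it would presumably be passed through the driver as -mllvm -interleave-small-loop-scalar-reduction=true when comparing run times with and without it.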




CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81416/new/

https://reviews.llvm.org/D81416


