[llvm-dev] IC value with reduction on LoopVectorizationCostModel::selectInterleaveCount
Adhemerval Zanella via llvm-dev
llvm-dev at lists.llvm.org
Tue Oct 2 11:00:31 PDT 2018
Hi all,
I am trying to understand on what basis the loop vectorizer's selection of the
Interleave Count (IC) for loops with reductions was added:
lib/Transforms/Vectorize/LoopVectorize.cpp:
  // Interleave if we vectorized this loop and there is a reduction that could
  // benefit from interleaving.
  if (VF > 1 && !Legal->getReductionVars()->empty()) {
    LLVM_DEBUG(dbgs() << "LV: Interleaving because of reductions.\n");
    return IC;
  }
The IC in this context will be within [1, MaxInterleaveCount], and
MaxInterleaveCount is set based on target defaults. The issue is with
unbounded loops (where the trip count can't be inferred) for which
vectorization is beneficial even for small element counts: the loop vectorizer
will use the architecture-defined IC, and if that IC is 2 or higher, the
vectorized code path won't be used for element counts less than IC*VF.
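The IC*VF threshold can be sketched as a small model. This is not LLVM code: takesVectorPath is a hypothetical helper, and it assumes the guard in front of the vector loop simply compares the runtime trip count against VF * IC, with no scalar epilogue required.

```cpp
#include <cstdint>

// Hypothetical helper, not part of LLVM: models the minimum-iteration guard
// that sends short trip counts to the scalar loop.
// Assumption: no scalar epilogue is required, so a trip count equal to
// VF * IC is enough to run the vector body once.
constexpr bool takesVectorPath(std::uint64_t TripCount, unsigned VF,
                               unsigned IC) {
  // One iteration of the interleaved vector body consumes VF * IC elements.
  return TripCount >= static_cast<std::uint64_t>(VF) * IC;
}

// With VF=4 and IC=2, counts 4..7 never reach the vector body.
static_assert(!takesVectorPath(4, 4, 2), "count 4 stays scalar with IC=2");
static_assert(!takesVectorPath(7, 4, 2), "count 7 stays scalar with IC=2");
static_assert(takesVectorPath(8, 4, 2), "count 8 reaches the vector body");
// With IC=1 the same counts would already be vectorized.
static_assert(takesVectorPath(4, 4, 1), "count 4 is vectorized with IC=1");
```

Under these assumptions, raising IC from 1 to 2 doubles the smallest trip count that benefits from vectorization.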
For instance, take this code snippet:
---
#include <float.h>

struct vec {
  float x, y;
};

struct polyshape {
  int count;
  struct vec v[8];
};

float foo (const polyshape *poly, float x1, float x2, float y1, float y2)
{
  float si = FLT_MAX;
  for (int j=0; j<poly->count; j++)
  {
    float sij = x1 * (poly->v[j].x + x2) + y1 * (poly->v[j].y - y2);
    if (sij < si)
      si = sij;
  }
  return si;
}
---
When building for aarch64-linux-gnu (which has a default IC of 2), the
loop-vectorizer debug output shows:
LV: Interleaving because of reductions.
LV: Found a vectorizable loop (4) in test.cc
LV: Interleave Count is 2
Setting best plan to VF=4, UF=2
LV: Interleaving disabled by the pass manager
And then the vectorized code path won't be used for poly->count between 4
and 8, although a single VF=4 vector iteration could already cover those counts.
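For experimenting with a lower IC on this loop, one possible approach is Clang's loop hint pragma. The sketch below is only an illustration: foo_ic1 is a hypothetical name, the hint values are assumptions chosen to match the VF=4 case above, and whether the resulting code is actually faster for small counts has not been measured here.

```cpp
#include <cfloat>

struct vec {
  float x, y;
};

struct polyshape {
  int count;
  vec v[8];
};

// Hypothetical variant of foo that asks for VF=4 without interleaving.
float foo_ic1 (const polyshape *poly, float x1, float x2, float y1, float y2)
{
  float si = FLT_MAX;
  // Clang loop hint: request a vector width of 4 and an interleave count of 1.
  // Compilers other than Clang ignore the pragma (possibly with a warning).
#pragma clang loop vectorize_width(4) interleave_count(1)
  for (int j = 0; j < poly->count; j++)
  {
    float sij = x1 * (poly->v[j].x + x2) + y1 * (poly->v[j].y - y2);
    if (sij < si)
      si = sij;
  }
  return si;
}
```

The hint only changes how the loop is compiled; the function computes the same minimum as foo.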
Is this optimization for the reduction case indeed beneficial in all cases?
Can't the rest of LoopVectorizationCostModel::selectInterleaveCount infer a
better IC for reduction cases?