# [PATCH] D27919: [Loop Vectorizer] Interleave vs Gather - in some cases Gather is better.

Matthew Simpson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 19 06:19:09 PST 2017

mssimpso added inline comments.

================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7008-7009
// the scalar version.
if (Legal->isUniformAfterVectorization(I))
VF = 1;

----------------
mssimpso wrote:
> delena wrote:
> > mssimpso wrote:
> > > Hi Elena,
> > >
> > > I had been thinking about the use of isUniformAfterVectorization() here in getInstructionCost(). Wouldn't it now be possible for the set of uniforms to differ between the first collection (before VF selection) and the second collection (after VF selection)? So we would choose a VF based on costs assuming an instruction may or may not be uniform. Then we could later reverse our initial decision about the instruction's uniformity after VF selection, making the total cost on which we based our VF decision inaccurate. Or am I missing something? I haven't yet thought through the implications of this in enough detail to know whether this would matter much or not.
> > About the list of Uniforms. We insert and then remove only GEPs and Induction variables. We do not calculate cost for them anyway. All other Uniform values stay in place. So, the cost is accurate at the end. There is no circular dependency here.
> I don't think this is true in general. We mark an instruction uniform if all its users are uniform. So for example, if we have a uniform GEP whose index is some computation, that computation is also uniform if it's only used by the GEP. I think we have some examples in induction.ll, but something like this:
>
> ```
> %i = phi i64 [ 0, %entry ], [ %i.next, %for.body ]
> %sum = add i64 %i, %x
> %idx = getelementptr inbounds float, float* %a, i64 %sum
> load float, float* %idx, align 4
> ```
>
> The GEP is consecutive, so it will be marked uniform. %sum will also be marked uniform because it's only used by the GEP. If we later decide to scalarize the load, the GEP, the IV, and %sum will all no longer be uniform. So the cost for %sum will have been wrong.
Just a thought - why not recompute and cache the uniforms (and possibly scalars) for each VF we compute costs for? That would avoid any potential logical inconsistencies. I think the compile-time overhead would probably be minimal (and you're already computing these sets twice anyway).
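
To make the suggestion concrete, here is a rough, standalone C++ sketch of a per-VF cache of uniforms. It is a toy model and not LoopVectorize code: ToyInst, ToyUniformCache, the seed callback, and makeExampleCache are all hypothetical names, and the assumption that the load is widened at VF = 4 but scalarized at VF = 8 is made up purely for illustration. It only models the rule described above (an instruction is uniform if all of its users are uniform) and leaves out the induction variable.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an instruction: a name plus the names of its users.
struct ToyInst {
  std::string Name;
  std::vector<std::string> Users;
};

// Hypothetical cache: one set of uniform instruction names per VF.
class ToyUniformCache {
  std::vector<ToyInst> Insts;
  // Assumed query: which instructions are seeded as uniform at a given VF
  // (for example, the GEP of a memory access that is widened at that VF).
  std::function<std::set<std::string>(unsigned)> SeedsForVF;
  std::map<unsigned, std::set<std::string>> Cache;

public:
  ToyUniformCache(std::vector<ToyInst> Insts,
                  std::function<std::set<std::string>(unsigned)> Seeds)
      : Insts(std::move(Insts)), SeedsForVF(std::move(Seeds)) {}

  // Compute the uniforms for VF once, then reuse the cached set.
  const std::set<std::string> &uniformsFor(unsigned VF) {
    assert(VF >= 1 && "VF must be at least 1");
    auto It = Cache.find(VF);
    if (It != Cache.end())
      return It->second;

    std::set<std::string> Uniforms = SeedsForVF(VF);
    // Iterate to a fixed point: an instruction with at least one user
    // becomes uniform when every one of its users is already uniform.
    bool Changed = true;
    while (Changed) {
      Changed = false;
      for (const ToyInst &I : Insts) {
        if (I.Users.empty() || Uniforms.count(I.Name))
          continue;
        bool AllUsersUniform = true;
        for (const std::string &U : I.Users) {
          if (!Uniforms.count(U)) {
            AllUsersUniform = false;
            break;
          }
        }
        if (AllUsersUniform) {
          Uniforms.insert(I.Name);
          Changed = true;
        }
      }
    }
    return Cache.emplace(VF, std::move(Uniforms)).first->second;
  }

  bool isUniformAt(const std::string &Name, unsigned VF) {
    return uniformsFor(VF).count(Name) != 0;
  }
};

// Models %sum -> %idx -> load from the IR above (IV omitted).
ToyUniformCache makeExampleCache() {
  std::vector<ToyInst> Insts = {
      {"sum", {"idx"}}, {"idx", {"load"}}, {"load", {}}};
  // Illustration only: the GEP is seeded uniform at VF = 4 (load widened)
  // and not seeded at any other VF (load scalarized).
  auto Seeds = [](unsigned VF) {
    return VF == 4 ? std::set<std::string>{"idx"} : std::set<std::string>{};
  };
  return ToyUniformCache(std::move(Insts), Seeds);
}
```

With this model, a cost query for %sum would ask isUniformAt("sum", VF) with the VF being costed, so the answer for VF = 4 cannot leak into the cost computed for VF = 8. Each VF's set is computed at most once, which is why I'd expect the overhead to stay small.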

Repository:
rL LLVM

https://reviews.llvm.org/D27919
