[LLVMdev] [llvm] r184698 - Add a flag to defer vectorization into a phase after the inliner and its

Mon Jun 24 14:25:56 PDT 2013

----- Original Message -----
> 
> 
> On Mon, Jun 24, 2013 at 12:32 PM, Hal Finkel < hfinkel at anl.gov >
> wrote:
> 
> > In
> > < http://llvm.org/viewvc/llvm-project?view=revision&revision=184698
> > >
> > Chandler introduced a flag so that we can run the vectorizer after
> > all CG passes. This would prevent the inliner from seeing the
> > vectorized code.
> 
> There are obviously several issues here, and they seem only loosely
> related. Regarding this first one, why is the right answer not to
> adjust (or improve) the inlining heuristic? I understand that this
> is not easy, but the fact remains that, in the end, having the loop
> inlined, even with the extra vectorization checks, is what should be
> happening (or is the performance still worse than the non-vectorized
> code?). If we really feel that we can't adjust the current heuristic
> without breaking other things, then we could add some metadata to
> make the cost estimator ignore the vector loop preheader, but I'd
> prefer adjusting the inlining thresholds, etc. The commit message
> for r184698 said that the flag was for experimentation purposes, and
> I think that's fine, but this should not be the solution unless it
> really produces better non-inlined code as well.
> 
> If all we are doing is inlining first in order to tweak the cost
> model for inlining, then I agree with everything you say... but I'm
> not at all sure that's the end result.
> 
> 
> After inlining, the control flow and even loop structure may look
> substantially different than they do before inlining, so the
> vectorizer may make a substantially different decision about whether
> or not to vectorize a particular loop. As one (somewhat contrived)
> example, if the loop body is inlined into a cold region of the
> enclosing function, the vectorizer might be able to prioritize code
> size over performance and skip vectorization.

Unfortunately, that difference can also be a problem. Loops that the vectorizer could understand and vectorize prior to inlining can become loops that the vectorizer cannot understand after inlining (and, as things currently stand, it does not take much: IIRC, a preheader is still required). In the future, I imagine that things will get better (but they could also get worse if we do loop fusion, for example).
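
To make the concern concrete, here is a rough sketch of the kind of situation I have in mind. The names scale_add and process_rows are hypothetical and are not taken from any real benchmark or from the LLVM tree. In isolation, scale_add is a single counted loop over contiguous data. Once it is inlined into process_rows, the same loop is the inner loop of a nest whose outer loop has a data-dependent early exit, so the vectorizer is analyzing different control flow than it would have seen before inlining. Whether any particular version of the vectorizer handles either form is not something this sketch claims.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical callee: a simple counted loop over contiguous data. Seen on
// its own (before inlining), this is a single loop with one exit.
void scale_add(float *dst, const float *src, float k, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    dst[i] += k * src[i];
}

// Hypothetical caller: after scale_add is inlined, its loop becomes the inner
// loop of a nest, and the outer loop has a data-dependent early exit.
// Returns the number of rows that were processed before stopping.
std::size_t process_rows(std::vector<float> &dst, const std::vector<float> &src,
                         std::size_t rows, std::size_t cols, float limit) {
  if (rows == 0 || cols == 0)
    return 0;
  assert(dst.size() >= rows * cols && src.size() >= rows * cols);

  std::size_t done = 0;
  for (std::size_t r = 0; r < rows; ++r) {
    // Stop at the first row whose leading element exceeds the limit.
    if (src[r * cols] > limit)
      break;
    scale_add(&dst[r * cols], &src[r * cols], 2.0f, cols);
    ++done;
  }
  return done;
}
```

Comparing the vectorizer's behavior on scale_add alone against its behavior on process_rows after inlining would show which phase ordering does better for a case like this.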

> 
> 
> I think the fundamental problem is that we can't un-vectorize, and
> the decision (and strategy) to vectorize is based on a cost model,
> so the later we do it the more information we can use in the cost
> model to make the correct decision.

I agree.

Thanks again,
Hal

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory