[PATCH] Unrolling improvements (target indep. and for x86)

Fri Feb 21 23:19:00 PST 2014

----- Original Message -----
> From: "Chandler Carruth" <chandlerc at google.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "llvm-commits" <llvm-commits at cs.uiuc.edu>, "Nadav Rotem" <nrotem at apple.com>, "Diego Novillo"
> <dnovillo at google.com>
> Sent: Saturday, February 22, 2014 1:05:17 AM
> Subject: Re: [PATCH] Unrolling improvements (target indep. and for x86)
> 
> 
> 
> 
> 
> On Fri, Feb 21, 2014 at 10:45 PM, Hal Finkel < hfinkel at anl.gov >
> wrote:
> 
> 
> Chandler pointed out to me last week that recent x86 cores can also
> benefit from partial unrolling because of how their uop buffers and
> loop-stream detectors work (both Intel and AMD chips are similar in
> this regard).
> I just want to add a specific point of realization that occurred to
> me when we were discussing this, and influenced my feeling that we
> should look into using the partial unroller *in addition* to the
> loop vectorizer's unrolling.
> 
> 
> The latter is, rightfully, about widening the loop. It exposes ILP
> and other benefits. It is *not*, however, suitable to one thing
> which it is currently being used for: unrolling *purely* to hide the
> branch cost and/or properly fill the LSD or uop cache. For these
> purposes, restricting the unrolling to that which can be done in an
> *interleaved* fashion isn't always reasonable. Instead, we should
> also support doing this through concatenation.
> 
> 
> My general feeling is that we should essentially use the same
> size-upper-bound metric in both the vectorizer's unroller and this
> one, and unroll through interleaving as much as we can (subject to
> the independence of the iterations), and then continue unrolling
> with concatenation until we saturate whatever buffer size the
> target wants.

Currently, the vectorizer unrolls only for ILP (or latency hiding), subject to its register-pressure estimate; having it also unroll for size would change the current behavior. Is that what you want?
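
To make the distinction in the quoted text concrete, here is a small hand-written C++ sketch of the two unrolling styles. It is only an illustration at the source level, not what either pass emits: the function names and the unroll factor of 2 are made up for the example. The reduction has independent iterations, so its body copies can be interleaved with separate accumulators; the recurrence carries a dependence from one iteration to the next, so its body copies can only be concatenated, which still halves the number of loop-back branches.

```cpp
#include <cstddef>
#include <vector>

// Baseline reduction: one loop-back branch per element.
long sum_rolled(const std::vector<long>& v) {
    long s = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        s += v[i];
    return s;
}

// Interleaved unrolling by 2 (hypothetical factor): the two body copies
// use separate accumulators, so they do not depend on each other and
// expose ILP. This is only legal because the iterations are independent.
long sum_interleaved(const std::vector<long>& v) {
    long s0 = 0, s1 = 0;
    std::size_t i = 0, n = v.size();
    for (; i + 1 < n; i += 2) {
        s0 += v[i];
        s1 += v[i + 1];
    }
    if (i < n)          // remainder iteration when n is odd
        s0 += v[i];
    return s0 + s1;
}

// Baseline recurrence: each iteration needs the previous value of x.
long recurrence_rolled(const std::vector<long>& v, long a) {
    long x = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        x = x * a + v[i];
    return x;
}

// Concatenated unrolling by 2: the body is simply repeated in sequence.
// The second copy still waits on the first, so no ILP is gained, but
// there is one loop-back branch per two elements and a larger loop body.
long recurrence_concatenated(const std::vector<long>& v, long a) {
    long x = 0;
    std::size_t i = 0, n = v.size();
    for (; i + 1 < n; i += 2) {
        x = x * a + v[i];
        x = x * a + v[i + 1];
    }
    if (i < n)          // remainder iteration when n is odd
        x = x * a + v[i];
    return x;
}
```

In these terms, the current vectorizer unrolling corresponds to the interleaved form only, chosen for ILP; the question is whether it should also keep growing the loop body for size once interleaving stops paying off.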

 -Hal

> 
> 
> Does that make sense to folks?

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory