[llvm-dev] [Proposal][RFC] Epilog loop vectorization

Mon Feb 27 15:35:33 PST 2017

On 02/27/2017 04:19 PM, Zaks, Ayal wrote:
>
> On 02/27/2017 12:41 PM, Michael Kuperstein wrote:
>
>     There's another issue with re-running the vectorizer (which I
>     support, btw - I'm just saying there are more problems to solve on
>     the way :-) )
>
>     Historically, we haven't even tried to evaluate the cost of the
>     "constant" (not per-iteration) vectorization overhead - things
>     like alias checks. Instead, we have hard bounds - we won't perform
>     alias checks that are "too expensive", and, more importantly, we
>     don't even try to vectorize loops with known low iteration counts.
>     The bound right now is 16, IIRC. That means we don't have a good
>     way to evaluate whether vectorizing a loop with a low iteration
>     count is profitable or not.
>
>
> We should really improve this as well.
>
> @Michael: OTOH, we should reach the same decision again (i.e., that of 
> performing the alias checks) when encountering the remainder loop as 
> we did with the original loop, given that hard bounds are used ;-).
>
> But agreed, it is better to evaluate the cost of these bounds along 
> with the overall estimated cost instead.
>
>     This also makes me wary of the "we can clean up redundant alias
>     checks later" approach. When trying to decide whether to vectorize
>     by 4 a loop that has no more than 8 iterations (because we just
>     vectorized by 8 and it's the remainder loop), we really want to
>     know if the alias checks we're introducing are going to survive or not.
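
To put rough numbers on that concern, here is a minimal sketch of folding the one-time overhead into the profitability decision. The cost units, the `LoopCosts` struct, and the `isProfitable` helper are all hypothetical; none of this is an existing LLVM interface.

```cpp
#include <cstdint>

// All costs are in arbitrary, made-up units (assumption for illustration).
struct LoopCosts {
  std::uint64_t scalarIter;    // cost of one scalar iteration
  std::uint64_t vectorIter;    // cost of one vector iteration covering VF elements
  std::uint64_t fixedOverhead; // one-time cost, e.g. runtime alias checks
};

// Total cost of running `trip` iterations with a vector body of width `vf`
// followed by a scalar remainder, plus the one-time overhead.
constexpr std::uint64_t vectorizedCost(const LoopCosts &c, std::uint64_t trip,
                                       std::uint64_t vf) {
  return c.fixedOverhead + (trip / vf) * c.vectorIter +
         (trip % vf) * c.scalarIter;
}

// Vectorizing pays off only if it beats running every iteration as scalar.
constexpr bool isProfitable(const LoopCosts &c, std::uint64_t trip,
                            std::uint64_t vf) {
  return vectorizedCost(c, trip, vf) < trip * c.scalarIter;
}
```

With scalarIter = 4, vectorIter = 6 and a trip count of 7 at VF = 4, a fixed overhead of 12 gives a vectorized cost of 30 against 28 for the scalar loop, while an overhead of 0 (checks proven redundant) gives 18. Whether the checks survive flips the decision.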
>
>
> It occurs to me that, if SCEV's known-predicate logic were smart 
> enough, it would seem practical to not introduce redundant checks in 
> the first place (although it would imply some gymnastics when 
> examining the control flow around the loop and then restructuring 
> things when we generate the code for the loop).
>
> The scalar remainder loop, when reached from the vectorized loop, is 
> already known to be vectorizable to a VF larger than EpilogVF.
>

I was not under the impression we had a remainder loop separate from the 
loop used for scalar computation. Don't we use the same loop in cases 
where the vectorization is not legal?
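
A rough sketch of that structure, for concreteness. The names `noOverlap` and `addOne`, the fixed VF of 8, and the inner loop standing in for vector instructions are illustrative assumptions, not what the vectorizer literally emits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the runtime alias check: true when the two
// n-element ranges do not overlap.
static bool noOverlap(const int *a, const int *b, std::size_t n) {
  auto pa = reinterpret_cast<std::uintptr_t>(a);
  auto pb = reinterpret_cast<std::uintptr_t>(b);
  std::size_t bytes = n * sizeof(int);
  return pa + bytes <= pb || pb + bytes <= pa;
}

// Computes dst[i] = src[i] + 1 and returns how many iterations the shared
// scalar loop executed.
std::size_t addOne(int *dst, const int *src, std::size_t n) {
  assert(dst && src);
  constexpr std::size_t VF = 8;
  std::size_t i = 0;
  if (noOverlap(dst, src, n)) {
    // "Vector" body: VF elements per trip, modelled as an inner loop.
    for (; i + VF <= n; i += VF)
      for (std::size_t lane = 0; lane < VF; ++lane)
        dst[i + lane] = src[i + lane] + 1;
  }
  // One scalar loop, reached along two paths: as the remainder after the
  // vector body (i == n - n % VF), or as the fallback when the check
  // failed (i == 0).
  std::size_t scalarIters = 0;
  for (; i < n; ++i, ++scalarIters)
    dst[i] = src[i] + 1;
  return scalarIters;
}
```

For n = 19 the scalar loop runs 3 iterations when the check passes and all 19 when it fails, so any "already known to be vectorizable" property holds only along the first path.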

  -Hal

> No need to introduce again any potential aliasing, wrapping or whatnot 
> checks, even if this redundancy can later be eliminated, if instead 
> this vectorizability property could be recorded somehow. Similar to 
> having annotated the remainder loop with “#pragma clang loop 
> vectorize(assume_safety)”, except that this vectorizability property 
> does not hold when reaching the remainder loop along the other path – 
> that which fails these checks for the main loop...
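
For reference, a minimal sketch of what that source-level annotation looks like on a loop. The function `scaleRange` and its parameters are hypothetical; only the pragma itself is taken from the text above, and compilers other than Clang will ignore it.

```cpp
#include <cassert>
#include <cstddef>

// Scales src[begin..end) into dst[begin..end). The pragma tells Clang to
// assume the loop has no unsafe memory dependences, so no runtime alias
// checks are needed. It is only valid if the caller guarantees that.
void scaleRange(float *dst, const float *src, std::size_t begin,
                std::size_t end) {
  assert(begin <= end);
#pragma clang loop vectorize(assume_safety)
  for (std::size_t i = begin; i < end; ++i)
    dst[i] = 2.0f * src[i];
}
```

The pragma applies to the loop unconditionally, which is exactly the mismatch described above: the remainder loop would need the property only on the path that passed the main loop's checks.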
>
> Ayal.
>
>
>  -Hal
>
>
>     Michael
>
>     On Mon, Feb 27, 2017 at 10:11 AM, Hal Finkel <hfinkel at anl.gov>
>     wrote:
>
>         On 02/27/2017 11:47 AM, Adam Nemet wrote:
>
>                 On Feb 27, 2017, at 9:39 AM, Daniel Berlin
>                 <dberlin at dberlin.org> wrote:
>
>                 On Mon, Feb 27, 2017 at 9:29 AM, Adam Nemet
>                 <anemet at apple.com> wrote:
>
>                         On Feb 27, 2017, at 7:27 AM, Hal Finkel
>                         <hfinkel at anl.gov> wrote:
>
>
>                         On 02/27/2017 06:29 AM, Nema, Ashutosh wrote:
>
>                             Thanks for looking into this.
>
>                             1) Issues with re-running the vectorizer:
>
>                             The vectorizer might generate redundant alias
>                             checks while vectorizing the epilog loop.
>
>                             Redundant alias checks are expensive; we would
>                             like to reuse the results of already
>                             computed alias checks.
>
>                             With metadata we can limit the width of the
>                             epilog loop, but I am not sure about reusing
>                             the alias check result.
>
>                             Any thoughts on re-running the vectorizer while
>                             reusing the alias check result?
>
>
>                         One way of looking at this is: Reusing the
>                         alias-check result is really just a
>                         conditional propagation problem; if we don't
>                         already have an optimization that can combine
>                         these after the fact, then we should.
>
>                     +Danny
>
>                     Isn’t Extended SSA supposed to help with this?
>
>                 Yes, it will solve this with no issue already.  GVN
>                 probably does already too.
>
>                 Even if you have
>
>                 if (a == b)
>
>                 if (a == c)
>
>                  if (a == d)
>
>                  if (a == e)
>
>                  if (a == g)
>
>                 and we can prove a ... g equivalent, newgvn will
>                 eliminate them all and set all the branches true.
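
As a toy model of that idea (not NewGVN's actual algorithm), equalities established by dominating branches can be recorded in congruence classes, and a later comparison is redundant when both operands already share a class. The class name `CongruenceClasses` and the numbering of values as 0..n-1 are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Toy congruence classes over values numbered 0..n-1, kept as a union-find.
class CongruenceClasses {
  std::vector<std::size_t> parent;

public:
  explicit CongruenceClasses(std::size_t n) : parent(n) {
    std::iota(parent.begin(), parent.end(), std::size_t{0});
  }

  // Representative of the class containing v.
  std::size_t leader(std::size_t v) {
    assert(v < parent.size() && "value id out of range");
    while (parent[v] != v) {
      parent[v] = parent[parent[v]]; // path halving
      v = parent[v];
    }
    return v;
  }

  // Record that a dominating branch established x == y.
  void assumeEqual(std::size_t x, std::size_t y) {
    parent[leader(x)] = leader(y);
  }

  // A later check "x == y" is redundant if both are in the same class.
  bool knownEqual(std::size_t x, std::size_t y) {
    return leader(x) == leader(y);
  }
};
```

After recording a == b and b == c, a query for a == c is answered without a new check, while a == d stays unknown.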
>
>                 If you need a simpler clean up pass, we could run it
>                 on sub-graphs.
>
>             Yes we probably don’t want to run a full GVN after the
>             “loop-scheduling” passes.
>
>
>         FWIW, we could, just without the memory-dependence analysis
>         enabled (i.e. set the NoLoads constructor parameter to true).
>         GVN is pretty fast in that mode.
>
>          -Hal
>
>
>             I guess the pipeline to experiment with for now is opt
>             -loop-vectorize -loop-vectorize -newgvn.
>
>             Adam
>
>
>
>                 The only thing you'd have to do is write some code to
>                 set "live on entry" subgraph variables in their own
>                 congruence classes.
>
>                 We already do this for incoming arguments.
>
>                 Otherwise, it's trivial to make it only walk things in
>                 the subgraph.
>
>
>
>         -- 
>
>         Hal Finkel
>
>         Lead, Compiler Technology and Programming Languages
>
>         Leadership Computing Facility
>
>         Argonne National Laboratory
>
>
>
>

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
