[llvm-dev] [Proposal][RFC] Epilog loop vectorization

Nema, Ashutosh via llvm-dev llvm-dev at lists.llvm.org
Wed Mar 15 00:36:31 PDT 2017



From: Michael Kuperstein [mailto:mkuper at google.com]
Sent: Tuesday, March 14, 2017 10:29 PM
To: Nema, Ashutosh <Ashutosh.Nema at amd.com>
Cc: Adam Nemet <anemet at apple.com>; Hal Finkel <hfinkel at anl.gov>; Renato Golin <renato.golin at linaro.org>; llvm-dev <llvm-dev at lists.llvm.org>; Mehdi Amini <mehdi.amini at apple.com>; Daniel Berlin <dberlin at dberlin.org>; Zaks, Ayal <ayal.zaks at intel.com>
Subject: RE: [llvm-dev] [Proposal][RFC] Epilog loop vectorization

I'm still not sure about this, for a few reasons:

1) I'd like to try to treat epilogue loops the same way regardless of whether the main loop was vectorized by hand or automatically. So if someone hand-wrote an avx-512 16-wide loop, with alias checks, and we decide it's profitable to vectorize the epilogue loop by 4 and re-use the checks, it ought to be done the same way. I realize this may be a pipe-dream, though.

Ideally it should be like this, but the introduction of alias checks comes with its own challenges.

2) I'm still somewhat worried about "tiny loops". As I wrote before, we explicitly refuse to vectorize loops we know have a trip-count less than 16, because our profitability heuristic for such loops is probably bad. IIUC the only reason we don't bail due to the threshold is because we use the same loop for "failed min iters check" and "failed alias check". So, because it's reachable through the alias-check path, the max trip count isn't actually known, even though the typical trip count is probably small.
It's true that you currently don't try to vectorize the epilogue if the original VF is below 16, but this is a somewhat different condition.

A prerequisite for epilog vectorization is that the original loop gets vectorized; for tiny loops, if the vectorizer refuses to vectorize, no epilog version is generated.
Once we have proper costing for the checks (i.e. alias, min-iters), we can make a more accurate decision about vectorizing the epilog loop by taking the cost of those checks into account.
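
To make the two conditions discussed in (2) concrete, here is a minimal sketch of the gating as I read it from this thread. All names are hypothetical and are not taken from the patch or from LLVM; the value 16 is the one quoted above for both the tiny-trip-count threshold and the minimum original VF.

```cpp
#include <cassert>

// Hypothetical constants, named only for this illustration.
constexpr unsigned TinyTripCountThreshold = 16; // known trip counts below this are refused
constexpr unsigned MinMainVFForEpilog = 16;     // epilog is skipped if the original VF is below this

// Hypothetical summary of what happened to the original loop.
struct MainLoopResult {
  bool Vectorized; // did the vectorizer transform the original loop?
  unsigned VF;     // vectorization factor chosen for the original loop
};

// The vectorizer bails out only when the trip count is known and tiny.
// An unknown trip count (e.g. reachable via the alias-check path) passes.
inline bool mayVectorizeMainLoop(bool TripCountKnown, unsigned TripCount) {
  return !(TripCountKnown && TripCount < TinyTripCountThreshold);
}

// The epilog is considered only if the original loop was vectorized
// and its VF reaches the minimum.
inline bool mayVectorizeEpilog(const MainLoopResult &R) {
  assert((!R.Vectorized || R.VF > 1) && "a vectorized loop needs VF > 1");
  return R.Vectorized && R.VF >= MinMainVFForEpilog;
}
```

Note that the two predicates test different things (a known trip count versus the chosen VF), which is the distinction raised in (2).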

3) Technically speaking, constructing a new InnerLoopVectorizer to vectorize this one loop sounds weird. We already have a worklist in the vectorizer that's currently running.

Adding the epilog loop to the loop list comes with the following challenges:


a)      If we would like to add the epilog loop to the list, then I’m not sure how we would handle the “alias check result”.

For the epilog vector loop, after the minimum iteration check, the code should test the already computed “alias check result”: if the result asserts ‘alias’, execute the scalar epilog loop; otherwise execute the epilog vector loop. Since execution of the epilog vector loop depends on the already computed “alias check result”, I am not sure how we would check this if the loop is simply added to the list.

Loop versioning based on the “alias check result” condition, followed by adding the no-alias version to the list, can probably help here, i.e.:

[Figure: image001.jpg, see the attachment listed at the end of this message]

If “Scalar LoopVersion1” asserts the no-alias property, then add it to the loop list.

This versioning design looks weird; it is shown only to illustrate the “alias check result” dependency.

Any other thoughts on handling the “alias check result” without versioning?



b)      The Loop Vectorizer already creates an instance of “InnerLoopVectorizer” for each loop it vectorizes. The only difference is that we create another instance of “InnerLoopVectorizer” after generating the first vector version, within the processing of the same loop, to handle epilog loop vectorization. The intent is that when the vectorizer processes a loop from the list, it processes it completely, generating both the original and the epilog vector versions.



c)       In the proposed patch, instead of creating a new instance of “InnerLoopVectorizer”, we can clear the state of the existing “InnerLoopVectorizer” object and reuse it for epilog loop vectorization.
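
To illustrate the control flow described in (a), here is a rough source-level sketch of what the generated code would behave like. It is not the patch itself: the function names, the element-wise add, and the VFs (16 for the main loop, 4 for the epilog) are assumptions chosen for the example, and the "vector" body is emulated with a fixed-width inner loop. As a simplification, the alias check is computed once up front, and the epilog path tests that stored result after its own minimum iteration check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Counts how many iterations each loop version executed (for illustration).
struct LoopStats {
  std::size_t MainVecIters = 0;
  std::size_t EpilogVecIters = 0;
  std::size_t ScalarIters = 0;
};

// Runtime alias check: true if [Dst, Dst+N) and [Src, Src+N) overlap.
inline bool rangesOverlap(const int *Dst, const int *Src, std::size_t N) {
  auto D = reinterpret_cast<std::uintptr_t>(Dst);
  auto S = reinterpret_cast<std::uintptr_t>(Src);
  std::uintptr_t Bytes = N * sizeof(int);
  return D < S + Bytes && S < D + Bytes;
}

// Stand-in for one vector iteration of width VF.
template <std::size_t VF>
inline void addBlock(int *Dst, const int *Src, std::size_t I) {
  for (std::size_t L = 0; L < VF; ++L)
    Dst[I + L] += Src[I + L];
}

inline LoopStats addArrays(int *Dst, const int *Src, std::size_t N) {
  constexpr std::size_t MainVF = 16, EpilogVF = 4; // assumed widths
  LoopStats Stats;
  std::size_t I = 0;

  // "alias check result": computed once and reused below.
  const bool NoAlias = !rangesOverlap(Dst, Src, N);

  // Main vector loop, guarded by its min-iters check and the alias check.
  if (N >= MainVF && NoAlias)
    for (; I + MainVF <= N; I += MainVF, ++Stats.MainVecIters)
      addBlock<MainVF>(Dst, Src, I);

  // Epilog: minimum iteration check first, then the stored alias result.
  if (N - I >= EpilogVF && NoAlias)
    for (; I + EpilogVF <= N; I += EpilogVF, ++Stats.EpilogVecIters)
      addBlock<EpilogVF>(Dst, Src, I);

  // Scalar epilog loop: remaining iterations, or everything on 'alias'.
  for (; I < N; ++I, ++Stats.ScalarIters)
    Dst[I] += Src[I];
  return Stats;
}
```

With N = 39 and non-overlapping arrays this runs 2 main vector iterations, 1 epilog vector iteration and 3 scalar iterations; with overlapping arrays every iteration goes through the scalar loop. The open question from (a) is how to express the dependence on NoAlias when the epilog is just another entry in the loop list.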


I don't think (1) is a blocker, and (3) should be easy to fix, but I'm not sure whether the way this is going to handle (2) is sufficient.  If I'm the only one that this bothers, I won't stand in the way, but I'd like to at least make sure we've fully considered this.

Regards,
Ashutosh
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.jpg
Type: image/jpeg
Size: 38373 bytes
Desc: image001.jpg
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170315/1639054e/attachment-0001.jpg>


More information about the llvm-dev mailing list