[PATCH] D19431: [LoopDist] Add llvm.loop.distribute.enable loop metadata

Tue Apr 26 11:33:14 PDT 2016

anemet added a comment.

In http://reviews.llvm.org/D19431#412282, @hfinkel wrote:

> I have no problem with adding metadata to control this, just as we do with the vectorizer. However:
>
> 1. What is preventing us from enabling this pass by default?

The big piece is profitability.  We need to prove that no ILP or MLP is impeded by distribution.  Without this we can actually regress performance even if we vectorize some of the resulting loops (in my DevMeeting talk, the hmmer loop in SPECint without LoopLoadElimination is an example for the ILP case).  I have some ideas for MLP (essentially to prove that the loop HW prefetcher friendly so misses should occur before and after).

ILP is trickier because I would have to estimate the length of the critical path and see if we still have enough independent instructions in the resulting loop to avoid exposing the critical loop by much.   This may again be easier for the simple cases but for hmmer for example this needs to be fairly accurate.  Obviously we can start with the simple model initially.

We also need to ensure that the loop without the unsafe deps will be vectorized at the end.  I.e. factor out the legality and profitability checks from the vectorizer.  (There are some other efforts in this area, so this may happen independently.)

And of course independently from all this the user may still want to force distribution because it's known to be profitable.  So even if the pass is on by default this feature in some form is still necessary.

> 2. Do we actually need to key it this way? Given that this metadata does not force anything, and is intended to enable vectorization, why not just key this off of !{!"llvm.loop.vectorize.enable", i1 1}. That way, should the user explicitly request vectorization, and we detect that this won't be possible, we try extra hard to provide it? That is, is there a use case where a user might want #pragma clang loop distribute(enable) without also adding vectorize(enable)?

That's an interesting idea but I think at the end we also want to give the user full control to know what transformation are taking place.  Imagine we had many transformations that could turn a loop from non-vectorizable into vectorizable (e.g. peeling, data-layout modifications, etc.), the user may want to choose the particular transformation rather than just "vectorize at any cost".

Also as I mentioned it to your first point, we don't actually know if we are going to vectorize the resulting loop, so tying this to vectorize.enable can be misleading.  I think at this point I am more comfortable going with the more low-level control for distribution.  What do you think?

http://reviews.llvm.org/D19431