[llvm] r180802 - InstCombine: Fold more shuffles of shuffles.

Jim Grosbach grosbach at apple.com
Tue Apr 30 15:01:26 PDT 2013


On Apr 30, 2013, at 2:38 PM, Nadav Rotem <nrotem at apple.com> wrote:

> 
> On Apr 30, 2013, at 2:23 PM, Jim Grosbach <grosbach at apple.com> wrote:
> 
>> That used to be true. These days, we see lots of shuffles coming from SROA and we have to mess with them or things go very badly.  
> 
> You can change SROA to reduce the number of shuffle vectors. I scanned SROA and I did not see anything too bad. I only saw insert/extract and blend, which should map nicely onto any vector ISA. But you can make things better by changing SROA.cpp::insertVector to generate a select instruction instead of a vector shuffle.

It’s more complicated than that. Have a look at the example on the PR that motivates this patch to see what I mean. If you have an alternative suggestion, I’d love to hear it.
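For reference (this is a sketch, not the exact IR from the PR), the kind of fold the patch is concerned with is collapsing a shuffle of a shuffle into a single shuffle with a composed mask:

```llvm
; Two shuffles that each reverse a <4 x i32> vector...
%t = shufflevector <4 x i32> %x, <4 x i32> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
%r = shufflevector <4 x i32> %t, <4 x i32> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
; ...compose to the identity mask <0, 1, 2, 3>, so %r folds away to %x.
```

Whether the composed mask is cheap on a given target is exactly the heuristic question being debated here.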

>  
>> The more I think about it, the more I disagree with the current design. It creates a very deep tight coupling between what InstCombine does and the least-common-denominator of what target codegen can handle. That’s untenable in the long term and artificially restrictive.
> 
> The motivation for the current design (which I am happy with) is that it is very difficult to undo a canonicalization of shuffles because the search space is huge. Also, the clang shuffle builtins are used by people who optimize their performance-sensitive applications.
> 

This code is mostly pretty ancient. I wouldn’t read very much into the current implementation with regard to motivations. A lot has changed in the last few years. In any case, just because something is difficult doesn’t mean it’s not the right thing to do. Don’t get me wrong, I’m not unsympathetic to being pragmatic here. We have to be. There’s always a balance between the theoretically best approach and what we do in practice out of expedience. In this case, however, we’re aggressively pessimizing one target to help out another. This patch flips around which target gets pessimized. My contention is that we can do better than the current extremely conservative heuristic and fix that.

We obviously shouldn’t immediately remove all the “be careful here or targets will probably freak out” safeguards. That would have nasty results, to put it mildly. We should, however, be actively working to find ways to allow this (and other) code to be as aggressive as possible. The question for the moment is what additional conditions we can add that will allow the transformation in the situation we want to enable and still avoid it when it’s troublesome. Suggestions?

-Jim



> We use this approach in other places in the compiler. We don't always canonicalize all the way, and in many cases we prefer not to touch the program unless we are certain that the change is beneficial. The shuffle ISA is so sparse that merging shuffles is likely to cause regressions.
>    

We do. And it’s a compromise every single time. It’s technical debt, and we should be actively working to reduce it, not add to it.

> 
>>> When I looked at the test case I noticed that some of your shuffles were actually vector blends. It would make sense to convert blend-shuffles into select instructions with a constant mask.
>>> 
>> 
>> Sounds like a good idea. I don’t know much about the provenance of the original IR of these test cases.
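For concreteness, here’s a sketch (hypothetical values, not from the test case) of the blend-to-select rewrite being suggested: a shufflevector whose mask takes each lane from the matching position of one of its two operands is a blend, and is equivalent to a select on a constant i1 mask:

```llvm
; Blend: lanes 0 and 2 from %a, lanes 1 and 3 from %b (mask indices 4-7 name %b's lanes).
%blend = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 0, i32 5, i32 2, i32 7>
; Equivalent select with a constant mask; a true lane picks from %a, false from %b.
%sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %a, <4 x i32> %b
```

Unlike a general shuffle, the select form makes the lane-preserving structure explicit, which is why it maps cleanly to blend instructions on most vector ISAs.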
