[llvm] r211888 - [x86] Begin a significant overhaul of how vector lowering is done in the

Nadav Rotem nrotem at apple.com
Fri Jun 27 10:12:54 PDT 2014


> On Jun 27, 2014, at 10:01 AM, Chandler Carruth <chandlerc at gmail.com> wrote:
> 
> 
> On Fri, Jun 27, 2014 at 6:54 PM, Nadav Rotem <nrotem at apple.com> wrote:
> Hi Chandler,
> 
> Thank you for working on this. Lowering shuffles on X86 is challenging and I am glad that you are rewriting and improving this code. Everything looks great.
> 
> >
> > Once SSE2 is polished a bit I should be able to get interesting numbers
> > on performance improvements on benchmarks conducive to vectorization.
> > All of this will be off by default until it is functionally equivalent
> > of course.
> 
> I was wondering how you plan to benchmark this code. The vectorizers don’t generate interesting shuffle patterns (mainly reverse and broadcast),
> 
> My work here is motivated specifically by shuffles generated by the vectorizer, so I'm not sure why you think they don't generate interesting shuffle patterns.

The vectorizers don’t generate interesting shuffle patterns because it is difficult to predict the cost of shuffles. We use the ShuffleKind enum in TTI to query the backends for the cost of specific shuffle kinds such as reverse, broadcast and alternate. For loop vectorization I don’t think that there are other interesting shuffle patterns. Can you think of other patterns that can help loop vectorization? For SLP-vectorization, shuffling loads could help, but there the problem is cost estimation, not inefficient lowering.
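
To make the three shuffle kinds named above concrete, here is a rough standalone sketch of how a shuffle mask could be sorted into them. It does not use LLVM's headers: ShuffleKindSketch and classifyMask are hypothetical names invented for this illustration, and the exact mask conventions (broadcast means a splat of lane 0, alternate means lanes taken in place and alternating between the two inputs) are assumptions rather than TTI's definitions.

```cpp
#include <vector>

// Hypothetical stand-in for the shuffle kinds discussed above.
// This is NOT LLVM's TTI declaration, only an illustration.
enum class ShuffleKindSketch { Broadcast, Reverse, Alternate, Other };

// Classify a shuffle mask over two N-element inputs A and B.
// Result lane i reads element mask[i] of the concatenation <A, B>:
// values in [0, N) select from A, values in [N, 2N) select from B.
ShuffleKindSketch classifyMask(const std::vector<int> &mask) {
  const int n = static_cast<int>(mask.size());
  if (n == 0)
    return ShuffleKindSketch::Other;

  bool broadcast = true;   // every lane reads A[0] (assumed convention)
  bool reverse = true;     // lane i reads A[n - 1 - i]
  bool altFromA = true;    // even lanes read A[i], odd lanes read B[i]
  bool altFromB = true;    // even lanes read B[i], odd lanes read A[i]

  for (int i = 0; i < n; ++i) {
    const bool even = (i % 2 == 0);
    broadcast = broadcast && (mask[i] == 0);
    reverse = reverse && (mask[i] == n - 1 - i);
    altFromA = altFromA && (mask[i] == (even ? i : n + i));
    altFromB = altFromB && (mask[i] == (even ? n + i : i));
  }

  // Degenerate masks (e.g. a single lane) match several kinds;
  // the order below is an arbitrary choice for this sketch.
  if (broadcast)
    return ShuffleKindSketch::Broadcast;
  if (reverse)
    return ShuffleKindSketch::Reverse;
  if (altFromA || altFromB)
    return ShuffleKindSketch::Alternate;
  return ShuffleKindSketch::Other;
}
```

Anything that lands in Other is the case I am worried about: the vectorizer has no named kind to ask the backend about, so it cannot get a cost for it.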

>  
> and traditionally interesting shuffle patterns came from hand written code and from code in the domain of graphics, like OpenCL, OpenGL and ISPC. Is there a specific benchmark that you think could be useful?
> 
> I have a decent number of both hand vectorized code and code which vectorizes well; I expect to be able to get reasonable baseline benchmarks from this.

Good. I think that the hand-vectorized code benchmarks would be useful. 

> 
> Also, it is pretty easy to look at the output before and after and understand pretty clearly the likely performance characteristics.

Yes. 