[PATCH] Scalarize select vector arguments when extracted

Tue Sep 10 13:59:36 PDT 2013

Matt, 

I looked at the test case again and I think there is a possible solution. If we use the insert_element instructions as the roots when we start vectorizing the tree, we would be able to vectorize this code. At the moment the insert_element instructions are handled outside of the buildTree method, so they interfere with tree construction.

Thanks,
Nadav
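
For reference, the two forms debated in the quoted thread below (extracting a lane from a vector select versus doing a scalar select on extracted lanes) can be sketched in plain C++. This is only an illustration under stated assumptions, not code from the patch: Vec4, Mask4, extractOfVSelect and selectOfExtracts are hypothetical names, and std::array stands in for an IR vector value.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for <4 x float> and <4 x i1> IR values.
using Vec4 = std::array<float, 4>;
using Mask4 = std::array<bool, 4>;

// Vector form: select lane by lane across the whole vector,
// then extract a single lane from the result.
float extractOfVSelect(const Mask4 &cond, const Vec4 &a, const Vec4 &b,
                       std::size_t lane) {
  assert(lane < 4 && "lane out of range");
  Vec4 sel{};
  for (std::size_t i = 0; i < 4; ++i)
    sel[i] = cond[i] ? a[i] : b[i];
  return sel[lane];
}

// Scalarized form: extract the lane from each operand first,
// then perform one scalar select.
float selectOfExtracts(const Mask4 &cond, const Vec4 &a, const Vec4 &b,
                       std::size_t lane) {
  assert(lane < 4 && "lane out of range");
  return cond[lane] ? a[lane] : b[lane];
}
```

Both functions return the same value for every lane, so the choice between them is purely a cost question, and that cost depends on the target.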

On Sep 10, 2013, at 1:51 PM, Nadav Rotem <nrotem at apple.com> wrote:

> 
> On Sep 10, 2013, at 1:18 PM, Matt Arsenault <arsenm2 at gmail.com> wrote:
> 
>> 
>> On Sep 10, 2013, at 12:26 , Nadav Rotem <nrotem at apple.com> wrote:
>> 
>>> Hi Matt, 
>>> 
>>> I am sorry I missed your first email. Thanks for working on this. 
>>> 
>>>> When the elements are extracted from a select, do the select on the extracted scalars from the input.
>>>> 
>>> 
>>> I am not sure that this is always profitable or that it should be done in InstCombine. It is hard to know at the IR level (during InstCombine) whether it is better to extract twice and perform a single select or to perform a vector select and extract once. I think that this kind of decision should be made in SelectionDAG.
>> 
>> I was following your suggestion from the case I asked about here: http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-August/065033.html
>> 
>> My first thought was to do this for the scalar select on vectors case, which I think would more universally be better. I wasn't sure about the vselect case. Would it make sense to always do only the scalar case here?
>> 
> 
> I am sorry for not being clear.  In that email I meant that some basic vectorization or vectorization cleanups (such as getting rid of insert-extract sequences) can be done in InstCombine. This should only be viewed as canonicalization. In your patch you are scalarizing the code in preparation for the SLP-vectorizer to handle the scalar code. I don’t think that this is a good approach because the SLP-vectorizer may miss some patterns. I think that the best approach for handling the problem that you described in the email from August is to improve the SLP-vectorizer. 
> 
> Your test case is very important because this is a very common pattern in graphics code.  One of the limitations of the SLP-vectorizer is that it has to honor scheduling constraints because it does not operate on a DAG (for compile time reasons). Unfortunately I don’t have a good solution for this problem right now. 
> 
> Thanks,
> Nadav
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
