[PATCH] Fix PR19657 : SLP vectorization doesn't combine scalar load to vector loads

Mon May 19 13:40:58 PDT 2014

>
> I like your idea of adding a flag to control the performance/compile time tradeoff. One place that could benefit from this flag is the part of the code where we sort the store instructions. In order to reduce the compile time we place the stores in small buckets and sort each bucket individually. Increasing the size of the buckets can allow more vectorization opportunities. Another place is the consecutive memory address tests, where we could throw in a few more checks.
>

*nod*
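
To make the bucket-size tradeoff concrete, here is a minimal standalone sketch. It is not the SLP vectorizer's code: `Store`, `countConsecutivePairs`, and `BucketSize` are hypothetical names, and a store is reduced to a byte offset from a shared base pointer plus an access size. Each bucket is sorted on its own, so two stores that are consecutive in memory but land in different buckets are never paired.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a store instruction: only the byte offset
// from a common base pointer and the access size matter here.
struct Store {
  int64_t Offset;
  int64_t Size;
};

// Sort each fixed-size bucket independently and count neighbouring stores
// whose addresses are consecutive. Stores in different buckets are never
// compared, which bounds the work but can miss opportunities.
std::size_t countConsecutivePairs(std::vector<Store> Stores,
                                  std::size_t BucketSize) {
  if (BucketSize == 0)
    return 0;
  std::size_t Pairs = 0;
  for (std::size_t Begin = 0; Begin < Stores.size(); Begin += BucketSize) {
    std::size_t End = std::min(Begin + BucketSize, Stores.size());
    std::sort(Stores.begin() + Begin, Stores.begin() + End,
              [](const Store &A, const Store &B) {
                return A.Offset < B.Offset;
              });
    for (std::size_t I = Begin; I + 1 < End; ++I)
      if (Stores[I].Offset + Stores[I].Size == Stores[I + 1].Offset)
        ++Pairs;
  }
  return Pairs;
}
```

With four 4-byte stores at offsets 0, 8, 4, 12 (in program order), a bucket size of 2 finds no consecutive pairs, while a bucket size of 4 finds three.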

> My main concern with the patch in this thread is that it duplicates a ton of code. The most complicated part of the SLP-vectorizer is the recursion that scans the tree, and duplicating that code just to change a few lines will make it unmaintainable. I also suspect that this is not the correct approach to solving this problem, but I must admit that I did not look at the problem carefully.
>

Sure, I just didn't want to reject the idea out of hand, as you seemed
to be doing, just because it might increase compile time. Having the
pass check the opt level when it runs is pretty easy.
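
As a rough illustration of gating the extra work on the opt level, something along these lines would do. All names and numbers here are hypothetical (`SLPTuning`, `tuningForOptLevel`, the bucket sizes 16 and 64); none of them come from LLVM or from the patch under review.

```cpp
// Hypothetical tuning knobs for the vectorizer; the names and the
// concrete values are assumptions made for this sketch only.
struct SLPTuning {
  unsigned StoreBucketSize;   // how many stores are sorted together
  bool ExtraAddressChecks;    // run the more expensive consecutive-address tests
};

// Pick cheaper settings by default and spend more compile time only
// when the requested optimization level is 3 or higher.
SLPTuning tuningForOptLevel(unsigned OptLevel) {
  if (OptLevel >= 3)
    return {64, true};
  return {16, false};
}
```

The pass would call something like `tuningForOptLevel` once at the start of a run and consult the result wherever it currently uses a fixed limit.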

-eric

> Moving forward I would like to see us do a better job on swizzling loads. I think that adding support for reverse loads would be easy to do and would allow the vectorization of many more patterns.
>
> Thanks,
> Nadav
>
>> Is there some other approach you'd like to use to get optimizations like this?
>>
>> -eric
>>
>> http://reviews.llvm.org/D3800
>>
>>
>