[PATCH] Fix PR19657 : SLP vectorization doesn't combine scalar load to vector loads

Nadav Rotem nrotem at apple.com
Tue May 27 22:25:41 PDT 2014


Karthik, 

Please document the code that you are adding to the vectorization of binary operators.  

+          buildTree_rec(Left, Depth + 1);
+       }
+       else {

Please format your patch properly. 

Also, would it be better to swap Right and Left instead of duplicating the calls to buildTree_rec?
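
Roughly what I have in mind, as a sketch only. It assumes the two branches differ only in which operand list gets visited first. ValueList, buildTreeRecSketch, buildOperandsSketch, VisitRightFirst and the Order log are hypothetical stand-ins for illustration, not the SLPVectorizer code:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a bundle of scalar values; not the LLVM type.
using ValueList = std::vector<std::string>;

// Stand-in for buildTree_rec: it only records the order in which bundles
// are visited.
void buildTreeRecSketch(const ValueList &VL, unsigned Depth,
                        std::vector<ValueList> &Order) {
  (void)Depth; // a real implementation would bound the recursion with this
  Order.push_back(VL);
}

// Swap the operand bundles up front so that one pair of calls serves both
// cases. VisitRightFirst is an assumed flag standing in for whatever
// condition the patch actually tests.
void buildOperandsSketch(ValueList Left, ValueList Right, bool VisitRightFirst,
                         unsigned Depth, std::vector<ValueList> &Order) {
  assert(Left.size() == Right.size() && "operand bundles must have equal width");
  if (VisitRightFirst)
    std::swap(Left, Right);
  buildTreeRecSketch(Left, Depth + 1, Order);
  buildTreeRecSketch(Right, Depth + 1, Order);
}
```

With the swap done once, the recursive calls appear in a single place, so there is no second copy to keep in sync.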

The performance numbers look okay. Do you know why we see a nice gain in the one workload that wins?

Thanks,
Nadav


On May 27, 2014, at 9:12 PM, Karthik Bhat <kv.bhat at samsung.com> wrote:

> Hi Nadav,
> Please find the performance results with and without the patch. I don't see a large regression in compilation time, though the execution time of one test case improved greatly. The baseline is without the patch and current is with the patch.
> 
> {F60858}
> 
> Hi Raul, Arnold,
> I agree that the current patch will not handle the case mentioned. I was thinking of handling unschedulable loads again after buildTree_rec has completed, but as Arnold mentioned, I'm not sure this would be worth the overhead.
> 
> For now, do you think we can move ahead with this approach, since we are able to vectorize loads similar to the ones mentioned in the PR without much overhead?
> 
> Thanks for all the comments and review.
> 
> Regards
> Karthik Bhat
> 
> http://reviews.llvm.org/D3800
> 
> 



