[PATCH] Fix PR19657 : SLP vectorization doesn't combine scalar load to vector loads
Nadav Rotem
nrotem at apple.com
Tue May 27 22:25:41 PDT 2014
Karthik,
Please document the code that you are adding to the vectorization of binary operators.
+ buildTree_rec(Left, Depth + 1);
+ }
+ else {
Please format your patch properly.
Also, would it be better to swap Right and Left instead of duplicating the calls to buildTree_rec?
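Roughly what I have in mind, as a standalone sketch rather than the actual SLPVectorizer code. The Operands struct, the AllLoads flag and the condition that picks the visit order are all made up for illustration, and buildTreeRec here only records the order in which it is called:

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an operand bundle; not the LLVM type.
struct Operands {
  std::string Name;
  bool AllLoads; // assumed property that decides which side is visited first
};

// Stand-in for the recursive tree builder: it only records the visit.
void buildTreeRec(const Operands &Ops, unsigned Depth,
                  std::vector<std::string> &Order) {
  Order.push_back(Ops.Name + "@" + std::to_string(Depth));
}

// Duplicated form: both branches repeat the pair of recursive calls.
std::vector<std::string> visitDuplicated(Operands Left, Operands Right,
                                         unsigned Depth) {
  std::vector<std::string> Order;
  if (Right.AllLoads && !Left.AllLoads) {
    buildTreeRec(Right, Depth + 1, Order);
    buildTreeRec(Left, Depth + 1, Order);
  } else {
    buildTreeRec(Left, Depth + 1, Order);
    buildTreeRec(Right, Depth + 1, Order);
  }
  return Order;
}

// Swapped form: reorder the operands once, then recurse in a single place.
std::vector<std::string> visitSwapped(Operands Left, Operands Right,
                                      unsigned Depth) {
  std::vector<std::string> Order;
  if (Right.AllLoads && !Left.AllLoads)
    std::swap(Left, Right);
  buildTreeRec(Left, Depth + 1, Order);
  buildTreeRec(Right, Depth + 1, Order);
  return Order;
}
```

Both forms visit the operands in the same order; the swapped one just keeps the recursion in a single place, so any later change to the calls only has to be made once.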
The performance numbers look okay. Do you know why we see such a nice gain in the one workload that wins?
Thanks,
Nadav
On May 27, 2014, at 9:12 PM, Karthik Bhat <kv.bhat at samsung.com> wrote:
> Hi Nadav,
> Please find the performance results with and without the patch. I don't see a large regression in compilation time, though the execution time of one test case improved greatly. The baseline is without the patch and current is with the patch.
>
> {F60858}
>
> Hi Raul, Arnold,
> I agree that the current patch will not handle the case mentioned. I was thinking of handling unschedulable loads again after buildTree_rec completes, but as Arnold mentioned, I'm not sure this would be worth the overhead.
>
> For now, do you think we can move ahead with this approach, since we are able to vectorize loads similar to the ones mentioned in the PR without much overhead?
>
> Thanks for all the comments and review.
>
> Regards
> Karthik Bhat
>
> http://reviews.llvm.org/D3800
>
>