[Patch] [SLPVectorizer] Vectorize Horizontal Reductions from Consecutive Loads
Suyog Kamal Sarda
suyog.sarda at samsung.com
Fri Dec 12 05:03:44 PST 2014
Committed in r224119.
Thanks a lot for the review.
- Suyog
------- Original Message -------
Sender : Suyog Kamal Sarda<suyog.sarda at samsung.com> Senior Software Engineer/SRI-Bangalore-TZN/Samsung Electronics
Date : Dec 12, 2014 21:38 (GMT+09:00)
Title : Re: Re: [Patch] [SLPVectorizer] Vectorize Horizontal Reductions from Consecutive Loads
Hi Nadav,
I ran LNT on x86 with 10 iterations and saw only one performance regression, in the
test case MultiSource/Benchmarks/Prolangs-C++/fsm/fsm.
However, this test case does not seem to be related to my vectorization patch
(I checked the test case), so I am ignoring it and going ahead with the commit,
as you suggested.
Attaching screenshots of the LNT results.
Regards,
Suyog
------- Original Message -------
Sender : suyog sarda
Date : Dec 12, 2014 04:10 (GMT+09:00)
Title : Re: [Patch] [SLPVectorizer] Vectorize Horizontal Reductions from Consecutive Loads
Hi Nadav,
Thanks for reviewing the patch. I will upload the performance results by tomorrow.
Just to be sure, you meant LNT test suite performance results, right?
On Thu, Dec 11, 2014 at 10:32 PM, Nadav Rotem wrote:
Hi Suyog,
The change looks good to me. I think that it would be a good idea to run the LLVM test suite and check if there are any performance regressions.
Thanks,
Nadav
> On Dec 11, 2014, at 4:38 AM, Suyog Kamal Sarda wrote:
>
> Hi All,
>
> This patch recognizes (+ (+ v0, v1) (+ v2, v3)), reorders the operands so that they can be bundled into a vector of loads,
> and vectorizes the expression. Previously, as discussed in earlier LLVM mail threads, we did not vectorize such horizontal reductions.
>
> Test case :
>
> float hadd(float* a) {
> return (a[0] + a[1]) + (a[2] + a[3]);
> }
>
>
> AArch64 assembly before patch :
>
> ldp s0, s1, [x0]
> ldp s2, s3, [x0, #8]
> fadd s0, s0, s1
> fadd s1, s2, s3
> fadd s0, s0, s1
> ret
>
> AArch64 assembly after patch :
>
> ldp d0, d1, [x0]
> fadd v0.2s, v0.2s, v1.2s
> faddp s0, v0.2s
> ret
>
> Recognizing (+(+(+ v0, v1) v2) v3) still remains to be done. I will address it in another patch.
>
> Please help in reviewing the patch. No 'make-check' failures were observed with this patch.
>
> (I would have preferred Phabricator, but it's not working, hence sending via e-mail.)
>
> Regards,
> Suyog
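
For readers following the thread, the two expression shapes mentioned above and the post-patch instruction sequence can be sketched in plain C. This is only an illustrative model, not code from the patch: the function names hadd_tree, hadd_chain and hadd_vector_model are hypothetical, and the lane-by-lane model is an assumed reading of the AArch64 listing quoted above.

```c
#include <assert.h>

/* Balanced form from the test case: (+ (+ v0, v1) (+ v2, v3)).
   This is the shape the patch description says is now recognized. */
static float hadd_tree(const float *a) {
    return (a[0] + a[1]) + (a[2] + a[3]);
}

/* Linear chain form: (+(+(+ v0, v1) v2) v3).
   The thread says this shape is left for a later patch. */
static float hadd_chain(const float *a) {
    return ((a[0] + a[1]) + a[2]) + a[3];
}

/* Hypothetical scalar model of the post-patch AArch64 sequence. */
static float hadd_vector_model(const float *a) {
    /* ldp d0, d1, [x0] : two 64-bit loads, each holding two floats */
    float d0[2] = { a[0], a[1] };
    float d1[2] = { a[2], a[3] };

    /* fadd v0.2s, v0.2s, v1.2s : lane-wise add of the two vectors */
    float v0[2] = { d0[0] + d1[0], d0[1] + d1[1] };

    /* faddp s0, v0.2s : add the two lanes into one scalar */
    return v0[0] + v0[1];
}

/* Small self-check on inputs where every intermediate sum is exact. */
static void hadd_selfcheck(void) {
    const float a[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
    assert(hadd_tree(a) == hadd_vector_model(a));
    assert(hadd_tree(a) == hadd_chain(a));
}
```

Read this way, the model sums (a[0] + a[2]) + (a[1] + a[3]), which is a different grouping from the source expression. For small integer-valued inputs such as the ones in hadd_selfcheck all three functions agree exactly; for general float inputs different groupings may round differently.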
--
With regards,
Suyog Sarda