[LLVMdev] Should more vector [zs]extloads be legal for X86 SSE4.1?
chandlerc at gmail.com
Tue Dec 2 20:21:09 PST 2014
On Tue, Dec 2, 2014 at 2:24 PM, Ahmed Bougacha <ahmed.bougacha at gmail.com> wrote:
> Hi Chandler, all,
> Why aren't the vector [zs]extloads introduced by SSE4.1/AVX2 declared
> legal? Is it a simple oversight, or did I miss a deeper reason?
While hacking on this, I tried to make them legal, and failed. I don't
recall everything that went wrong though, and perhaps you'll have better
luck than I did.
> While cleaning up PMOV*X patterns, I stumbled upon this braindead testcase:
> %0 = load <8 x i8>* %src, align 1
> %1 = zext <8 x i8> %0 to <8 x i16>
> turning into:
> pmovzxbw (%rsi), %xmm0
> pand <0xff,0xff,...>, %xmm0, %xmm0
> v8i8 isn't legal, so the load became an anyext load from v8i8 to
> v8i16, with the pand masking out the unwanted/zero bits.
I've seen this too. It's horrible.
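
For illustration, here is a scalar C++ model of the quoted sequence, showing why the pand is redundant once the load itself zero-extends. This is only a sketch, not LLVM code: the function names are hypothetical, and the per-lane 0x00ff mask is an assumption about what the elided <0xff,0xff,...> constant holds for each 16-bit lane.

```cpp
#include <array>
#include <cstdint>

// Scalar model of pmovzxbw: read eight bytes and zero-extend each one
// into its own 16-bit lane.
std::array<std::uint16_t, 8> zext_load_v8i8(const std::uint8_t *src) {
    std::array<std::uint16_t, 8> out{};
    for (int i = 0; i < 8; ++i)
        out[i] = static_cast<std::uint16_t>(src[i]);
    return out;
}

// Scalar model of the trailing pand, assuming a 0x00ff mask in every lane.
std::array<std::uint16_t, 8> mask_low_bytes(std::array<std::uint16_t, 8> v) {
    for (auto &lane : v)
        lane &= 0x00ff;
    return v;
}
```

In this model the high byte of every lane is already zero after the extending load, so mask_low_bytes returns its input unchanged. The mask is only needed when the load is treated as an anyext load, whose high bits are unspecified.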
> In that example, if you declare zextloads from v8i8 legal, and add the
> simple corresponding pattern, the pand isn't generated anymore, as
Won't type legalization insist on legalizing the <8 x i8> type even though
we can do the extload? My memory of this is very dim. If this "just works"
as expected, then by all means, let's do it.
Speaking of which, I should actually go nuke the old shuffle lowering. Some
of my problems may have only been problems with it.