[LLVMdev] Should more vector [zs]extloads be legal for X86 SSE4.1?
ahmed.bougacha at gmail.com
Thu Dec 4 11:02:36 PST 2014
On Tue, Dec 2, 2014 at 8:21 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
> On Tue, Dec 2, 2014 at 2:24 PM, Ahmed Bougacha <ahmed.bougacha at gmail.com>
>> Hi Chandler, all,
>> Why aren't the vector [zs]extloads introduced by SSE4.1/AVX2 declared
>> legal? Is it a simple oversight, or did I miss a deeper reason?
> While hacking on this, I tried to make them legal, and failed. I don't
> recall everything that went wrong though, and perhaps you'll have better
> luck than I did.
So I gave it a shot, but to avoid an ugly hack in X86, I tried to do
the right thing and changed the extload legalization logic to have a
2D array (memory->result types, not just memory), which just seems
like the saner thing to do:
The X86 stuff is much simpler:
>> While cleaning up PMOV*X patterns, I stumbled upon this braindead
>> %0 = load <8 x i8>* %src, align 1
>> %1 = zext <8 x i8> %0 to <8 x i16>
>> turning into:
>> pmovzxbw (%rsi), %xmm0
>> pand <0xff,0xff,...>, %xmm0, %xmm0
>> v8i8 isn't legal, so the load became an anyext load from v8i8 to
>> v8i16, with the pand masking out the unwanted/zero bits.
> I've seen this too. It's horrible.
>> In that example, if you declare zextloads from v8i8 legal, and add the
>> simple corresponding pattern, the pand isn't generated anymore, as
> Won't type legalization insist on legalizing the <8 x i8> type even though
> we can do the extload? My memory of this is very dim. If this "just works"
> as expected, then by all means, lets do it.
>From what I've seen, not if the illegal type is only an extload's (or
truncstore's) memory type: then, anything is OK.
> Speaking of which, I should actually go nuke the old shuffle lowering. Some
> of my problems may have only been problems with it.
More information about the llvm-dev