[patch] Two fixes to the vpermilvar optimization

Jim Grosbach grosbach at apple.com
Tue Apr 29 14:06:15 PDT 2014


On Apr 29, 2014, at 1:35 PM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:

> On 29 April 2014 14:28, Jim Grosbach <grosbach at apple.com> wrote:
>> Generally looks good. One question.
>> 
>> +      unsigned Size = C->getNumElements();
>> +      assert(Size == 8 || Size == 4 || Size == 2);
>> +      uint32_t Indexes[8];
>> +
>> +      // The intrinsics only read one or two bits, clear the rest.
>> 
>> I don’t understand this. Under what circumstances would these bits come in as non-zero?
> 
> Something like
> 
> declare <2 x double> @llvm.x86.avx.vpermilvar.pd(<2 x double>, <2 x i64>)
> define <2 x double> @test_vpermilvar_pd(<2 x double> %v) {
>  %a = tail call <2 x double> @llvm.x86.avx.vpermilvar.pd(<2 x double> %v, <2 x i64> <i64 42, i64 0>)
>  ret <2 x double> %a
> }
> 
> Using 42 here has well-defined behaviour according to the Intel manual.

OK. So this is cleaning up the ignored bits of the constant so the isel matchers will fire correctly? That makes sense. Can you elaborate in the comment a bit?
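
For reference, a minimal standalone sketch of the kind of normalization being discussed. This is not the code from the patch: normalizeSelector and normalizeMask are hypothetical names, and the sketch assumes (per my reading of the Intel description) that the ps forms read bits [1:0] of each selector element while the pd forms read only bit 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the patch): reduce one raw selector element
// to the lane-relative index the instruction actually uses.
// Assumption: ps forms read bits [1:0]; pd forms read only bit 1.
inline uint32_t normalizeSelector(uint64_t Raw, bool IsPD) {
  uint32_t Index = static_cast<uint32_t>(Raw & 0x3); // drop the ignored high bits
  if (IsPD)
    Index >>= 1; // pd: bit 1 is the selector, bit 0 is ignored
  return Index;
}

// Hypothetical helper: normalize a whole constant selector vector.
inline std::vector<uint32_t> normalizeMask(const std::vector<uint64_t> &Raw,
                                           bool IsPD) {
  size_t Size = Raw.size();
  assert(Size == 8 || Size == 4 || Size == 2);
  std::vector<uint32_t> Indexes;
  Indexes.reserve(Size);
  for (uint64_t Element : Raw)
    Indexes.push_back(normalizeSelector(Element, IsPD));
  return Indexes;
}
```

Under those assumptions, the <i64 42, i64 0> selector from the test above normalizes to {1, 0} for the pd form, so anything matching on the constant sees the same value as it would for <i64 2, i64 0>.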

The patch itself LGTM.

-jim


