[patch] Two fixes to the vpermilvar optimization

Tue Apr 29 14:06:15 PDT 2014

On Apr 29, 2014, at 1:35 PM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:

> On 29 April 2014 14:28, Jim Grosbach <grosbach at apple.com> wrote:
>> Generally looks good. One question.
>> 
>> +      unsigned Size = C->getNumElements();
>> +      assert(Size == 8 || Size == 4 || Size == 2);
>> +      uint32_t Indexes[8];
>> +
>> +      // The intrinsics only read one or two bits, clear the rest.
>> 
>> I don’t understand this. Under what circumstances would these bits come in as non-zero?
> 
> Something like
> 
> declare <2 x double> @llvm.x86.avx.vpermilvar.pd(<2 x double>, <2 x i64>)
> define <2 x double> @test_vpermilvar_pd(<2 x double> %v) {
>  %a = tail call <2 x double> @llvm.x86.avx.vpermilvar.pd(<2 x double> %v, <2 x i64> <i64 42, i64 0>)
>  ret <2 x double> %a
> }
> 
> Using a 42 in here has well-defined behaviour according to the Intel manual.

OK. So this is cleaning up the ignored bits of the constant so the isel matchers will fire correctly? That makes sense. Can you elaborate in the comment a bit?
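
To make the masking concrete, here is a minimal standalone sketch of the idea. It is not the code from the patch: normalizeVPermilVarMask is a hypothetical helper that works on plain integers rather than LLVM constants. It assumes, per the Intel manual's description of VPERMILPS/VPERMILPD, that the ps form reads bits [1:0] of each control element and the pd form reads only bit 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (illustration only, not the patch itself): reduce each
// raw control value to the lane-local element index the instruction would use.
// Assumption: the ps form reads bits [1:0] of each control element, while the
// pd form reads only bit 1, so every other bit can be dropped.
std::vector<std::uint32_t>
normalizeVPermilVarMask(const std::vector<std::uint64_t> &Raw, bool IsPD) {
  const std::size_t Size = Raw.size();
  assert(Size == 8 || Size == 4 || Size == 2);

  std::vector<std::uint32_t> Indexes;
  Indexes.reserve(Size);
  for (std::uint64_t Control : Raw) {
    // Keep only the low two bits; nothing above them is ever read.
    std::uint32_t Index = static_cast<std::uint32_t>(Control & 0x3);
    // For pd only bit 1 matters, so shift it down to get an index of 0 or 1.
    if (IsPD)
      Index >>= 1;
    Indexes.push_back(Index);
  }
  return Indexes;
}
```

With the constant from the test case above, <i64 42, i64 0>, the pd form gives indexes {1, 0}: 42 is 0b101010, bit 1 is set, so the first result element comes from the high element of the source.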

The patch itself LGTM.

-jim