[patch] Two fixes to the vpermilvar optimization
Rafael Espíndola
rafael.espindola at gmail.com
Tue Apr 29 13:35:36 PDT 2014
On 29 April 2014 14:28, Jim Grosbach <grosbach at apple.com> wrote:
> Generally looks good. One question.
>
> + unsigned Size = C->getNumElements();
> + assert(Size == 8 || Size == 4 || Size == 2);
> + uint32_t Indexes[8];
> +
> + // The intrinsics only read one or two bits, clear the rest.
>
> I don’t understand this. Under what circumstances would these bits come in as non-zero?
Something like
declare <2 x double> @llvm.x86.avx.vpermilvar.pd(<2 x double>, <2 x i64>)

define <2 x double> @test_vpermilvar_pd(<2 x double> %v) {
  %a = tail call <2 x double> @llvm.x86.avx.vpermilvar.pd(<2 x double> %v, <2 x i64> <i64 42, i64 0>)
  ret <2 x double> %a
}
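Using 42 here has well-defined behaviour according to the Intel manual: the instruction reads only the selector bit(s) of each control element and ignores the rest, so the optimization has to mask them off too.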
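To make the masking concrete, here is a minimal standalone sketch (not the code from the patch). The helper name decodeVPermilVarControl and its std::vector interface are hypothetical, chosen only for illustration. It assumes the control layout as I read it in the Intel manual: the pd form selects with bit 1 of each 64-bit control element, the ps form selects with bits 1:0 of each 32-bit control element, and selection never crosses a 128-bit lane.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of LLVM: turn raw vpermilvar control values
// into shuffle indexes, discarding every bit the instruction does not read.
//   IsPD == true  : 64-bit elements, selector is bit 1 of each control value.
//   IsPD == false : 32-bit elements, selector is bits 1:0 of each control value.
std::vector<uint32_t>
decodeVPermilVarControl(const std::vector<uint64_t> &Control, bool IsPD) {
  const std::size_t Size = Control.size();
  // pd comes as <2 x i64> or <4 x i64>; ps as <4 x i32> or <8 x i32>.
  assert(IsPD ? (Size == 2 || Size == 4) : (Size == 4 || Size == 8));

  // Number of elements in one 128-bit lane.
  const std::size_t LaneSize = IsPD ? 2 : 4;

  std::vector<uint32_t> Indexes(Size);
  for (std::size_t I = 0; I < Size; ++I) {
    // Keep only the bits the hardware looks at.
    const uint64_t Sel = IsPD ? ((Control[I] >> 1) & 0x1) : (Control[I] & 0x3);
    // Offset by the start of the lane that element I lives in.
    const std::size_t LaneBase = (I / LaneSize) * LaneSize;
    Indexes[I] = static_cast<uint32_t>(LaneBase + Sel);
  }
  return Indexes;
}
```

With the control vector from the test case above, <i64 42, i64 0>, this yields the indexes {1, 0}: 42 is 0b101010, so bit 1 is set and the first result element comes from element 1 of %v, exactly as if the constant had been 2.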
Cheers,
Rafael