[PATCH] Add new indexed load/store intrinsics

Wed Apr 22 12:19:12 PDT 2015

In http://reviews.llvm.org/D9195#160109, @qcolombet wrote:

> Hi Hao,
>
> I share Ahmed’s concern and believe the scalarization should be done as part of the SDAG legalization.
>
> Cheers,
> -Quentin

Yeah, that makes sense, but it seems difficult to do the legalization for a backend that doesn't support it. So I think the problem is the intrinsic itself.

Also, I now think such new intrinsics may not be necessary for interleaved accesses.
For the interleaved load of <4 x double>:

  <4 x double> @llvm.indexed.load.v4f64 (double* <ptr>, <4 x i32> <index>, i32 <alignment>)

I think we can use

  <value> = load <4 x double>, <4 x double>* <ptr>
  <result> = shufflevector <4 x double> <value>, <4 x double> undef, <4 x i32> <i32 0, i32 2, i32 1, i32 3>

Even though it is more complex for a backend to match two IR instructions into one machine instruction, it is achievable. I think the disadvantage of intrinsics is that they are not easy to optimize. Also, I keep worrying that allowing an index vector with arbitrary elements is error prone.
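
To make the equivalence concrete, here is a rough Python model (not LLVM code). It assumes the proposed intrinsic simply gathers element <ptr>[index[i]] into lane i; the helper names indexed_load, vector_load and shufflevector are made up for this illustration and only mimic the IR operations above.

```python
def indexed_load(memory, base, index):
    """Assumed semantics of the proposed intrinsic: lane i reads memory[base + index[i]]."""
    return [memory[base + i] for i in index]


def vector_load(memory, base, width):
    """Model of a plain contiguous vector load of `width` elements."""
    return memory[base:base + width]


def shufflevector(v1, v2, mask):
    """Model of shufflevector: each mask entry selects a lane from v1 followed by v2.

    Passing None for v2 models an undef second operand.
    """
    second = list(v2) if v2 is not None else [None] * len(v1)
    combined = list(v1) + second
    return [combined[m] for m in mask]


# Six doubles in memory; the access starts at element 1.
memory = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
mask = [0, 2, 1, 3]

via_intrinsic = indexed_load(memory, 1, mask)
via_shuffle = shufflevector(vector_load(memory, 1, 4), None, mask)

print(via_intrinsic)
print(via_shuffle)
```

In this model both forms produce the same lanes. The shuffle form only ever touches the four contiguous elements that were loaded, whereas nothing in the model stops an index vector from reaching outside that window.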

I'll try to implement vectorization of interleaved memory accesses with vector load/store + shufflevector. If that is achievable, I think it is better than new intrinsics.
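
As a sketch of what I have in mind (again a Python model, assuming an interleave factor of 2 and 4 lanes per vector; the mask helper names are hypothetical): one wide load followed by one shuffle per member de-interleaves the data, and one shuffle followed by one wide store interleaves it again.

```python
def shufflevector(v1, v2, mask):
    """Model of shufflevector: each mask entry selects a lane from v1 followed by v2."""
    combined = list(v1) + list(v2)
    return [combined[m] for m in mask]


def deinterleave_masks(lanes, factor):
    """One constant mask per member k: lanes k, k + factor, k + 2 * factor, ..."""
    return [[i * factor + k for i in range(lanes)] for k in range(factor)]


def interleave_mask(lanes):
    """Constant mask that zips two `lanes`-wide vectors: 0, lanes, 1, lanes + 1, ..."""
    mask = []
    for i in range(lanes):
        mask += [i, lanes + i]
    return mask


# Wide load of A[0..7], holding the pairs (x0, y0), (x1, y1), ...
wide = [0.0, 100.0, 1.0, 101.0, 2.0, 102.0, 3.0, 103.0]
undef = [None] * len(wide)  # stands in for the undef second operand

even_mask, odd_mask = deinterleave_masks(4, 2)
xs = shufflevector(wide, undef, even_mask)  # the A[2*i] elements
ys = shufflevector(wide, undef, odd_mask)   # the A[2*i+1] elements

# Store side: interleave the two vectors, then do one wide store.
restored = shufflevector(xs, ys, interleave_mask(4))
```

All masks here are constants derived from the interleave factor, so a backend could recognize the load+shuffle or shuffle+store pattern by inspecting the mask.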

What do you think?

Thanks,
-Hao

http://reviews.llvm.org/D9195
