On 22 Apr 2015 8:19 pm, "Hao Liu" <Hao.Liu at arm.com> wrote:
> In http://reviews.llvm.org/D9195#160109, @qcolombet wrote:
> > Hi Hao,
> >
> > I share Ahmed’s concern and believe the scalarization should be done
> > as part of the SDAG legalization.
> >
> > Cheers,
> > -Quentin
> Yeah, that makes sense, but it seems difficult to do legalization for a
> backend that doesn't support it. So I think the problem is the intrinsic
> Also, I think such new intrinsics may not be necessary for interleaved
> accesses.
> For the interleaved load of <4 x double>, instead of
>   <4 x double> @llvm.indexed.load.v4f64 (double* <ptr>, <4 x i32> <index>, i32 <alignment>)
> I think we can use
>   <value> = load <4 x double>, <4 x double>* <ptr>
>   shufflevector <4 x double> <value>, <4 x double> undef, <4 x i32> <0, 2, 1, 3>
> Even though it is more complex for a backend to match two IR instructions
> into one instruction, it is achievable. I think the disadvantage of
> intrinsics is that they are not easy to optimize. Also, I worry that
> allowing an index vector with arbitrary elements is error prone.
> I'll try to implement the vectorization of interleaved memory accesses
> with vector load/store + shufflevector. If it is achievable, I think it is
> better than new intrinsics.
> What do you think?

If you can, I think that will be a better option.
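
To illustrate why the load + shufflevector form can stand in for the indexed
load in the <4 x double> case quoted above, here is a rough Python model. It
is only a sketch of the semantics, not LLVM code: the helper names
shufflevector and indexed_load are made up for this example, indexed_load
models what the proposed intrinsic appears to do (element i is read from
ptr + index[i]), and the memory contents are arbitrary sample values.

```python
def shufflevector(v1, v2, mask):
    """Toy model of shufflevector: mask entries below len(v1) select from v1,
    the remaining entries select from v2."""
    both = list(v1) + list(v2)
    return [both[i] for i in mask]


def indexed_load(memory, base, index):
    """Assumed semantics of the proposed indexed load (hypothetical helper):
    result element i is memory[base + index[i]]."""
    return [memory[base + i] for i in index]


# Two interleaved streams A and B stored as A0 B0 A1 B1 (sample values).
memory = [1.0, 100.0, 2.0, 200.0]
mask = [0, 2, 1, 3]

# Form 1: one indexed load with an explicit index vector.
via_intrinsic = indexed_load(memory, 0, mask)

# Form 2: one wide contiguous load followed by a shuffle with the same mask.
# The second shuffle operand is undef in the IR; None stands in for it here.
wide = memory[0:4]
via_shuffle = shufflevector(wide, [None] * 4, mask)

# Both forms de-interleave the data into A0 A1 B0 B1.
assert via_intrinsic == via_shuffle
```

With a constant mask the access pattern is visible in the IR itself, so a
backend that wants a single interleaved-load instruction would have to match
the load/shuffle pair, as Hao describes.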

