[PATCH] Add new indexed load/store intrinsics.

Thu Apr 23 00:28:44 PDT 2015

On 23 Apr 2015 3:38 am, "Hao Liu" <Hao.Liu at arm.com> wrote:
>
> Hi Renato and Ahmed,
>
> I agree with your comments.
>
> But I want to change the plan. Because I think maybe there is no need to
> use intrinsics.
> For the interleaved load about <4 x double>
>
>   <4 x double> @llvm.indexed.load.v4f64 (double* <ptr>, <4 x i32> <index>, i32 <alignment>)
>
> I think we can use two common IRs:
>
>   <value> = load <4 x double>, <4 x double>* <ptr>
>   shufflevector <4 x double> <value>, <4 x double> undef, <4 x i32> <0, 2, 1, 3>
>
> Even though it is more complex for a backend to match two IRs, it is
> achievable. I think the disadvantage of intrinsics is not easy to be
> optimized.
>
> I want to implement the loop vectorization on interleaved memory access
> with vectorload/vectorstore+shufflevector.
>
> What do you think?

I agree. If it's possible to represent it in plain IR, I see no reason to
not do it.
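
For concreteness, here is a minimal sketch of the plain-IR form proposed
above, written out as a complete function. The function name, the value
names and the `align 8` are assumptions made for illustration only; the
typed-pointer syntax matches the IR quoted in this thread. With the mask
<0, 2, 1, 3>, an interleaved vector [a0, b0, a1, b1] in memory comes out
as [a0, a1, b0, b1].

```llvm
; Hypothetical example: de-interleave two streams of doubles.
; Memory layout at %ptr (assumed): a0, b0, a1, b1
define <4 x double> @deinterleave_v4f64(<4 x double>* %ptr) {
entry:
  ; One wide load covering both interleaved streams.
  ; align 8 is an assumption (natural alignment of a double element).
  %wide = load <4 x double>, <4 x double>* %ptr, align 8
  ; Pick lanes 0 and 2 (the a's), then lanes 1 and 3 (the b's).
  %shuf = shufflevector <4 x double> %wide, <4 x double> undef,
                        <4 x i32> <i32 0, i32 2, i32 1, i32 3>
  ; Result: a0, a1, b0, b1
  ret <4 x double> %shuf
}
```

A backend that wants to emit an interleaved load would have to match
this load + shufflevector pair, so the pair needs to reach instruction
selection intact.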

I'll be particularly interested in how other passes scramble the accesses,
making the pattern irrecoverable. But I guess we will find that out as you
progress with the examples and tests.

Cheers,
Renato