[PATCH] Add new indexed load/store intrinsics
Hao Liu
Hao.Liu at arm.com
Wed Apr 22 12:19:12 PDT 2015
In http://reviews.llvm.org/D9195#160109, @qcolombet wrote:
> Hi Hao,
>
> I share Ahmed’s concern and believe the scalarization should be done as part of the SDAG legalization.
>
> Cheers,
> -Quentin
Yeah, that makes sense, but it seems difficult to do the legalization for a backend that doesn't support the operation. So I think the problem is the intrinsic itself.
Also, I now think such new intrinsics may not be necessary for interleaved accesses.
For an interleaved load of <4 x double>, instead of
<4 x double> @llvm.indexed.load.v4f64 (double* <ptr>, <4 x i32> <index>, i32 <alignment>)
I think we can use
<value> = load <4 x double>, <4 x double>* <ptr>
<result> = shufflevector <4 x double> <value>, <4 x double> undef, <4 x i32> <i32 0, i32 2, i32 1, i32 3>
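To make the equivalence concrete, here is a small Python model of the two forms. It is only a sketch, not LLVM code: the helper names are made up for illustration, and it assumes the proposed intrinsic makes lane i read the element at ptr + index[i], which is my reading of the proposal rather than a settled definition.

```python
def shufflevector(v1, v2, mask):
    """Model of shufflevector: each mask entry selects a lane from v1 followed by v2."""
    both = list(v1) + list(v2)
    return [both[i] for i in mask]


def indexed_load(memory, base, index):
    """Hypothetical model of the proposed indexed load (assumed semantics):
    lane i reads memory[base + index[i]]."""
    return [memory[base + i] for i in index]


# Interleaved data in memory: A0, B0, A1, B1.
memory = [10.0, 20.0, 11.0, 21.0]
index = [0, 2, 1, 3]

# Form 1: one wide load followed by a shuffle with a constant mask.
wide = memory[0:4]
undef = [None] * 4  # stands in for the undef second operand; never selected here
via_shuffle = shufflevector(wide, undef, index)

# Form 2: the indexed load with the same index vector.
via_intrinsic = indexed_load(memory, 0, index)

# Both forms produce A0, A1, B0, B1.
assert via_shuffle == via_intrinsic == [10.0, 11.0, 20.0, 21.0]
```

With a constant mask whose entries all fall inside the loaded vector, the wide load plus shuffle carries the same information as the index vector, but in a form the existing optimizations already understand.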
Even though it is more complex for a backend to match two IR instructions into one machine instruction, it is achievable. I think the disadvantage of intrinsics is that they are not easy to optimize. I also worry that allowing an index vector with arbitrary elements is error-prone.
I'll try to implement the vectorization of interleaved memory accesses with a vector load/store plus shufflevector. If that works, I think it is better than adding new intrinsics.
What do you think?
Thanks,
-Hao
http://reviews.llvm.org/D9195