[PATCH] Add new indexed load/store intrinsics
Hao.Liu at arm.com
Wed Apr 22 06:30:19 PDT 2015
Hi delena, rengolin, ab, qcolombet,
Following the comments on D8820 (Teach Loop Vectorizer about interleaved data accesses), I have split that patch. This is the first part, which adds support for two new intrinsics:
<4 x double> @llvm.indexed.load.v4f64 (double* <ptr>, <4 x i32> <index>, i32 <alignment>)
void @llvm.indexed.store.v4f64 (<4 x double> <value>, double* <ptr>, <4 x i32> <index>, i32 <alignment>)
These intrinsics can express interleaved load/store, strided load/store, etc.
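To make the intended behaviour concrete, here is a minimal scalar model of the two intrinsics in C++. It is only a sketch under assumptions not stated in this mail: each index is taken to be an element offset from <ptr>, so lane i accesses ptr[index[i]], and the <alignment> operand is ignored. The names indexedLoad, indexedStore and stridedIndices are hypothetical helpers for illustration, not part of the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of llvm.indexed.load.
// Assumption: index[lane] is an element offset from ptr.
template <typename T, std::size_t N>
std::array<T, N> indexedLoad(const T *ptr,
                             const std::array<std::int32_t, N> &index) {
  assert(ptr != nullptr);
  std::array<T, N> result{};
  for (std::size_t lane = 0; lane < N; ++lane)
    result[lane] = ptr[index[lane]];
  return result;
}

// Hypothetical scalar model of llvm.indexed.store (same assumption).
template <typename T, std::size_t N>
void indexedStore(const std::array<T, N> &value, T *ptr,
                  const std::array<std::int32_t, N> &index) {
  assert(ptr != nullptr);
  for (std::size_t lane = 0; lane < N; ++lane)
    ptr[index[lane]] = value[lane];
}

// Builds the index vector for a strided access:
// start, start + stride, start + 2 * stride, ...
template <std::size_t N>
std::array<std::int32_t, N> stridedIndices(std::int32_t start,
                                           std::int32_t stride) {
  std::array<std::int32_t, N> index{};
  for (std::size_t lane = 0; lane < N; ++lane)
    index[lane] = start + static_cast<std::int32_t>(lane) * stride;
  return index;
}
```

Under this model, an interleaved access with factor 2 is just the pair of strided accesses stridedIndices<4>(0, 2) and stridedIndices<4>(1, 2) over the same base pointer.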
I am a bit worried about the name "indexed". The name "indexed load/store" is already used for load/store with indexed addressing modes (pre-increment, post-increment, pre-dec...). There is also masked load/store for predicated load/store. I can't find a better name for load/store with indices. If you think this name is confusing, I can change it. How about indexed.gather and indexed.scatter, as Elena used in D7665?
The implementation is similar to that of masked load/store. This patch mainly does the following:
(1) Add two new intrinsics and update LangRef.rst.
(2) Add code generation for the new intrinsics. Add AArch64 backend codegen for interleaved load/store, which is a subset of indexed load/store.
(3) Teach CodeGenPrepare to scalarize unsupported indexed load/store.
There is no code in the legalization phase, because the AArch64 backend cannot support any indexed load/store other than interleaved load/store. Even if I added such code, I could not test it. In any case, CodeGenPrepare handles the unsupported cases.
There are TODOs in CodeGenPrepare: some indexed load/store can be transformed into "a VectorLoad + a ShuffleVector" or "a ShuffleVector + a VectorStore".
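As a rough illustration of the load half of that TODO, the sketch below models an indexed load as one contiguous vector load followed by a single-input shuffle. It is written in plain C++ rather than IR and rests on assumptions of mine, not on the patch: indices are element offsets, they are known constants, and they all fall inside a contiguous window of W elements starting at the base pointer. The names vectorLoad, shuffle and indexedLoadViaShuffle are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Models a contiguous vector load of W elements starting at ptr.
template <typename T, std::size_t W>
std::array<T, W> vectorLoad(const T *ptr) {
  assert(ptr != nullptr);
  std::array<T, W> wide{};
  for (std::size_t i = 0; i < W; ++i)
    wide[i] = ptr[i];
  return wide;
}

// Models a single-input shufflevector: lane i of the result is wide[mask[i]].
template <typename T, std::size_t W, std::size_t N>
std::array<T, N> shuffle(const std::array<T, W> &wide,
                         const std::array<std::int32_t, N> &mask) {
  std::array<T, N> result{};
  for (std::size_t lane = 0; lane < N; ++lane) {
    // The rewrite is only valid when every index lies inside the window.
    assert(mask[lane] >= 0 && static_cast<std::size_t>(mask[lane]) < W);
    result[lane] = wide[static_cast<std::size_t>(mask[lane])];
  }
  return result;
}

// Indexed load expressed as "a VectorLoad + a ShuffleVector".
// Assumption: all indices are in [0, W), so the index vector can be
// reused directly as the shuffle mask.
template <typename T, std::size_t W, std::size_t N>
std::array<T, N> indexedLoadViaShuffle(
    const T *ptr, const std::array<std::int32_t, N> &index) {
  return shuffle<T, W, N>(vectorLoad<T, W>(ptr), index);
}
```

The store direction would be the mirror image: shuffle the value into its final positions, then issue one contiguous vector store. That only works when the indices cover every element of the window, otherwise the store would clobber memory the original indexed store left untouched.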