[PATCH] [ARM64/AArch64] Port NEON post-increment load/store with 2/3/4 vectors to ARM64 backend

Hao Liu Hao.Liu at arm.com
Mon May 5 01:18:39 PDT 2014


Hi t.p.northover,

Hi Tim and other reviewers,

This patch ports all NEON post-increment loads/stores with 2/3/4 vectors to the ARM64 backend, including the following post-increment instructions:
	LD1 - Load multiple 1-element structures to two, three or four consecutive registers
	LD2 - Load multiple 2-element structures to two consecutive registers
	LD3 - Load multiple 3-element structures to three consecutive registers
	LD4 - Load multiple 4-element structures to four consecutive registers
	LD2 - Load single 2-element structure to one lane of two consecutive registers
	LD3 - Load single 3-element structure to one lane of three consecutive registers
	LD4 - Load single 4-element structure to one lane of four consecutive registers
	LD2R - Load single 2-element structure and replicate to all lanes of two registers
	LD3R - Load single 3-element structure and replicate to all lanes of three registers
	LD4R - Load single 4-element structure and replicate to all lanes of four registers 
	ST1 - Store multiple 1-element structures from two, three or four consecutive registers
	ST2 - Store multiple 2-element structures from two consecutive registers
	ST3 - Store multiple 3-element structures from three consecutive registers
	ST4 - Store multiple 4-element structures from four consecutive registers
	ST2 - Store single 2-element structure from one lane of two consecutive registers
	ST3 - Store single 3-element structure from one lane of three consecutive registers
	ST4 - Store single 4-element structure from one lane of four consecutive registers
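
For readers less familiar with the post-increment forms, here is a minimal scalar sketch of what something like "ld2 { v0.4s, v1.4s }, [x0], #32" does: it de-interleaves eight 32-bit elements into two registers and then writes the advanced address back to the base register. This is only an illustration and not code from the patch; the names Ld2Result and ld2_post_4s are made up for the example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical model of the two destination registers plus the
// written-back base pointer. Not part of the ARM64 backend.
struct Ld2Result {
    std::array<std::int32_t, 4> v0;  // first element of each 2-element structure
    std::array<std::int32_t, 4> v1;  // second element of each 2-element structure
    const std::int32_t *next;        // base pointer after the post-increment
};

// Scalar model of a post-increment LD2 on 4 x 32-bit lanes.
inline Ld2Result ld2_post_4s(const std::int32_t *base) {
    Ld2Result r{};
    for (std::size_t lane = 0; lane < 4; ++lane) {
        r.v0[lane] = base[2 * lane];      // de-interleave: even elements
        r.v1[lane] = base[2 * lane + 1];  // de-interleave: odd elements
    }
    // Immediate form: advance by the bytes transferred,
    // 2 registers * 4 lanes * 4 bytes = 32 bytes = 8 elements.
    r.next = base + 8;
    return r;
}
```

The register-offset form differs only in that the base is advanced by the value of a general-purpose register instead of the fixed transfer size.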

BTW, I think the implementation in ARM64DAGToDAGISel::Select has some redundancy. For every intrinsic/ISD node, it compares the value type against 12 types from v16i8 to v2f64 and calls the corresponding select function, such as SelectLoad or SelectStore. If we called SelectLoad/SelectStore directly and compared the types inside them, we could reduce some code. Anyway, this is only about code structure and has nothing to do with correctness, so I haven't modified it. I call SelectPostLoad in the same way SelectLoad is called.
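
To make the restructuring idea concrete, here is a rough standalone sketch of moving the type comparison into a single helper, so that each call site passes the type through instead of repeating a 12-way if/else chain. Everything in it is hypothetical: VecTy stands in for the MVT values, and arrangement and selectPostLoadName are made-up names. The returned strings are illustrative labels only, not real ARM64 opcode enumerators.

```cpp
#include <string>

// Hypothetical stand-in for the 12 NEON vector value types.
enum class VecTy {
    v8i8, v16i8, v4i16, v8i16, v2i32, v4i32,
    v1i64, v2i64, v2f32, v4f32, v1f64, v2f64
};

// The type comparison lives in one place. Integer and floating-point
// types of the same shape share an arrangement.
inline const char *arrangement(VecTy vt) {
    switch (vt) {
    case VecTy::v8i8:  return "8b";
    case VecTy::v16i8: return "16b";
    case VecTy::v4i16: return "4h";
    case VecTy::v8i16: return "8h";
    case VecTy::v2i32: case VecTy::v2f32: return "2s";
    case VecTy::v4i32: case VecTy::v4f32: return "4s";
    case VecTy::v1i64: case VecTy::v1f64: return "1d";
    case VecTy::v2i64: case VecTy::v2f64: return "2d";
    }
    return "";  // unreachable for valid enumerators
}

// A call site only names the instruction family; the helper resolves
// the type. The result is an illustrative label, not a real opcode.
inline std::string selectPostLoadName(const std::string &mnemonic, VecTy vt) {
    return mnemonic + "." + arrangement(vt);
}
```

With this shape, a call site would be a single call such as selectPostLoadName("ld2", VT) rather than one branch per type. The real selection code would still need to handle any per-type special cases inside the helper.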

Code review, please.

Thanks,
-Hao

http://reviews.llvm.org/D3605

Files:
  lib/Target/ARM64/ARM64ISelDAGToDAG.cpp
  lib/Target/ARM64/ARM64ISelLowering.cpp
  lib/Target/ARM64/ARM64ISelLowering.h
  test/CodeGen/ARM64/indexed-vector-ldst.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D3605.9064.patch
Type: text/x-patch
Size: 276211 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140505/b7531b0b/attachment.bin>

