[PATCH][AArch64] implement aarch64 neon load/store instructions class AdvSIMD (lselem)
Hao Liu
Hao.Liu at arm.com
Mon Oct 7 08:33:36 PDT 2013
Sorry, forgot the patch.
-----Original Message-----
From: Hao Liu [mailto:Hao.Liu at arm.com]
Sent: Monday, October 07, 2013 4:33 PM
To: 'Tim Northover'
Cc: llvm-commits; cfe-commits at cs.uiuc.edu
Subject: RE: [PATCH][AArch64] implement aarch64 neon load/store instructions
class AdvSIMD (lselem)
Hi Tim,
I've attached the patch, which shows the DTriple/QTriple problem. The problem
is not in AArch64ISelDAGToDAG.cpp; it's in AArch64ISelLowering.cpp, in two
functions: getRegClassFor(MVT VT) and findRepresentativeClass(MVT VT).
v4i64/v8i64 differ from the other MVT types in that they are not registered
with a call such as addRegisterClass(MVT::v2i64, &AArch64::FPR64RegClass), so
they are illegal. getRegClassFor is used to map v4i64/v8i64 to a RegClass
without making them legal. findRepresentativeClass is used to get the cost of
the super registers.
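A minimal standalone sketch of that split between "legal type" and "type that has a register class" may help. It is a toy model, not LLVM code: types and classes are plain strings, and the class names "Tuple256" and "Tuple512" are hypothetical placeholders, since the thread does not say which classes the patch returns.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model, not LLVM code: type and class names are plain strings.
class ToyLowering {
  std::map<std::string, std::string> LegalTypes; // filled by addRegisterClass

public:
  // Registering a class for a type is what makes the type legal.
  void addRegisterClass(const std::string &VT, const std::string &RC) {
    LegalTypes[VT] = RC;
  }

  bool isTypeLegal(const std::string &VT) const {
    return LegalTypes.count(VT) != 0;
  }

  // Class lookup with special cases for wide types that stay illegal.
  // "Tuple256" and "Tuple512" are hypothetical class names.
  std::string getRegClassFor(const std::string &VT) const {
    if (VT == "v4i64")
      return "Tuple256";
    if (VT == "v8i64")
      return "Tuple512";
    std::map<std::string, std::string>::const_iterator It =
        LegalTypes.find(VT);
    assert(It != LegalTypes.end() && "no register class for this type");
    return It->second;
  }
};
```

In this model a lookup for "v4i64" succeeds even though isTypeLegal("v4i64") is false. The lookup is keyed only by type, so a register class with no type of its own cannot be reached this way.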
The problem with DTriple/QTriple is that there is no v3i64/v6i64, so we can't
use those two functions in AArch64ISelLowering.cpp. So we use MVT::Untyped
instead. We have to do something about MVT::Untyped, or the test will fail.
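The lane arithmetic behind the missing v3i64/v6i64 can be sketched in a few lines of standalone C++. This is not LLVM code. The function names are hypothetical, and the sketch assumes that only power-of-two i64 vector types (v1i64, v2i64, v4i64, v8i64) are available, which matches the types named in this thread.

```cpp
#include <cassert>

// Number of i64 lanes needed to cover NumVecs D (64-bit) or Q (128-bit)
// registers exactly.
unsigned exactI64Lanes(unsigned NumVecs, bool Is64BitVector) {
  assert(NumVecs >= 1 && NumVecs <= 4 && "tuples of one to four registers");
  return NumVecs * (Is64BitVector ? 1u : 2u);
}

// Assumption for this sketch: only power-of-two i64 vector types exist.
bool hasI64VectorType(unsigned Lanes) {
  return Lanes != 0 && (Lanes & (Lanes - 1)) == 0;
}

// Smallest available lane count that can hold the whole register tuple.
unsigned paddedI64Lanes(unsigned NumVecs, bool Is64BitVector) {
  unsigned Lanes = exactI64Lanes(NumVecs, Is64BitVector);
  while (!hasI64VectorType(Lanes))
    ++Lanes;
  return Lanes;
}
```

For three registers the exact counts are 3 and 6 lanes, neither of which has a type. Padding up gives 4 and 8, i.e. v4i64 for D registers and v8i64 for Q registers.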
I added two tests to show the problem: ld.ll and st.ll.
In SelectVLD(), I don't use MVT::Untyped. I use v4i64/v8i64 for ld3.
+  else if (NumVecs == 3) {
+//    ResTys.push_back(MVT::Untyped);
+    ResTys.push_back(EVT::getVectorVT(*CurDAG->getContext(),
+                                      MVT::i64, is64BitVector ? 4 : 8));
+  }
As a result, all tests including ld3 will be matched.
$ ./Debug+Asserts/bin/llc -march=aarch64 -mattr=+neon < ld.ll
In SelectVST(), I use MVT::Untyped to generate DTriple/QTriple directly, as
you suggested. But st.ll fails when it compiles st3.
$ ./Debug+Asserts/bin/llc -march=aarch64 -mattr=+neon < st.ll
...
Running pass 'AArch64 Instruction Selection' on function '@test_vst3q_s8'
Segmentation fault (core dumped)
So I think the problem is that MVT::Untyped is not an illegal type, so we
can't handle it the same way as v4i64/v8i64, and the test fails when we use
it.
BTW, if we make v4i64/v8i64 legal by calling addRegisterClass(), there will
also be problems in copyPhysReg.
So I think the easiest way to solve this is either to do something about
MVT::Untyped or just to use v4i64/v8i64 in place of MVT::Untyped (though I
don't know whether that is wasteful).
Do you know how to solve this, or did I do something wrong?
Thanks,
-Hao
-----Original Message-----
From: Tim Northover [mailto:t.p.northover at gmail.com]
Sent: Friday, October 04, 2013 4:45 PM
To: Hao Liu
Cc: llvm-commits; cfe-commits at cs.uiuc.edu
Subject: Re: [PATCH][AArch64] implement aarch64 neon load/store instructions
class AdvSIMD (lselem)
Hi Hao,
> The second argument of DTriple should be "v3i64". As there is no
> v3i64, I have to use untyped. But then I can't get the DTripleRegClass
> in getRegClassFor(MVT VT) of AArch64ISelLowering.cpp.
Why do you think you need this? Can't you just create an untyped
REG_SEQUENCE for them? The node already seems to encode the RegisterClass
directly. I'd expect something like this to work:
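SDNode *AArch64DAGToDAGISel::createDRegTripleNode(SDValue V0, SDValue V1,
                                                  SDValue V2) {
  SDLoc dl(V0.getNode());
  SDValue RegClass =
      CurDAG->getTargetConstant(AArch64::DTripleRegClassID, MVT::i32);
  // The qsub_N index names are placeholders; use whichever sub-register
  // indices the DTriple class actually defines.
  SDValue SubReg0 = CurDAG->getTargetConstant(AArch64::qsub_0, MVT::i32);
  SDValue SubReg1 = CurDAG->getTargetConstant(AArch64::qsub_1, MVT::i32);
  SDValue SubReg2 = CurDAG->getTargetConstant(AArch64::qsub_2, MVT::i32);
  // REG_SEQUENCE operands: the register class, then (value, subreg) pairs.
  const SDValue Ops[] = { RegClass, V0, SubReg0, V1, SubReg1, V2, SubReg2 };
  return CurDAG->getMachineNode(TargetOpcode::REG_SEQUENCE, dl,
                                MVT::Untyped, Ops);
}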
Hopefully it will use the hint you give it there rather than asking
getRegClassFor for a register class. I'm sure we can get it working somehow
without adding v3i64 (or v3v8i8 or whatever). Send me a patch if you get
stuck and I'll see what I can do with it.
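The operand layout a REG_SEQUENCE node expects (the register class first, then one value and sub-register index pair per input) can be sketched standalone. This is a toy model that uses strings instead of SDValues. The function name and the string labels are hypothetical, not LLVM identifiers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy operand list for a REG_SEQUENCE-style node: the register class first,
// then one (value, sub-register index) pair per input register.
std::vector<std::string>
buildRegSequenceOps(const std::string &RegClass,
                    const std::vector<std::string> &Values,
                    const std::vector<std::string> &SubRegs) {
  assert(Values.size() == SubRegs.size() && "one sub-register per value");
  std::vector<std::string> Ops;
  Ops.reserve(1 + 2 * Values.size());
  Ops.push_back(RegClass);
  for (std::size_t I = 0; I < Values.size(); ++I) {
    Ops.push_back(Values[I]);  // the register being placed
    Ops.push_back(SubRegs[I]); // where it goes in the super-register
  }
  return Ops;
}
```

For a three-register tuple this yields seven operands: the class plus three pairs. Because the class is carried in the operand list, the result type of the node does not have to identify it.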
Cheers.
Tim.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Oct7.llvm
Type: application/octet-stream
Size: 154012 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20131007/f8a45fe7/attachment.obj>