[PATCHSET][ARM64] Implement big endian NEON

Thu May 1 09:02:30 PDT 2014

Hi Tim,

Attached are updated patches up to and including the tests for the bitconverts. I'll address your comments on the call/return logic in the coming days, but you seem to have pretty much approved all the patches up to there, so it seems to make sense to make hay while the sun is shining / I'm in the office.

I made all the changes I indicated below, and added a FIXME for byval on fundamental types as we discussed on IRC.

Cheers,
James

> -----Original Message-----
> From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-
> bounces at cs.uiuc.edu] On Behalf Of James Molloy
> Sent: 01 May 2014 16:24
> To: Tim Northover
> Cc: llvm-commits
> Subject: RE: [PATCHSET][ARM64] Implement big endian NEON
>
> Hi Tim,
>
> Thanks for reviewing!
>
> > > 1.       Creation of a targetlowering hook to control if bitcasts can be
> > > elided, and wiring of this up for ARM64.
> >
> > I don't think this patch is needed at all. It only affects
> > load/bitcast & bitcast/store pairs and the elision is correct there by
> > definition of bitcast (i.e. bitcast == store/bitcast*/load in the
> > language reference).
>
> I agree, as I mentioned in IRC. This one was probably me being incorrect
> having
> not got my head fully around the problem.
>
> > > 2.       Fixing of big endian argument passing (non-NEON specific).
> >
> > I'm unconvinced by the byval changes here.
>
> I must admit that the byval part of this patch was ported straight over from
> AArch64 without my thinking too much about its validity. I agree with you
> and will remove the byval alignment part.
>
> > > 3.       [trivial] disconnection of bitcast combine in ISelLowering
> >
> > I'll be particularly interested to see the test for this one. It feels
> > like there ought to be some simple commutation relationship between
> > bitcast & extract, but I can't see it.
>
> This one has no testcase. I looked at it and thought it looked dodgy, so I
> disconnected it. Now that the whole thing is working, I've been able to run
> my tests without this optimization disabled and there are no failures, so I'm
> feeling a bit more confident about it and am happy to reenable it until we
> get a good reason not to.
>
> > > 5.       The Big One ™. Implementation of bitcast for ARM64BE. A good
> > > explanation is available in the patch commit log.
> >
> > As you mentioned, the EXT patterns are iffy (the immediate should be a
> > byte index, not a bit index).
>
> Yep, fixed locally.
>
> > Do we ever get an EXTLOAD of a vector type like this during
> > call/return? I hope not, and have lots more worries about the code if
> > we do.
>
>  I hope not too. I can't think of a case where we would, but I can add an
> assert to
> make certain we catch it if we do. I think I've tried pretty much all
> permutations
> of vector argument types now and not come across anything...
>
> > I think this patch as a whole might be simplified if we drag
> > ARM64CallingConvention.td into the fray. Some rule like
> >
> >     CCIfBigEndian<CCIfType<[...], CCBitConvertToType<f128>>>
>
> Interesting - I'll have a fiddle and get back to you.
>
> > You might want to check variadic functions too. va_arg has a
> > reasonable chance of being correct without any changes (it's
> > implemented in Clang for AAPCS), but saveVarArgRegisters looks wrong
> > to me: "DAG.getCopyFromReg(..., MVT::v2i64)".
>
> Yep, I have a trivial fix for that locally too (s/v2i64/f128/).
>
> Cheers,
>
> James
>
> > -----Original Message-----
> > From: Tim Northover [mailto:t.p.northover at gmail.com]
> > Sent: 01 May 2014 15:44
> > To: James Molloy
> > Cc: llvm-commits
> > Subject: Re: [PATCHSET][ARM64] Implement big endian NEON
> >
> > Hi James,
> >
> > Thanks for working on this.
> >
> > > 1.       Creation of a targetlowering hook to control if bitcasts can be
> > > elided, and wiring of this up for ARM64.
> >
> > I don't think this patch is needed at all. It only affects
> > load/bitcast & bitcast/store pairs and the elision is correct there by
> > definition of bitcast (i.e. bitcast == store/bitcast*/load in the
> > language reference).
> >
> > > 2.       Fixing of big endian argument passing (non-NEON specific).
> >
> > I'm unconvinced by the byval changes here. I don't think Clang uses
> > byval on ARM64 so they should have no effect on ABI compatibility, but
> > consider this function:
> >
> >     define i32 @foo([8 x i64], { i32 }* byval %foo) {
> >       %eltptr = getelementptr { i32 }* %foo, i32 0, i32 0
> >       %val = load i32* %eltptr
> >       ret i32 %val
> >     }
> >
> > I think that with a natural reading of the AAPCS in LLVM terms, the
> > value returned should be loaded from [sp] (via clause C.13).
> >
> > > 3.       [trivial] disconnection of bitcast combine in ISelLowering
> >
> > I'll be particularly interested to see the test for this one. It feels
> > like there ought to be some simple commutation relationship between
> > bitcast & extract, but I can't see it.
> >
> > > 4.       [mechanical] Predication of LDR/STR (on vectors) to be
> > > little-endian-only.
> >
> > This looks fine modulo tests.
> >
> > > 5.       The Big One ™. Implementation of bitcast for ARM64BE. A good
> > > explanation is available in the patch commit log.
> >
> > As you mentioned, the EXT patterns are iffy (the immediate should be a
> > byte index, not a bit index).
> >
> > > 6.       Implementation of call/ret lowering and formal arguments lowering.
> >
> > +// If VT is a 64-bit vector, return f64. If VT is a 128-bit vector,
> > return f128.
> > +// If VT is neither of these, return VT unchanged.
> >
> > Doxygen comments here couldn't hurt.
> >
> > +        ArgValue = DAG.getExtLoad(ISD::EXTLOAD, DL, LoadVT, Chain, FIN,
> > +                                  MachinePointerInfo::getFixedStack(FI),
> > +                                  VA.getLocVT(),
> > +                                  false, false, false, 0);
> >
> > Do we ever get an EXTLOAD of a vector type like this during
> > call/return? I hope not, and have lots more worries about the code if
> > we do.
> >
> > I think this patch as a whole might be simplified if we drag
> > ARM64CallingConvention.td into the fray. Some rule like
> >
> >     CCIfBigEndian<CCIfType<[...], CCBitConvertToType<f128>>>
> >
> > near the top (you'd have to define CCIfBigEndian using plain CCIf, I
> > think; you could get away without it semantically, but the LE DAGs
> > would be nastier).
> >
> > After that you could make the BITCASTs unconditional (trivial ones get
> > elided immediately: SelectionDAG.cpp:2750).
> >
> > > The only thing missing at the moment except the testcases is fast-isel
> > > support. That’ll come in a later commit.
> >
> > You might want to check variadic functions too. va_arg has a
> > reasonable chance of being correct without any changes (it's
> > implemented in Clang for AAPCS), but saveVarArgRegisters looks wrong
> > to me: "DAG.getCopyFromReg(..., MVT::v2i64)".
> >
> > Cheers.
> >
> > Tim.
>
>
> -- IMPORTANT NOTICE: The contents of this email and any attachments are
> confidential and may also be privileged. If you are not the intended recipient,
> please notify the sender immediately and do not disclose the contents to any
> other person, use it for any purpose, or store or copy the information in any
> medium.  Thank you.
>
> ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ,
> Registered in England & Wales, Company No:  2557590
> ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ,
> Registered in England & Wales, Company No:  2548782
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium.  Thank you.

ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No:  2557590
ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No:  2548782
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-ARM64-BE-Make-big-endian-scalar-argument-passing-wor.patch
Type: application/octet-stream
Size: 11822 bytes
Desc: 0001-ARM64-BE-Make-big-endian-scalar-argument-passing-wor.patch
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140501/0aa3ffe5/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-ARM64-BE-Predicate-VLDR-VSTR-for-vectors-as-little-e.patch
Type: application/octet-stream
Size: 18793 bytes
Desc: 0002-ARM64-BE-Predicate-VLDR-VSTR-for-vectors-as-little-e.patch
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140501/0aa3ffe5/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-ARM64-BE-Implement-the-crazy-bitcast-handling-for-bi.patch
Type: application/octet-stream
Size: 24539 bytes
Desc: 0003-ARM64-BE-Implement-the-crazy-bitcast-handling-for-bi.patch
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140501/0aa3ffe5/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-ARM64-BE-Add-a-missing-bitconvert-pattern-v16i8-coul.patch
Type: application/octet-stream
Size: 1717 bytes
Desc: 0004-ARM64-BE-Add-a-missing-bitconvert-pattern-v16i8-coul.patch
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140501/0aa3ffe5/attachment-0003.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0005-ARM64-BE-Add-tests-for-vector-bitcasts.patch
Type: application/octet-stream
Size: 32195 bytes
Desc: 0005-ARM64-BE-Add-tests-for-vector-bitcasts.patch
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140501/0aa3ffe5/attachment-0004.obj>