[LLVMdev] NEON intrinsics preventing redundant load optimization?
Simon Taylor
simontaylor1 at ntlworld.com
Tue Jan 13 08:05:11 PST 2015
> Ok, I raised the priority back to Normal in the bug, since the workaround wasn't good enough.
Sorry, looks like a false alarm: I was missing a set of brackets in my macro definitions for vld/vst, and they weren’t rebuilt properly after being fixed. With the fix in place, both GCC and LLVM generate correct code, and LLVM correctly removes all the redundant loads and stores.
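As a purely illustrative sketch of this class of bug (the macro names below are hypothetical and a plain float pointer stands in for the NEON load/store; these are not the actual vld/vst macros), a missing set of brackets changes what the expansion means:

```cpp
#include <cassert>

// Hypothetical macros for illustration only.
// Without brackets, ELEM_BAD(p, i) expands to `*p + i`, which parses as
// `(*p) + i`: it reads element 0 and adds i to the value.
#define ELEM_BAD(p, i)  *p + i
// With brackets, the offset is applied to the pointer before dereferencing.
#define ELEM_GOOD(p, i) (*((p) + (i)))

inline float second_bad(const float* data)  { return ELEM_BAD(data, 1); }
inline float second_good(const float* data) { return ELEM_GOOD(data, 1); }
```

Both versions compile cleanly, so this kind of mistake only shows up as wrong results at run time.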
I’ve still scared myself enough about aliasing, and GCC still generates some additional temporaries, so I’m going to use a lazy-evaluation template approach instead. That loses the natural support for chained multiplies that simple return-by-value offers, but I have more confidence in its correctness than in the workaround I previously suggested.
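A minimal sketch of what such a lazy-evaluation approach could look like, assuming a 2x2 matrix with a plain float array standing in for the NEON-backed storage (Mat2 and MulExpr are hypothetical names, not taken from the real code). operator* only records its operands, and the arithmetic happens when the expression is assigned, so the result is written straight into the destination:

```cpp
#include <cassert>

struct Mat2;

// Lazy product: holds references to the operands and does no arithmetic
// until it is assigned to a Mat2.
struct MulExpr {
    const Mat2& lhs;
    const Mat2& rhs;
};

struct Mat2 {
    float m[4];  // row-major 2x2; stand-in for NEON-backed storage

    Mat2() : m{0, 0, 0, 0} {}
    Mat2(float a, float b, float c, float d) : m{a, b, c, d} {}

    // Evaluates the product directly into this object's storage.
    Mat2& operator=(const MulExpr& e);
};

// Only Mat2 * Mat2 is defined, so `a * b * c` does not compile in this
// sketch: a chained product needs a named intermediate.
inline MulExpr operator*(const Mat2& a, const Mat2& b) {
    return MulExpr{a, b};
}

inline Mat2& Mat2::operator=(const MulExpr& e) {
    const float* x = e.lhs.m;
    const float* y = e.rhs.m;
    // Compute into locals first so that `a = a * b` stays correct.
    const float r0 = x[0] * y[0] + x[1] * y[2];
    const float r1 = x[0] * y[1] + x[1] * y[3];
    const float r2 = x[2] * y[0] + x[3] * y[2];
    const float r3 = x[2] * y[1] + x[3] * y[3];
    m[0] = r0; m[1] = r1; m[2] = r2; m[3] = r3;
    return *this;
}
```

Usage is `Mat2 c; c = a * b;`. In this sketch a chained product has to be split up by hand (`t = a * b; d = t * c;`), which is the loss of convenience mentioned above.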
Simon
> Cheers,
> Renato
>
> On 13 Jan 2015 11:57, "Simon Taylor" <simontaylor1 at ntlworld.com> wrote:
>
> > On 5 Jan 2015, at 13:08, Renato Golin <renato.golin at linaro.org> wrote:
> >
> > On 5 January 2015 at 12:13, James Molloy <james at jamesmolloy.co.uk> wrote:
> >> For this reason Renato I don't think we should advise people to work around
> >> the API, as who knows what problems that will cause later.
> >
> > I stand corrected (twice). But we changed the subject a bit, so things
> > got different midway through.
> >
> > There were only two codes: full C, with pointers, vectorized
> > sometimes; and full NEON, with extra loads. The mix between the two came
> > later, as a quick fix. My comment to that, I agree, was misplaced, and
> > both you and Tim are correct in asking for it not to be used as a
> > final solution. But as an interim solution, with the care around
> > alignment, endian and all, it is "fine".
>
> After a bit more testing of the pointer-dereferencing workaround in my real code (which has some layers of templates and inline functions), I’ve decided against using it in practice.
>
> GCC produced incorrect code unless -fno-strict-aliasing was specified. It’s probably entirely entitled to do that, so it just seems too flaky to recommend in any case.
>
> Simon
>