[PATCH] Implement aarch64 neon instruction class SIMD lsone and lsone-post - LLVM

Thu Nov 21 05:14:19 PST 2013

Hi Hao,

Sorry for the delay replying to this one.

> I have implemented in the front-end for vld2_lane. But I'm not sure
> whether the solution is correct. There are many shufflevectors,
> because we need to transfer the input from 64bit vector to 128bit,
> and also transfer the output from 128bit to 64bit.

Yep. But, importantly, the optimiser can see through those
shufflevectors if it needs to: without them you're just hiding the
complexity in one monolithic intrinsic for an instruction that doesn't
even exist really.
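
To make the widen/operate/narrow shape concrete, here is a minimal
sketch in portable C. It is only a model of the idea, not the code in
the patch: the types v2i32/v4i32 and the helpers widen, narrow,
ld2q_lane and ld2_lane are hypothetical stand-ins, and the real
arm_neon.h types and the shufflevectors Clang emits are not used here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the NEON vector types (assumption). */
typedef struct { int32_t lane[2]; } v2i32;   /* models a 64-bit vector  */
typedef struct { int32_t lane[4]; } v4i32;   /* models a 128-bit vector */
typedef struct { v2i32 val[2]; } v2i32x2;    /* pair of 64-bit vectors  */
typedef struct { v4i32 val[2]; } v4i32x2;    /* pair of 128-bit vectors */

/* Widen 64 -> 128 bits, in the spirit of a shufflevector with mask
 * <0, 1, undef, undef>. The upper lanes are don't-care; this model
 * simply zeroes them. */
static v4i32 widen(v2i32 d) {
    v4i32 q = { { d.lane[0], d.lane[1], 0, 0 } };
    return q;
}

/* Narrow 128 -> 64 bits, in the spirit of a shufflevector with mask
 * <0, 1>: keep only the low half. */
static v2i32 narrow(v4i32 q) {
    v2i32 d = { { q.lane[0], q.lane[1] } };
    return d;
}

/* Models the 128-bit two-vector lane load: ptr[0] is inserted into the
 * chosen lane of val[0], ptr[1] into the same lane of val[1]. */
static v4i32x2 ld2q_lane(const int32_t *ptr, v4i32x2 src, int lane) {
    assert(lane >= 0 && lane < 4);
    src.val[0].lane[lane] = ptr[0];
    src.val[1].lane[lane] = ptr[1];
    return src;
}

/* The 64-bit form expressed purely in terms of the 128-bit one:
 * widen both inputs, do the 128-bit lane load, narrow both results. */
static v2i32x2 ld2_lane(const int32_t *ptr, v2i32x2 src, int lane) {
    assert(lane >= 0 && lane < 2);   /* only the low lanes are valid */
    v4i32x2 wide = { { widen(src.val[0]), widen(src.val[1]) } };
    wide = ld2q_lane(ptr, wide, lane);
    v2i32x2 out = { { narrow(wide.val[0]), narrow(wide.val[1]) } };
    return out;
}
```

Because widen and narrow are ordinary, visible operations rather than
something buried inside a single opaque call, anything that simplifies
shuffles can still reason about them, which is the point above.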

The Clang code looks like it's doing roughly what I'd expect, though
the LLVM code still handles the odd cases -- I assume removing that
will be part of finishing the patch.

Cheers.

Tim.