[LLVMdev] NEON intrinsics preventing redundant load optimization?
Renato Golin
renato.golin at linaro.org
Sun Dec 7 16:13:58 PST 2014
On 7 December 2014 at 19:15, Simon Taylor <simontaylor1 at ntlworld.com> wrote:
> Is there something about the use of intrinsics that prevents the compiler optimizing out the redundant store on the stack? Is there any hope for this improving in the future, or anything I can do now to improve the generated code?
If I had to guess, I'd say the intrinsic got in the way of recognising
the pattern. vmulq_f32 was correctly lowered to IR as "fmul", but
vld1q_f32 is still kept as an intrinsic call, so the register allocator
and the schedulers get confused and, when lowering to assembly, you're
left with garbage (such as that redundant stack store) around it.
Creating a bug for this is probably the best thing to do, since this
is a common pattern that needs looking into to produce optimal code.
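In the meantime, one thing that might be worth trying is to avoid the
load intrinsic and go through a fixed-size memcpy instead, which
compilers typically turn into an ordinary load that the optimisers
understand. This is only a sketch: load4, store4 and mul4 are names I
made up for illustration, the non-ARM fallback typedef is there purely
so it builds on other hosts, and I have not checked that it actually
removes the redundant store in your case.

```c
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
typedef float32x4_t vec4f;
#else
/* Assumption: fallback for non-ARM hosts, using the GCC/Clang
   vector extension so the sketch still compiles and runs. */
typedef float vec4f __attribute__((vector_size(16)));
#endif

/* Hypothetical helper: load four floats via a fixed-size memcpy
   instead of vld1q_f32, so no load intrinsic call is involved. */
static inline vec4f load4(const float *p) {
    vec4f v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: store four floats via a fixed-size memcpy. */
static inline void store4(float *p, vec4f v) {
    memcpy(p, &v, sizeof v);
}

/* Hypothetical example: out[i] = a[i] * b[i] for i = 0..3. */
void mul4(float *out, const float *a, const float *b) {
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* vmulq_f32 is the intrinsic that already lowers to "fmul". */
    store4(out, vmulq_f32(load4(a), load4(b)));
#else
    store4(out, load4(a) * load4(b));
#endif
}
```

Comparing the assembly of this against the vld1q_f32 version (for
example with -O2 -S) should show whether the load intrinsic is really
what is getting in the way.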
cheers,
--renato