[LLVMdev] ARM NEON intrinsics in clang

Stanislav Manilov S.Z.Manilov at sms.ed.ac.uk
Thu Sep 26 09:45:02 PDT 2013

Hello Tim,

> > I spent the last three days trying to compile a version of LLVM that would
> > allow me to compile sources that contain these intrinsics, but with no
> > success.
> Ok. This we can probably help with. Did you manage to build a version
> of Clang (preferably from git/subversion)?

Yes, I managed to build the latest (r191291) svn revision of LLVM + clang.

> If so, you're probably having problems cross-compiling. Renato's
> recently worked on some documentation in this area:
> http://clang.llvm.org/docs/CrossCompilation.html.
> But for a quick hack, you could try:
> $ cat > neon.c
> #include <arm_neon.h>
> float32x4_t my_func(float32x4_t lhs, float32x4_t rhs) {
>   return vaddq_f32(lhs, rhs);
> }
> $ clang --target=arm-linux-gnueabihf -mcpu=cortex-a15 -ffreestanding
> -O3 -S -o - neon.c
> ("ffreestanding" will dodge any issues with your supporting toolchain,
> but won't work for larger tests. You've got to actually solve the
> issues before you start running code).

This works, which is great! My confusion came from not knowing the
combination of flags for cross-compiling for ARM, and from getting "#error
"NEON support not enabled"" when getting it wrong, which, combined with the
outdated information on the internet, led me to believe that NEON is not
supported.

I will read that cross compilation guide before asking further questions
about this set of flags.
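
For anyone hitting the same "#error", here is a minimal sketch of source that builds with or without NEON enabled. It assumes the compiler predefines `__ARM_NEON__` (or the newer `__ARM_NEON`) only when NEON is available for the chosen target; `HAVE_NEON` and `add4` are names made up for this illustration.

```c
#include <stddef.h>

/* Assumption: __ARM_NEON__ (older spelling) or __ARM_NEON (ACLE spelling)
 * is predefined only when NEON is enabled for the target. */
#if defined(__ARM_NEON__) || defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* add4 is a hypothetical helper: out[i] = a[i] + b[i] for four floats. */
void add4(const float *a, const float *b, float *out) {
#if HAVE_NEON
  /* Load four lanes from each input, add them, store the result. */
  vst1q_f32(out, vaddq_f32(vld1q_f32(a), vld1q_f32(b)));
#else
  /* Scalar fallback when the NEON header cannot be used. */
  for (size_t i = 0; i < 4; ++i)
    out[i] = a[i] + b[i];
#endif
}
```

With the wrong target flags this simply takes the scalar path instead of failing inside arm_neon.h, which makes a misconfigured cross-compile easier to spot.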

> > In the process I found out that clang doesn't support NEON (as per
> > http://blog.llvm.org/2010/04/arm-advanced-simd-neon-intrinsics-and.html),
> That's rather out of date, I'm afraid. 32-bit ARM does support both
> NEON intrinsics and a reasonable amount of LLVM's own
> auto-vectorisation (which is in its early stages, but we have some
> kind of loop and SLP vectorisation going on).
> > but there has been at least some effort in adding it
> > (https://www.codeaurora.org/patches/quic/llvm/32040/clang-Initial-Neon-support.patch).
> That patch is part of the effort to implement NEON (instructions and
> intrinsics) on the 64-bit ARM architecture (AArch64).

Great! It seemed quite confusing that this is the main official information
one gets when searching for "arm neon clang llvm", especially when parts of
the documentation (
http://clang.llvm.org/docs/LanguageExtensions.html#langext-vectors) claim
that clang supports NEON. I am happy that it actually does.

> > I also tried compiling LLVM 2.9 + llvm-gcc but that failed too many times
> > and I gave up.
> Yep. llvm-gcc is long dead, and LLVM 2.9 isn't much healthier.

I was thinking it was just me being a noob.

> > My current plan is to implement the ARM NEON intrinsics as a shared library,
> > using attributes as in:

> That would probably be possible, but very bad from a performance
> perspective. The whole point of NEON intrinsics is to speed up vector
> code; if you've got the overhead of a call/return for each intrinsic
> and completely fixed registers around even that you'll be in for a
> world of pain.
> > Ideally, I want to be able to compile C code that includes ARM NEON
> > intrinsics to other targets (TI processors, e.g.).
> Now that's going to be harder. LLVM itself doesn't support any TI
> processors, for a start. And many of the NEON intrinsics (those with
> more complex semantics) compile to LLVM IR with LLVM-level intrinsics,
> which are only supported in the ARM backend.
> Your shared library idea would work semantically, of course. But I'm
> not sure what useful information could be extracted from it.

That was my plan for adding NEON support to clang, which, as I now know, has
thankfully already been done by someone who is more aware of how the platform
works. The TI processors were a bad example; PowerPC is maybe a better one,
as I just checked and there is a backend in LLVM for such processors. My
current goal is exactly to add support for such LLVM-level intrinsics to a
non-ARM backend, in order to make ARM-specific C code (code that contains
NEON intrinsics) compilable for another target.
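
As a rough illustration of that goal, the simplest intrinsics could be provided on a non-ARM target by a small header shim built on the GCC/Clang vector extensions. This is only a sketch under that assumption, not the real arm_neon.h: the three shim functions below are hand-written stand-ins that reuse the NEON names, and they cover only plain element-wise behaviour.

```c
/* Hypothetical shim for non-ARM targets; not the real <arm_neon.h>. */
#if defined(__ARM_NEON__) || defined(__ARM_NEON)
#include <arm_neon.h>
#else
/* GCC/Clang vector extension: four 32-bit floats in one 16-byte vector. */
typedef float float32x4_t __attribute__((vector_size(16)));

/* Element-wise add; the target backend picks its own vector instructions. */
static inline float32x4_t vaddq_f32(float32x4_t a, float32x4_t b) {
  return a + b;
}

/* Broadcast one scalar into all four lanes. */
static inline float32x4_t vdupq_n_f32(float value) {
  float32x4_t v = {value, value, value, value};
  return v;
}

/* Read a single lane back out of the vector. */
static inline float vgetq_lane_f32(float32x4_t v, int lane) {
  return v[lane];
}
#endif

/* The function from Tim's example, unchanged. */
float32x4_t my_func(float32x4_t lhs, float32x4_t rhs) {
  return vaddq_f32(lhs, rhs);
}
```

This only works for intrinsics that map onto generic vector operations. The ones with more complex semantics, which Tim points out compile to ARM-only LLVM-level intrinsics, would each need an explicit equivalent.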

Thanks a lot for your time and help. I will try to set up my
cross-compilation toolchain and ask again if I get seriously stuck.

 - Stan

Stan Manilov
1st year Ph.D. student
2013 Graduate in BSc Computer Science and Mathematics
The University of Edinburgh