[LLVMdev] ARM NEON intrinsics in clang

Stanislav Manilov S.Z.Manilov at sms.ed.ac.uk
Thu Sep 26 09:45:02 PDT 2013


Hello Tim,

> > I spent the last three days trying to compile a version of LLVM that would
> > allow me to compile sources that contain these intrinsics, but with no
> > success.
>
> Ok. This we can probably help with. Did you manage to build a version
> of Clang (preferably from git/subversion)?
>

Yes, I managed to build the latest (r191291) svn revision of LLVM + clang.

> If so, you're probably having problems cross-compiling. Renato's
> recently worked on some documentation in this area:
> http://clang.llvm.org/docs/CrossCompilation.html.
>
> But for a quick hack, you could try:
>
> $ cat > neon.c
> #include <arm_neon.h>
>
> float32x4_t my_func(float32x4_t lhs, float32x4_t rhs) {
>   return vaddq_f32(lhs, rhs);
> }
> $ clang --target=arm-linux-gnueabihf -mcpu=cortex-a15 -ffreestanding
> -O3 -S -o - neon.c
>
> ("ffreestanding" will dodge any issues with your supporting toolchain,
> but won't work for larger tests. You've got to actually solve the
> issues before you start running code).
>

This works, which is great! My confusion came from not knowing the
combination of flags for cross-compiling for ARM, and from getting '#error
"NEON support not enabled"' when I got it wrong. Combined with the outdated
information on the internet, that led me to believe that NEON was not
supported.
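
For anyone hitting the same error: as far as I can tell, arm_neon.h refuses to
compile unless the compiler predefines a NEON feature macro for the selected
target. A minimal sketch of guarding on that macro yourself is below. It
assumes the macro is spelled __ARM_NEON or __ARM_NEON__ (which one is defined
may depend on the compiler version); HAVE_NEON and add4 are names made up for
this illustration.

```c
#include <stddef.h>

/* Assumption: the compiler predefines one of these when NEON is enabled
 * for the target (e.g. via --target/-mcpu as in the command above). */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1 /* hypothetical helper macro, not part of any header */
#else
#define HAVE_NEON 0
#endif

/* Adds four floats lane by lane: out[i] = a[i] + b[i]. */
static void add4(const float *a, const float *b, float *out) {
#if HAVE_NEON
    /* NEON path: load both inputs, add as a vector, store the result. */
    vst1q_f32(out, vaddq_f32(vld1q_f32(a), vld1q_f32(b)));
#else
    /* Scalar fallback so the file still compiles when NEON is off. */
    for (size_t i = 0; i < 4; ++i) {
        out[i] = a[i] + b[i];
    }
#endif
}
```

With the wrong flags this silently takes the scalar path instead of stopping
at the #error, which is handy for a quick test but hides the misconfiguration.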

I will read that cross compilation guide before asking further questions
about this set of flags.


>
> > In the process I found out that clang doesn't support NEON (as per
> > http://blog.llvm.org/2010/04/arm-advanced-simd-neon-intrinsics-and.html),
>
> That's rather out of date, I'm afraid. 32-bit ARM does support both
> NEON intrinsics and a reasonable amount of LLVM's own
> auto-vectorisation (which is in its early stages, but we have some
> kind of loop and SLP vectorisation going on).
>
> > but there has been at least some effort in adding it
> > (https://www.codeaurora.org/patches/quic/llvm/32040/clang-Initial-Neon-support.patch).
>
> That patch is part of the effort to implement NEON (instructions and
> intrinsics) on the 64-bit ARM architecture (AArch64).
>

Great! It seemed quite confusing that this is the main official information
one gets when searching for "arm neon clang llvm", especially when parts of
the documentation (
http://clang.llvm.org/docs/LanguageExtensions.html#langext-vectors) claim
that clang supports NEON. I am happy that it actually does.


> > I also tried compiling LLVM 2.9 + llvm-gcc but that failed too many times
> > and I gave up.
>
> Yep. llvm-gcc is long dead, and LLVM 2.9 isn't much healthier.
>

I was thinking it was just me being a noob.

> > My current plan is to implement the ARM NEON intrinsics as a shared library,
> > using attributes as in:
>
> That would probably be possible, but very bad from a performance
> perspective. The whole point of NEON intrinsics is to speed up vector
> code; if you've got the overhead of a call/return for each intrinsic
> and completely fixed registers around even that you'll be in for a
> world of pain.
>
> > Ideally, I want to be able to compile C code that includes ARM NEON
> > intrinsics to other targets (TI processors, e.g.).
>
> Now that's going to be harder. LLVM itself doesn't support any TI
> processors, for a start. And many of the NEON intrinsics (those with
> more complex semantics) compile to LLVM IR with LLVM-level intrinsics,
> which are only supported in the ARM backend.
>
> Your shared library idea would work semantically, of course. But I'm
> not sure what useful information could be extracted from it.
>

That was my plan for adding NEON support to clang, which, as I now know, has
thankfully already been done by someone who is more aware of how the platform
works. The TI processors were a bad example; PowerPC is maybe a better one,
as I just checked and LLVM has a backend for such processors. My
current goal is exactly to add support for such LLVM-level intrinsics to a
non-ARM backend, in order to make ARM-specific C code (code that contains
NEON intrinsics) compilable for another target.
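
To make the idea concrete, here is a rough source-level sketch of what a
portable stand-in for one intrinsic could look like. This is only an
illustration and not the backend work described above: the shim_ names and the
struct layout are invented for this example and are not clang's definitions.
The functions are static inline so that, unlike a shared library, there is no
call/return overhead per intrinsic.

```c
#include <stddef.h>

/* Hypothetical stand-in for float32x4_t: four float lanes in a struct. */
typedef struct {
    float lane[4];
} shim_float32x4_t;

/* Stand-in for vld1q_f32: load four consecutive floats into a vector. */
static inline shim_float32x4_t shim_vld1q_f32(const float *p) {
    shim_float32x4_t v;
    for (size_t i = 0; i < 4; ++i) {
        v.lane[i] = p[i];
    }
    return v;
}

/* Stand-in for vaddq_f32: lane-wise addition of two vectors. */
static inline shim_float32x4_t shim_vaddq_f32(shim_float32x4_t a,
                                              shim_float32x4_t b) {
    shim_float32x4_t r;
    for (size_t i = 0; i < 4; ++i) {
        r.lane[i] = a.lane[i] + b.lane[i];
    }
    return r;
}

/* Stand-in for vst1q_f32: store the four lanes to memory. */
static inline void shim_vst1q_f32(float *p, shim_float32x4_t v) {
    for (size_t i = 0; i < 4; ++i) {
        p[i] = v.lane[i];
    }
}
```

This only covers intrinsics with simple lane-wise semantics; the ones that map
to ARM-specific LLVM-level intrinsics are exactly the hard part.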

Thanks a lot for your time and help. I will try to set up my cross
compilation toolchain and ask again if I get seriously stuck.

Cheers,
 - Stan


-- 
Stan Manilov
1st year Ph.D. student
2013 Graduate in BSc Computer Science and Mathematics
The University of Edinburgh