[llvm-bugs] [Bug 27216] New: Clang wrongly defines __ARM_FEATURE_FMA, then generates slow emulation code

via llvm-bugs llvm-bugs at lists.llvm.org
Tue Apr 5 07:00:02 PDT 2016


https://llvm.org/bugs/show_bug.cgi?id=27216

            Bug ID: 27216
           Summary: Clang wrongly defines __ARM_FEATURE_FMA, then
                    generates slow emulation code
           Product: clang
           Version: 3.8
          Hardware: All
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Frontend
          Assignee: unassignedclangbugs at nondot.org
          Reporter: jacob.benoit.1 at gmail.com
                CC: llvm-bugs at lists.llvm.org
    Classification: Unclassified

Created attachment 16169
  --> https://llvm.org/bugs/attachment.cgi?id=16169&action=edit
testcase

*** Steps to reproduce ***

Using Clang 3.8 (here, the toolchain from the Android N repo), compile the
attached testcase:

prebuilts/clang/host/linux-x86/clang-2690385/bin/clang++ -target
arm-linux-androideabi ~/vrac/arm_feature_fma_testcase.cc -c -march=armv7-a
-mfloat-abi=softfp -mfpu=neon

The testcase just checks whether __ARM_FEATURE_FMA is defined. It should not
be unless we pass -mfpu=neon-vfpv4, but Clang defines it with -mfpu=neon as
well.
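
For readers without access to the attachment, a check of this kind can be
sketched as follows. This is a hypothetical reconstruction, not the attached
file; the function names arm_feature_fma_defined and arm_feature_fma_status
are made up for illustration. It has no main() because the command above only
compiles with -c.

```cpp
// Hypothetical reconstruction of the check, NOT the attached testcase.
// Reports at compile time whether the compiler defines __ARM_FEATURE_FMA.

// True when the compiler claims that hardware FMA instructions are available.
constexpr bool arm_feature_fma_defined() {
#ifdef __ARM_FEATURE_FMA
  return true;
#else
  return false;
#endif
}

// Human-readable form of the same information, e.g. for logging.
inline const char* arm_feature_fma_status() {
  return arm_feature_fma_defined() ? "__ARM_FEATURE_FMA is defined"
                                   : "__ARM_FEATURE_FMA is not defined";
}
```

To turn the wrong definition into a hard compile error for a -mfpu=neon build,
one could add static_assert(!arm_feature_fma_defined(), "unexpected FMA macro")
to the file.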

*** More context ***

The __ARM_FEATURE_FMA preprocessor macro is supposed to be defined only when
hardware FMA instructions are available. See
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0053b/IHI0053B_arm_c_language_extensions_2013.pdf
section 6.5.5:
"If __ARM_FEATURE_FMA and __ARM_NEON_FP are both defined, fused-multiply
instructions are available in NEON also."

These instructions are available only if VFPv4 is enabled. Thus, the expected
result is:

-mfpu=neon        ---> __ARM_FEATURE_FMA is not defined
-mfpu=neon-vfpv4  ---> __ARM_FEATURE_FMA is defined

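High-performance numerical code needs to know whether hardware FMA instructions
are available, because when they are, they are generally faster: the multiply
and the add are fused into one operation, with no intermediate rounding step.
Conversely, when the macro is defined without hardware support, code that takes
the FMA path ends up with the slow emulation code mentioned in the summary.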

For example, the Eigen matrix library relies on this to get faster FMA
instructions when available:
https://bitbucket.org/eigen/eigen/src/78884e16715fc9a7b726db39195ac8bb17103181/Eigen/src/Core/arch/NEON/PacketMath.h?at=default&fileviewer=file-view-default#PacketMath.h-180
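
To illustrate the pattern, here is a minimal scalar sketch of how a library
might select its multiply-add based on the macro. It is not Eigen's code (the
linked file works on NEON vectors); the helper name madd is hypothetical. It
assumes that std::fma maps to a single instruction when hardware FMA exists,
and that without hardware support std::fma has to be emulated in software to
keep its single-rounding guarantee.

```cpp
#include <cmath>

// Hypothetical helper, not taken from Eigen: computes a * b + c.
inline float madd(float a, float b, float c) {
#ifdef __ARM_FEATURE_FMA
  // The compiler claims hardware FMA, so a fused multiply-add is expected
  // to be cheap. If the macro is wrongly defined, this call falls back to
  // a software implementation and becomes the slow path instead.
  return std::fma(a, b, c);
#else
  // No hardware FMA claimed: use a plain multiply followed by an add.
  return a * b + c;
#endif
}
```

With a correct macro, both branches are fast on their respective targets. With
the macro defined under -mfpu=neon, the first branch is selected on hardware
that cannot execute it natively.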
