[PATCH] Improved udivmodsi4 with support for ARMv4

Renato Golin renato.golin at linaro.org
Thu Jan 23 10:54:38 PST 2014


  So, I've been fighting this patch for a while, and here are some observations.

  1. __ARM_ARCH is not defined before GCC 4.8, so we can't rely on it. You have to do something like:

    #if defined(__ARM_ARCH_2__) || defined(__ARM_ARCH_3__) || defined(__ARM_ARCH_4__) || defined(__ARM_ARCH_3M__) || defined(__ARM_ARCH_4T__)
    #define __ARM_ARCH_OLD__
    #endif

  Then...

    #ifndef __ARM_ARCH_OLD__
    clz ip, r0
    clz r3, r1

  or reuse a definition that compiler-rt constructs for you. glibc does this when the compiler does not provide __ARM_ARCH (GCC older than 4.8):

    /* The __ARM_ARCH define is provided by gcc 4.8.  Construct it otherwise.  */
    #ifndef __ARM_ARCH
    # ifdef __ARM_ARCH_2__
    #  define __ARM_ARCH 2
    # elif defined (__ARM_ARCH_3__) || defined (__ARM_ARCH_3M__)
    #  define __ARM_ARCH 3
    # elif defined (__ARM_ARCH_4__) || defined (__ARM_ARCH_4T__)
    #  define __ARM_ARCH 4
    # elif defined (__ARM_ARCH_5__) || defined (__ARM_ARCH_5E__) \
           || defined(__ARM_ARCH_5T__) || defined(__ARM_ARCH_5TE__) \
           || defined(__ARM_ARCH_5TEJ__)
    #  define __ARM_ARCH 5
    # elif defined (__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) \
           || defined (__ARM_ARCH_6Z__) || defined(__ARM_ARCH_6ZK__) \
           || defined (__ARM_ARCH_6K__) || defined(__ARM_ARCH_6T2__)
    #  define __ARM_ARCH 6
    # elif defined (__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) \
           || defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__) \
           || defined(__ARM_ARCH_7EM__)
    #  define __ARM_ARCH 7
    # else
    #  error unknown arm architecture
    # endif
    #endif
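
  To make the two options above concrete, here is a minimal C sketch of one "old architecture" flag selecting between a CLZ-based path and a fallback. SKETCH_ARM_ARCH_OLD and sketch_clz32 are names made up for this example, not part of the patch or of compiler-rt, and __builtin_clz only stands in for the clz instruction in the assembly.

```c
#include <assert.h>
#include <stdint.h>

/* SKETCH_ARM_ARCH_OLD is a hypothetical name used only in this example.
   It is 1 on pre-v5 ARM targets (no CLZ) and 0 everywhere else. */
#if defined(__ARM_ARCH_2__) || defined(__ARM_ARCH_3__) || \
    defined(__ARM_ARCH_3M__) || defined(__ARM_ARCH_4__) || \
    defined(__ARM_ARCH_4T__)
#  define SKETCH_ARM_ARCH_OLD 1
#else
#  define SKETCH_ARM_ARCH_OLD 0
#endif

/* Count leading zeros of a 32-bit value; returns 32 for x == 0. */
static unsigned sketch_clz32(uint32_t x)
{
#if !SKETCH_ARM_ARCH_OLD && defined(__GNUC__)
    /* Fast path: the compiler may lower this to CLZ on ARMv5+. */
    return x ? (unsigned)__builtin_clz(x) : 32u;
#else
    /* Fallback: shift right until the value is exhausted. */
    unsigned n = 32;
    while (x) {
        x >>= 1;
        n--;
    }
    return n;
#endif
}

/* Small self-check that both paths must satisfy. */
static void sketch_clz32_selfcheck(void)
{
    assert(sketch_clz32(1u) == 31u);
    assert(sketch_clz32(0x80000000u) == 0u);
}
```

  The point of the single flag is that every guard in the file tests the same thing, so a missing architecture macro gets fixed in one place.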

  2. Even so, some of the macros are not right (see the inline comment below).

  3. A15 works as before and, as expected, shows no difference.

  4. Because of the macros, A9 was following the ARMv4 path and breaking in far too many cases; I'm looking into it. [e.g. 8/4 = 1 (rem 4)]

  5. If I make it go down the CLZ path, it works and gains some 20% in performance.
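
  For checking results like the 8/4 case above, a plain C reference model can help. This is only a sketch of a shift-and-subtract divider, not the algorithm in the patch. ref_clz32, ref_udivmod32 and ref_check are hypothetical names, and the divide-by-zero behaviour is an arbitrary choice made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Portable leading-zero count; stands in for the CLZ instruction. */
static unsigned ref_clz32(uint32_t x)
{
    unsigned n = 0;
    if (x == 0)
        return 32;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Returns n / d and stores n % d in *rem.
   Assumption for this sketch: d == 0 yields quotient 0, remainder n. */
static uint32_t ref_udivmod32(uint32_t n, uint32_t d, uint32_t *rem)
{
    uint32_t q = 0;
    if (d != 0 && n >= d) {
        /* Align the top bit of d with the top bit of n. */
        int shift = (int)ref_clz32(d) - (int)ref_clz32(n);
        for (; shift >= 0; shift--) {
            /* Compare on the shifted-down n so d << shift is never
               computed unless it fits under n. */
            if ((n >> shift) >= d) {
                n -= d << shift;
                q |= (uint32_t)1 << shift;
            }
        }
    }
    *rem = n;
    return q;
}

/* Checks the division identity for one (n, d) pair with d != 0. */
static void ref_check(uint32_t n, uint32_t d)
{
    uint32_t r;
    uint32_t q = ref_udivmod32(n, d, &r);
    assert(r < d);
    assert(q * d + r == n);
}
```

  A result such as 8/4 = 1 (rem 4) satisfies q * d + r == n but fails r < d, which is why the check tests both conditions.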


================
Comment at: udivmodsi4.S:58
@@ +57,3 @@
+
+#  if defined(__ARM_ARCH_5T__) || __ARM_ARCH >= 6
+	clz	ip, r0
----------------
You mean __ARM_ARCH >= 5, right?

__ARM_ARCH_5T__ is *not* set for any other ARCH > 5T, so that check is not enough to cover all v6, v7 and the other v5 variants.

Since CLZ and BX LR are both v5+, I think you can safely set an __ARM_ARCH_OLD__ using my method above and check the same flag for both purposes.
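
A sketch of how a guard of this shape behaves when the numeric macro is not defined at all, as with __ARM_ARCH on GCC before 4.8. In an #if expression the C preprocessor evaluates an undefined identifier as 0, so the `>= 6` half is false and only the 5T half can select the CLZ path. That would be consistent with A9 ending up on the ARMv4 path, though I have not confirmed that this is the whole cause. SKETCH_ARCH, SKETCH_ARCH_5T and SKETCH_USES_CLZ are hypothetical stand-ins, not names from the patch.

```c
#include <assert.h>

/* SKETCH_ARCH stands in for __ARM_ARCH on a compiler that does not
   define it; SKETCH_ARCH_5T stands in for __ARM_ARCH_5T__. */
#undef SKETCH_ARCH
#undef SKETCH_ARCH_5T

/* Same shape as the guard under review. */
#if defined(SKETCH_ARCH_5T) || SKETCH_ARCH >= 6
#  define SKETCH_USES_CLZ 1
#else
#  define SKETCH_USES_CLZ 0
#endif

/* Returns 1 if the guard above selected the CLZ path, 0 otherwise. */
static int sketch_uses_clz(void)
{
    return SKETCH_USES_CLZ;
}

/* With both macros undefined, the guard silently takes the non-CLZ path. */
static void sketch_guard_check(void)
{
    assert(sketch_uses_clz() == 0);
}
```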


http://llvm-reviews.chandlerc.com/D2595


