[PATCH] Improved udivmodsi4 with support for ARMv4
Joerg Sonnenberger
joerg at NetBSD.org
Thu Jan 23 04:43:33 PST 2014
Tim:
Just calling them in a loop with an incrementing numerator and a constant denominator, I get the following timings on an ARM1176JZ-S (2nd generation rpi).
| denominator | old code | new code (-march=armv6) | new code (-march=armv4) |
| 65534 | 9.43 | 3.53 | 4.06 |
| 16 | 9.58 | 3.52 | 4.06 |
| 128 | 8.17 | 3.21 | 3.76 |
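For reference, a loop of that shape could look roughly like the sketch below. This is not the harness used for the numbers above. The names `udivmod_fn`, `reference_udivmod` and `time_udivmod` are made up for illustration, and the function pointer type assumes the usual compiler-rt shape of `__udivmodsi4` (quotient returned, remainder stored through a pointer).

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical function pointer type for the routine under test.
   Assumed shape: returns the quotient, stores the remainder in *rem. */
typedef uint32_t (*udivmod_fn)(uint32_t num, uint32_t den, uint32_t *rem);

/* Plain C stand-in so the sketch is self-contained. */
static uint32_t reference_udivmod(uint32_t num, uint32_t den, uint32_t *rem)
{
    *rem = num % den;
    return num / den;
}

/* Call fn with numerators 0 .. iterations-1 and a fixed denominator.
   Returns elapsed CPU time in seconds. The checksum folds every result
   into one value so the compiler cannot drop the calls. */
static double time_udivmod(udivmod_fn fn, uint32_t den,
                           uint32_t iterations, uint32_t *checksum)
{
    uint32_t sum = 0;
    clock_t start = clock();
    for (uint32_t num = 0; num < iterations; ++num) {
        uint32_t rem;
        sum += fn(num, den, &rem) ^ rem;
    }
    clock_t end = clock();
    *checksum = sum;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Passing the old and the new routine as `fn` with the same `den` and `iterations` gives directly comparable times, and the two checksums should match if both routines are correct.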
================
Comment at: udivmodsi4.S:62
@@ +61,3 @@
+ sub r3, r3, ip
+ /* r1 > r0 implies r3 >= 0. */
+ adr ip, LOCAL_LABEL(div0block)
----------------
Renato Golin wrote:
> I'd have thought that the code would only get here if r0 >= r1, because of the BCC above.
>
> If r1 > r0, this is a case for quotient0, no?
quotient0 is used for r0 < r1, so at this point r0 >= r1. This means r3 >= ip and therefore r3-ip >= 0.
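The same argument can be checked in C. The sketch below assumes that ip and r3 hold the leading-zero counts of r0 (numerator) and r1 (denominator) respectively, which the quoted hunk does not show. `clz32` and `initial_shift` are illustrative names, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Portable count of leading zero bits; returns 32 for x == 0,
   matching the ARM clz instruction. */
static int clz32(uint32_t x)
{
    int n = 0;
    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        ++n;
    }
    return n;
}

/* Models "sub r3, r3, ip" under the assumption r3 = clz(den), ip = clz(num).
   For 0 < den <= num, den has at least as many leading zeros as num,
   so the result is non-negative. */
static int initial_shift(uint32_t num, uint32_t den)
{
    return clz32(den) - clz32(num);
}
```

For example, with num = 65534 and den = 16 the counts are 16 and 27, giving a shift of 11. Equal operands give 0.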
http://llvm-reviews.chandlerc.com/D2595