[CORRECTION] [compiler-rt] _udivdi3(), _umoddi3(), _moddi3() and _divdi3() routines not properly "tuned"
Stefan Kanthak via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 9 01:33:43 PST 2017
Hi,
replace the proposed fix #3 from my initial post with the following
faster version:
> Bug #3: such "highly tuned" routines should come without large
> ~~~~~~~ duplicate code sequences.
>
> In the 4 routines named in the subject, the code from label 1: to the
> respective return is almost identical to the code preceding label 1:;
> the only difference is the initial subtraction and the insertion of
> a leading 1 into the quotient.
>
> Fix #3: remove all lines between "jae 1f" (including the wrong
> ~~~~~~~ comment which follows "jae 1f") and the label 1:, then
> apply the following diff (yes, this adds one or two
> instructions to the overall execution path, but should
> typically cost no cycles, since they can execute in parallel).
+ pushl %edi
+ xorl %edi, %edi // MSB of quotient
cmpl %ebx, %edx // to avoid overflowing the upcoming divide.
+ jb 0f
- jae 1f
1: /* High word of a is greater than or equal to (b >> (1 + i)) on this branch */
+ movl $0x80000000, %edi // MSB of quotient
subl %ebx, %edx // subtract bhi from ahi so that divide will not
+
+0: /* High word of a is smaller than (b >> (1 + i)) on this branch */
+
divl %ebx // overflow, and find q and r such that
//
// ahi:alo = (1:q)*bhi + r
//
// Note that q is a number in (31-i).(1+i)
// fix point.
- pushl %edi
notl %ecx
shrl %eax
+ orl %edi, %eax // insert proper MSB into quotient
- orl $0x80000000, %eax
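
For readers who prefer C, the merged path in the diff above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from compiler-rt: the function name quotient_with_msb is hypothetical, and the sketch assumes bhi has its top bit set (as the normalization before the cmpl guarantees), so that ahi < 2*bhi and a single conditional subtraction is enough to keep the divide from overflowing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of compiler-rt: returns
 * floor((ahi:alo) / bhi) >> 1, assuming bhi has its top bit set. */
uint32_t quotient_with_msb(uint32_t ahi, uint32_t alo, uint32_t bhi)
{
    uint32_t msb = 0;                    /* xorl %edi, %edi */

    assert(bhi & 0x80000000u);           /* normalized divisor, so ahi < 2*bhi */

    if (ahi >= bhi) {                    /* cmpl %ebx, %edx ; jb 0f */
        msb = 0x80000000u;               /* movl $0x80000000, %edi */
        ahi -= bhi;                      /* subl %ebx, %edx */
    }

    /* Now ahi < bhi, so the 64/32 quotient fits in 32 bits (divl %ebx). */
    uint32_t q = (uint32_t)((((uint64_t)ahi << 32) | alo) / bhi);

    return (q >> 1) | msb;               /* shrl %eax ; orl %edi, %eax */
}
```

Both branches share the divide and the shift; the only per-branch state is the value kept in msb, which mirrors the role of %edi in the diff.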
regards
Stefan Kanthak