[cfe-dev] Loop optimised into div (?)

Sun Oct 27 19:38:38 PDT 2019

On Wed, Oct 23, 2019 at 10:29 AM Joan Lluch via cfe-dev
<cfe-dev at lists.llvm.org> wrote:
>
> Hi all,
>
> For the following code
>
> int countHundred( int num )
> {
>   int count = 0;
>   while ( num >= 100) { count++ ; num = num - 100; }
>   return count;
> }
>
> the loop shown above is optimised away and replaced by ‘udiv’ instruction.
>
> This is undesirable for targets requiring libcalls for integer divisions, such as the MSP430.

That's not necessarily the case, unless you know in advance that "num"
is a relatively small positive number.

Even on a 16 bit machine, large positive numbers could have loop trip
counts up to 327 which is going to be much slower than a library call
using even a simple shift-and-subtract division algorithm.

I hope the udiv isn't used until after an initial test to make sure
"num" is >100 or at least positive, otherwise using udiv is an actual
bug.

I'd suggest doing some testing to find out at which point a library
call is actually faster than this loop (probably anything above 2000
or 4000 or so), modifying the compiler to not do the transform if the
trip count is *known* to be less than this (i.e. continue to use the
library routine if the trip count is unknown), and also rewrite the
code to:

#define ITER_LIMIT 24 // for example -- determine by experiment and
modify compiler appropriately
int countHundred( int num )
{
  if (num > (ITER_LIMIT * 100)) return num / 100;
  int count = 0;
  while ( num >= 100) { count++ ; num = num - 100; }
  return count;
}