<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - A loop is not unrolled with a decreasing counter and -fno-vectorize"
   href="https://bugs.llvm.org/show_bug.cgi?id=46924">46924</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>A loop is not unrolled with a decreasing counter and -fno-vectorize
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Loop Optimizer
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>fj8765ah@aa.jp.fujitsu.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
<pre>When the following program is compiled with -fno-vectorize, the loop is not
unrolled.
Runtime unrolling appears to be rejected with the debug message
"High cost for expanding trip count scev!".
I believe unrolling could still be applied here. What do you think?


The loop was unrolled by LLVM 9. Starting with LLVM 10, it no longer appears
to be unrolled.

The change in behavior seems to come from the following commit:
commit 0f22e783a038b6983f0fe161eef6cf2add3a4156
[InstCombine] Revert rL341831: relax one-use check in foldICmpAddConstant() (PR44100)

On AArch64, the problem occurs only when a -mcpu option is specified, for
example -mcpu=thunderx2t99.

- minus.c

void foo(double * restrict a,
         double * restrict b,
         double * restrict c,
         int n) {

  for (int i=n;i>0;--i)
    c[i] = a[i] + b[i];

  return;
}


$ clang -target x86_64-unknown-linux-gnu -O3 minus.c -Rpass=loop-vectorize\|unroll -fno-vectorize -S


- output with -mllvm --debug-only=loop-unroll

$ clang -target x86_64-unknown-linux-gnu -O3 minus.c -Rpass=loop-vectorize\|unroll -fno-vectorize -S -mllvm --debug-only=loop-unroll
Loop Unroll: F[foo] Loop %for.body
  Loop Size = 8
  will not try to unroll loop with runtime trip count -unroll-runtime not given
Loop Unroll: F[foo] Loop %for.body
  Loop Size = 8
  runtime unrolling with count: 4
  Exiting Block = for.body
Trying runtime unrolling on Loop:
Loop at depth 1 containing: %for.body<header><latch><exiting> Using prolog remainder.
High cost for expanding trip count scev!
Won't unroll; remainder loop could not be generated when assuming runtime trip count
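
For illustration, the transformation attempted above (unroll count 4 with a
prolog remainder) corresponds roughly to the hand-written C below. This is only
a sketch and not compiler output; the name foo_unrolled is hypothetical, and it
assumes the same array bounds as minus.c.

```c
/* Hand-written approximation of runtime unrolling by 4 with a prolog
   remainder loop for minus.c. Illustrative only; not compiler output. */
void foo_unrolled(double * restrict a,
                  double * restrict b,
                  double * restrict c,
                  int n) {
  if (n <= 0)
    return;

  int i = n;

  /* Prolog: run (trip count % 4) iterations first. */
  for (int rem = n % 4; rem > 0; --rem, --i)
    c[i] = a[i] + b[i];

  /* Main loop: the remaining trip count is a multiple of 4. */
  for (; i > 0; i -= 4) {
    c[i]     = a[i]     + b[i];
    c[i - 1] = a[i - 1] + b[i - 1];
    c[i - 2] = a[i - 2] + b[i - 2];
    c[i - 3] = a[i - 3] + b[i - 3];
  }
}
```

For n = 7, the prolog handles i = 7, 6, 5 and the main loop handles i = 4 down
to 1 in a single pass.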


By contrast, the equivalent loop with an increasing counter is unrolled.

- plus.c

void foo(double * restrict a,
         double * restrict b,
         double * restrict c,
         int n) {

  for (int i=0;i<n;++i)
    c[i] = a[i] + b[i];

  return;
}


$ clang -target x86_64-unknown-linux-gnu -O3 plus.c -Rpass=loop-vectorize\|unroll -fno-vectorize -S
plus.c:6:3: remark: unrolled loop by a factor of 4 with run-time trip count [-Rpass=loop-unroll]
  for (int i=0;i<n;++i)
  ^
$


Changing the type of the loop control variable from int to unsigned also
allows the loop to be unrolled.


- minusUnsigned.c

void foo(double * restrict a,
         double * restrict b,
         double * restrict c,
         int n) {

  for (unsigned i=n;i>0;--i)
    c[i] = a[i] + b[i];

  return;
}
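
For reference, the iteration counts of the three loops in this report can be
written out by hand as below. This is a minimal sketch that only restates the C
semantics of each loop header; it does not reproduce the trip count expression
that LLVM builds internally, and the helper names are hypothetical.

```c
/* Iteration counts of the three loops, derived from their loop headers.
   Helper names are hypothetical. */

/* minus.c: for (int i = n; i > 0; --i) */
unsigned trip_count_minus(int n) {
  return n > 0 ? (unsigned)n : 0u;
}

/* plus.c: for (int i = 0; i < n; ++i) */
unsigned trip_count_plus(int n) {
  return n > 0 ? (unsigned)n : 0u;
}

/* minusUnsigned.c: for (unsigned i = n; i > 0; --i) */
unsigned trip_count_minus_unsigned(int n) {
  return (unsigned)n;  /* a negative n converts to a large unsigned value */
}
```

The signed decreasing loop and the increasing loop run the same number of
times for every n. The unsigned variant differs only when n is negative.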

$ clang -target x86_64-unknown-linux-gnu -O3 minusUnsigned.c -Rpass=loop-vectorize\|unroll -fno-vectorize -S
minusUnsigned.c:6:3: remark: unrolled loop by a factor of 4 with run-time trip count [-Rpass=loop-unroll]
  for (unsigned i=n;i>0;--i)
  ^
$</pre>
        </div>
      </p>


    </body>
</html>