<html>
    <head>
      <base href="https://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - nounroll attribute is sometimes lost after optimizations"
   href="https://llvm.org/bugs/show_bug.cgi?id=27974">27974</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>nounroll attribute is sometimes lost after optimizations
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Loop Optimizer
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>justin.lebar@gmail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org, sanjoy@playingwithpointers.com
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
<pre>The NVPTX backend relies on optimizations not stripping the nounroll
attribute (llvm.loop.unroll.disable).  Its assembly printer detects nounroll
loops and emits a directive telling ptxas not to unroll them.

The attached unoptimized IR has a few loops marked as nounroll.  After
optimization, most of those loops still carry the annotation; however, the
loops inside the /.*RowReduce.*/ functions have lost it.

Unoptimized IR: <a href="https://gist.github.com/e6e8822a01dde1bb20195b4002d8efc3">https://gist.github.com/e6e8822a01dde1bb20195b4002d8efc3</a>
Optimized IR (opt -O3):
<a href="https://gist.github.com/d8fa9ec0295e4ae808a8150e776b6871">https://gist.github.com/d8fa9ec0295e4ae808a8150e776b6871</a>

Specifically, the function
_ZN5Eigen8internal12_GLOBAL__N_115RowReduceKernelILi32ELi256ELi128ENS_15TensorEvaluatorIKNS_9TensorMapINS_6TensorIfLi2ELi1EiEELi0EEENS_9GpuDeviceEEENS0_10PtrWrapperIfiEENS1_14CudaMaxReducerEEEvT4_T2_iiT3_
in the unoptimized IR has four branches with llvm.loop annotations:

  br label %121, !llvm.loop !57
  br label %158, !llvm.loop !58
  br label %104, !llvm.loop !59
  br label %197, !llvm.loop !60

The annotations are defined as

  !49 = !{!"llvm.loop.unroll.enable"}
  !55 = !{!"llvm.loop.unroll.disable"}
  !57 = distinct !{!57, !49}
  !58 = distinct !{!58, !49}
  !59 = distinct !{!59, !55}
  !60 = distinct !{!60, !49}

But after optimization, the same function has only one annotated branch:

  br i1 %182, label %178, label %.thread, !llvm.loop !59

  !55 = !{!"llvm.loop.unroll.enable"}
  !59 = distinct !{!59, !55}

(This is a loop which cannot be unrolled, because it contains volatile inline
asm.)
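
The before/after comparison above can be repeated for every function with a
small script.  This is only a rough sketch: it scans the textual IR with
regular expressions rather than using any LLVM API, it assumes unquoted
function names, and the helper names loop_hints and lost_unroll_disable are
made up for illustration.

```python
import re

# A terminator carrying a loop annotation, e.g. "br label %121, !llvm.loop !57".
LOOP_REF = re.compile(r"!llvm\.loop (!\d+)")
# A numbered metadata definition, e.g. "!59 = distinct !{!59, !55}".
MD_DEF = re.compile(r"^\s*(!\d+) = (?:distinct )?!\{(.*)\}\s*$")
# Start of a function definition; captures an unquoted function name.
FUNC_DEF = re.compile(r"^define .*?@([\w.$]+)\(")
# A metadata string, e.g. !"llvm.loop.unroll.disable".
STRING_MD = re.compile(r'!"([^"]+)"')


def loop_hints(ir_text):
    """Map each function name to the loop hint strings on its branches."""
    metadata = {}  # metadata id to the text between the braces
    refs = []      # (function name, loop metadata id) pairs
    current = None
    for line in ir_text.splitlines():
        func = FUNC_DEF.match(line)
        if func:
            current = func.group(1)
        md = MD_DEF.match(line)
        if md:
            metadata[md.group(1)] = md.group(2)
        ref = LOOP_REF.search(line)
        if ref:
            refs.append((current, ref.group(1)))
    # Resolve after the scan, because metadata is defined after the functions.
    hints = {}
    for func_name, loop_id in refs:
        found = hints.setdefault(func_name, [])
        for operand in re.findall(r"!\d+", metadata.get(loop_id, "")):
            if operand != loop_id:  # skip the loop ID's self-reference
                found.extend(STRING_MD.findall(metadata.get(operand, "")))
    return hints


def lost_unroll_disable(before_ir, after_ir):
    """List functions with fewer unroll.disable hints after optimization."""
    key = "llvm.loop.unroll.disable"
    before = loop_hints(before_ir)
    after = loop_hints(after_ir)
    return sorted(
        name for name in before
        if before[name].count(key) > after.get(name, []).count(key)
    )
```

Calling lost_unroll_disable on the contents of the two gists (saved locally
under whatever file names you like) should list the functions whose nounroll
loops need a closer look.  It only counts hints per function, so it cannot
say which backedge lost its annotation.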

Based on the C++ source code, I am pretty sure that the backedge in the
optimized code that ought to have retained the llvm.loop.unroll.disable
attribute is

  br i1 %164, label %48, label %.thread.preheader.loopexit</pre>
        </div>
      </p>
    </body>
</html>