<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - [AArch64][opt] Induction variable pass missed sext elimination opportunity"
   href="https://bugs.llvm.org/show_bug.cgi?id=33481">33481</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>[AArch64][opt] Induction variable pass missed sext elimination opportunity
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>tools
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>4.0
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>opt
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>john.russo@nxp.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Created <span class=""><a href="attachment.cgi?id=18650" name="attach_18650" title="C test file">attachment 18650</a> <a href="attachment.cgi?id=18650&action=edit" title="C test file">[details]</a></span>
C test file

LLVM/Clang tag "release_40", target AArch64 (and any other 64-bit target whose
int size is 32 bits and pointer size is 64 bits).

To reproduce with the attached test case, compile it with the following
script:

OPT=-O3
INFO="-mllvm -debug"
INFO2="-mllvm -debug-only=instcombine"
INFO3="-mllvm -print-after-all"
INFO4="-mllvm -debug-only=indvars"
NOINFO=
CLANG_PATH=&lt;your-path&gt;/bin
${CLANG_PATH}/clang -v -target aarch64 -S -emit-llvm -fshort-wchar ${NOINFO} \
  ${OPT} -std=c99 test.c

Note: the TakeAddr function is there to keep DeadArgElimination from firing,
which would otherwise oversimplify the test case.

Is this an optimization opportunity in Induction Variable simplification?


Could the sequence:

Function PadInputTemplate:

*** IR Dump After Induction Variable Simplification ***
for.body.us:                                      ; preds = %for.body.us.preheader, %for.cond14.for.cond.cleanup16_crit_edge.us
  %indvars.iv = phi i64 [ 0, %for.body.us.preheader ], [ %indvars.iv.next, %for.cond14.for.cond.cleanup16_crit_edge.us ]
  %3 = trunc i64 %indvars.iv to i32
  %mul4.us = mul i32 %mul3, %3
  %idx.ext.us = sext i32 %mul4.us to i64
  %add.ptr.us = getelementptr inbounds float, float* %input, i64 %idx.ext.us

be simplified to the following? (%mul3 has been constant-propagated to 961 by
inlining.)

Function PadInput_96_27_27_2 (after inlining):

for.body.us.i:                                    ; preds = %for.cond14.for.cond.cleanup16_crit_edge.us.i, %entry
  %indvars.iv.i = phi i64 [ 0, %entry ], [ %indvars.iv.next.i, %for.cond14.for.cond.cleanup16_crit_edge.us.i ]
  %no_sext = mul nuw nsw i64 %indvars.iv.i, 961 
  %add.ptr8.us.i = getelementptr inbounds float, float* %output, i64 %no_sext

Because the sext is not eliminated in this case, it eventually leads to shl and
ashr instructions that are combined with the mul, producing inefficient code.
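
The equivalence this request relies on can be sketched in C. This is an
illustration only, not the attached test.c: the function names are made up,
the trip count of 96 is an assumption read off the function name
PadInput_96_27_27_2, and "int" and "long long" are assumed to be 32 and 64
bits wide. The narrow form mirrors trunc, mul i32, sext; the wide form mirrors
the single mul i64. The two agree only while the 32-bit product stays in
range.

```c
/* Hypothetical sketch, not the attached test.c; all names are illustrative. */

/* 32-bit index arithmetic followed by widening (trunc, mul i32, sext). */
long long offset_narrow(long long iv, int stride) {
    int i = (int)iv;            /* trunc i64 to i32 */
    int product = stride * i;   /* mul i32; must not overflow */
    return (long long)product;  /* sext i32 to i64 */
}

/* 64-bit arithmetic throughout (mul i64, no sext needed). */
long long offset_wide(long long iv, int stride) {
    return iv * (long long)stride;
}

/* Returns 1 if both forms agree for every iv in [0, trip_count), else 0. */
int offsets_agree(long long trip_count, int stride) {
    for (long long iv = 0; iv != trip_count; ++iv) {
        if (offset_narrow(iv, stride) != offset_wide(iv, stride))
            return 0;
    }
    return 1;
}
```

With a stride of 961 and an assumed 96 iterations, the largest product is
91295, far inside the i32 range, so both forms give the same offset.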

The AArch64 backend may generate efficient code for this, but the issue applies
to any target whose int size is 32 bits and pointer size is 64 bits.</pre>
        </div>
      </p>


    </body>
</html>