<html>

    <head>

      <base href="https://bugs.llvm.org/">

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW - [MC][llvm-mca] Teach how to identify instructions that have a false dependency on the destination register."

   href="https://bugs.llvm.org/show_bug.cgi?id=38813">38813</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>[MC][llvm-mca] Teach how to identify instructions that have a false dependency on the destination register.

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>new-bugs

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>unspecified

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>Windows NT

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>enhancement

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>new bugs

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>andrea.dibiagio@gmail.com

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>llvm-bugs@lists.llvm.org

          </td>

        </tr></table>

      <p>

        <div>

        <pre>Some instructions have a false dependency on the destination register.

For example, on some Intel CPUs, there is a false dependency on the

LZCNT/TZCNT/POPCNT destination register. See also <a class="bz_bug_link 

          bz_status_REOPENED "

   title="REOPENED - Clang is not aware of a false dependency of LZCNT, TZCNT, POPCNT on destination register on some Intel CPUs"

   href="show_bug.cgi?id=33869">bug 33869</a>.

That false dependency can be broken using a dep-breaking zero idiom.

On BtVer2, there is a similar issue with general purpose zero/sign extending

MOV instructions.

For example:

   movzbl %al, %esi

   movzbl %al, %esi

   movzbl %al, %esi

   movzbl %al, %esi

`perf stat` reports a throughput of 1.00 instructions per cycle, even though,

ideally, a movz could be issued to either of two pipelines, and the CPU can

dispatch two COPs per cycle.

Same for movzwl:

   movzwl %al, %esi

   movzwl %al, %esi

   movzwl %al, %esi

   movzwl %al, %esi

`perf stat` still reports 1.00 IPC.

If we instead test this:

   movzwl %al, %esi

   movzwl %al, %ecx

   movzwl %al, %edx

   movzwl %al, %ebx

Then the throughput is 2.00 IPC (as expected).

The same issue occurs with sign-extending GPR moves.
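
As a rough illustration of why a dependency on the destination caps the
throughput at 1.00 IPC, here is a toy Python model. It is not llvm-mca and not
a BtVer2 model: it assumes an issue width of 2 and a latency of 1 cycle for
every movz, and the function name and parameters are made up for this sketch.

```python
# Toy model only: issue width, latency and all names are assumptions.

def simulate_ipc(dests, iterations=1000, width=2, latency=1, false_dep=True):
    """Return instructions per cycle for a loop of 'movz %al, DEST'."""
    ready = {"al": 0}   # cycle at which each register value becomes available
    issued = {}         # cycle -> number of instructions issued in that cycle
    first_free = 0      # earliest cycle that still has a free issue slot
    total = 0
    last_cycle = 0
    for _ in range(iterations):
        for dest in dests:
            # The true dependency is on the source register only.
            start = ready["al"]
            if false_dep:
                # False dependency: also wait for the previous writer of dest.
                start = max(start, ready.get(dest, 0))
            start = max(start, first_free)
            # Find the first cycle at or after 'start' with a free slot.
            while issued.get(start, 0) == width:
                start += 1
            issued[start] = issued.get(start, 0) + 1
            while issued.get(first_free, 0) == width:
                first_free += 1
            ready[dest] = start + latency
            last_cycle = max(last_cycle, ready[dest])
            total += 1
    return total / last_cycle

same_dest = simulate_ipc(["esi", "esi", "esi", "esi"])
distinct_dest = simulate_ipc(["esi", "ecx", "edx", "ebx"])
print(same_dest, distinct_dest)
```

Under these assumptions the model gives 1.00 IPC when every movz writes %esi
and 2.00 IPC when the four destinations differ, which is consistent with the
`perf stat` numbers above.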

--

In the X86 backend, we currently use special feature flags to mark Intel

processors that have a false dependency on LZCNT/TZCNT/POPCNT. That information

is then used to bias the result of X86InstrInfo::hasPartialRegUpdate() queries

(<a href="https://reviews.llvm.org/D40334">https://reviews.llvm.org/D40334</a>).

The goal of this bug is to teach llvm-mca about the existence of instructions

that have a false dependency on their output register. Ideally, we would like

to have a general framework for doing queries on the subtarget.

Rather than having a target feature flag for every problematic instruction, we

should expose knowledge about instructions that are subject to that extra false

dependency using target-independent hooks (which can then be overridden

by each target that wants to change their semantics).

Ideally, knowledge about instructions with a false dependency could be

automatically generated via tablegen (similarly to how, for example, we generate

information for variant scheduling classes in the scheduling models), so that

each subtarget/processor model may specify the set of "problematic"

instructions.

Tablegen backends would then generate useful TII/STI hook overrides for us.

This would make that knowledge accessible through a target independent

interface from any codegen pass in the backend.
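
A minimal sketch of the shape this could take, written in Python for brevity
rather than as real LLVM code. Every name is hypothetical: the class names,
has_false_dep_on_dest, and the "example-intel-cpu" key are placeholders, the
opcode strings are only illustrative, and the dictionary stands in for a table
that tablegen would emit for each processor model.

```python
# Hypothetical sketch: none of these classes or methods exist in LLVM.

class SubtargetInfo:
    """Target-independent interface; by default no false dependency."""

    def has_false_dep_on_dest(self, opcode):
        return False


class TableDrivenSubtarget(SubtargetInfo):
    """Stands in for a tablegen-generated override, one table per processor."""

    # Placeholder tables; real contents would come from the scheduling models.
    FALSE_DEP_OPCODES = {
        "btver2": frozenset({
            "MOVZX32rr8", "MOVZX32rr16", "MOVSX32rr8", "MOVSX32rr16",
        }),
        "example-intel-cpu": frozenset({
            "LZCNT32rr", "TZCNT32rr", "POPCNT32rr",
        }),
    }

    def __init__(self, cpu):
        self.cpu = cpu

    def has_false_dep_on_dest(self, opcode):
        # Unknown processors fall back to an empty set.
        table = self.FALSE_DEP_OPCODES.get(self.cpu, frozenset())
        return opcode in table


sti = TableDrivenSubtarget("btver2")
print(sti.has_false_dep_on_dest("MOVZX32rr8"))
print(sti.has_false_dep_on_dest("POPCNT32rr"))
```

A client such as llvm-mca would only call the base-class query and would not
need to know which feature flags or processor names are behind the answer.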

This would not only help llvm-mca, but it would probably also help simplify some

code (at least in the X86 backend) which currently relies heavily on the

presence of target feature flags and target-specific functions.

This same framework could be used to expose partial write stalls caused by

false dependencies on the output register in the presence of SSE1/SSE2

sqrt/rsqrt/rcp instructions. See also the discussion on

<a href="https://reviews.llvm.org/D51542">https://reviews.llvm.org/D51542</a>.</pre>

        </div>

      </p>


    </body>

</html>