<html>

    <head>

      <base href="https://bugs.llvm.org/">

    </head>

    <body><span class="vcard"><a class="email" href="mailto:martin@martin.st" title="Martin Storsjö <martin@martin.st>"> <span class="fn">Martin Storsjö</span></a>

</span> changed

          <a class="bz_bug_link 

          bz_status_RESOLVED  bz_closed"

   title="RESOLVED INVALID - llvm-mca for cortex-a57 gets thrown off by SIMD loads with dependencies (negative latency?)"

   href="https://bugs.llvm.org/show_bug.cgi?id=49499">bug 49499</a>

          <br>

             <table border="1" cellspacing="0" cellpadding="8">

          <tr>

            <th>What</th>

            <th>Removed</th>

            <th>Added</th>

          </tr>


         <tr>

           <td style="text-align:right;">Resolution</td>

           <td>---

           </td>

           <td>INVALID

           </td>

         </tr>


         <tr>

           <td style="text-align:right;">Status</td>

           <td>NEW

           </td>

           <td>RESOLVED

           </td>

         </tr></table>

      <p>

        <div>

            <b><a class="bz_bug_link 

          bz_status_RESOLVED  bz_closed"

   title="RESOLVED INVALID - llvm-mca for cortex-a57 gets thrown off by SIMD loads with dependencies (negative latency?)"

   href="https://bugs.llvm.org/show_bug.cgi?id=49499#c2">Comment # 2</a>

              on <a class="bz_bug_link 

          bz_status_RESOLVED  bz_closed"

   title="RESOLVED INVALID - llvm-mca for cortex-a57 gets thrown off by SIMD loads with dependencies (negative latency?)"

   href="https://bugs.llvm.org/show_bug.cgi?id=49499">bug 49499</a>

              from <span class="vcard"><a class="email" href="mailto:martin@martin.st" title="Martin Storsjö <martin@martin.st>"> <span class="fn">Martin Storsjö</span></a>

</span></b>

        <pre>(In reply to Andrea Di Biagio from <a href="show_bug.cgi?id=49499#c1">comment #1</a>)

<span class="quote">> tl;dr: there is no negative latency. WAW dependencies are effectively broken

> by the register renamer.

> 

> Cortex-a57 is an out-of-order processor. The default llvm-mca pipeline for

> out-of-order processors assumes the presence of a register renamer.

> 

> It means that false dependencies are effectively broken by the register

> renamer at the cost of consuming a physical register.

> 

> As far as I understand, each ADD has a latency of 3cy. Also, ADD instructions

> are in a dependency chain. When simulating multiple iterations, there is an

> implicit loop-carried dependency (i.e., the first ADD of an iteration must

> wait for the result from the last ADD of a previous iteration). That's why

> latency converges to 900cy for the first experiment.

> 

> In the second experiment, you have inserted a load which writes the same

> registers defined by the following ADD instructions.

> The LD1 introduces new definitions for v0.16b, v1.16b, v2.16b, v3.16b.

> There is a WAW dependency on each of those registers. In the absence of

> register renaming, that load would need to wait until those registers are

> written. In practice, however, the register renamer breaks those

> dependencies, so the LOAD doesn't need to wait on those definitions.

> 

> The throughput of LD1 is still limited (roughly one LD1 every 4

> cycles). Therefore, every 4 cycles, the first ADD of a new iteration

> can start execution. That's how you end up with that low number of cycles.

> 

> The last example is just like the first, with the extra LD1. The LD1 is

> independent from the other instructions, so it can always execute as soon as

> the units are available.

> 

> NOTE: by default, llvm-mca assumes that register renaming is always

> successful (i.e. as if there is an unbounded number of phys registers

> available for renaming). Renaming can be limited by introducing an (optional)

> `RegisterFile` definition in the scheduling model. For an example of

> `RegisterFile`, see the definition of `JIntegerPRF` in

> X86/X86ScheduleBtver2.td.</span>


Thanks for the thorough explanation! That does indeed explain it, and by

setting e.g. `--iterations 1`, I also see numbers that match up better with my

expectations.</pre>
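<pre>
A minimal sketch of the reasoning in comment #1, not llvm-mca's actual
algorithm. It assumes three dependent ADDs per iteration (inferred from 900cy
over the default 100 iterations at 3cy per ADD), one LD1 issued every 4 cycles,
and a placeholder LD1 latency of 5 cycles. The function names are hypothetical.

```python
ADD_LATENCY = 3    # cycles per ADD, as stated in comment #1
ADDS_PER_ITER = 3  # assumption: inferred from 900cy / 100 iterations / 3cy
LD1_INTERVAL = 4   # assumption: roughly one LD1 can issue every 4 cycles
LD1_LATENCY = 5    # assumption: placeholder value, not taken from the model
ITERATIONS = 100   # llvm-mca's default iteration count


def chained_cycles(iterations, adds_per_iter, add_latency):
    """First experiment: every ADD waits for the previous ADD,
    including the last ADD of the previous iteration."""
    ready = 0
    for _ in range(iterations):
        for _ in range(adds_per_iter):
            ready += add_latency
    return ready


def renamed_cycles(iterations, adds_per_iter, add_latency,
                   load_interval, load_latency):
    """Second experiment: renaming breaks the WAW dependency, so each
    LD1 starts a fresh chain; only LD1 throughput links iterations."""
    finish = 0
    for i in range(iterations):
        load_issue = i * load_interval
        chain_done = load_issue + load_latency + adds_per_iter * add_latency
        finish = max(finish, chain_done)
    return finish


if __name__ == "__main__":
    print(chained_cycles(ITERATIONS, ADDS_PER_ITER, ADD_LATENCY))
    print(renamed_cycles(ITERATIONS, ADDS_PER_ITER, ADD_LATENCY,
                         LD1_INTERVAL, LD1_LATENCY))
```

Under these assumptions the chained case gives 900 cycles, while the renamed
case is bounded by the LD1 issue interval rather than by the ADD chain.
</pre>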

        </div>

      </p>



    </body>

</html>