<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - [SchedModel][MCA] Improve handling of load uOPs and read-advance."
   href="https://bugs.llvm.org/show_bug.cgi?id=51557">51557</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>[SchedModel][MCA] Improve handling of load uOPs and read-advance.
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>tools
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Windows NT
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>llvm-mca
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>andrea.dibiagio@gmail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>andrea.dibiagio@gmail.com, llvm-bugs@lists.llvm.org, matthew.davis@sony.com
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Example:

```
vmulps 112(%rsp), %xmm14, %xmm14
vpermilps $85, %xmm14, %xmm14
```

<span class="quote">&gt; llvm-mca -mcpu=skylake -iterations=2 -timeline</span>

```
Timeline view:
                    0123456789    
Index     0123456789          0123

[0,0]     DeeeeeeeeeeER  .    .  .   vmulps     112(%rsp), %xmm14, %xmm14
[0,1]     D==========eER .    .  .   vpermilps  $85, %xmm14, %xmm14
[1,0]     D==========eeeeeeeeeeER.   vmulps     112(%rsp), %xmm14, %xmm14
[1,1]     D====================eER   vpermilps  $85, %xmm14, %xmm14
```

However, the expected timeline looks like this:

```
Timeline view:
                    0123456789    
Index     0123456789          0123

[0,0]     DeeeeeeeeeeER  .    .  vmulps     112(%rsp), %xmm14, %xmm14
[0,1]     D==========eER .    .  vpermilps  $85, %xmm14, %xmm14
[1,0]     D=====eeeeeeeeeeER  .  vmulps     112(%rsp), %xmm14, %xmm14
[1,1]     D===============eER .  vpermilps  $85, %xmm14, %xmm14
```


mca doesn't schedule the second vmulps any earlier because the write-back cycle
for register XMM14 is unknown until cycle 11.

One of the biggest limitations in LLVM is the inability to independently
simulate the individual micro-opcodes (uOPs) of an instruction.

For a simulator like mca, this means that memory uOPs cannot be accurately
tracked. This is the main reason why instructions with memory operands are
often poorly simulated.

ReadAdvance was originally introduced to work around the inability to process
the individual uOPs of an instruction. However, in order to work, read-advance
still requires that the write-back cycle of the input register definition is
known.

In this particular example, the write-back cycle of the first VPERMILPS is
unknown until cycle 11. Therefore, the write-back cycle of XMM14 is also
unknown until then, and the read-advance in VMULPS can only trigger at that
point.

That is what prevents the VMULPS from starting earlier.
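
The difference between the two timelines can be sketched with a toy model.
This is only an illustration and not how mca is implemented. It assumes a
10-cycle latency for the vmulps with a folded load, a 1-cycle latency for
vpermilps, and a read-advance of 6 cycles on the register operand of vmulps.
Those values were picked to reproduce the timelines above; they are not read
from the Skylake model. The function name and the `needs_known_writeback`
flag are hypothetical.

```python
# Toy model of the two timelines above. All numbers are assumptions chosen to
# match this example; they are not read from the Skylake scheduling model.
LATENCY = {"vmulps": 10, "vpermilps": 1}      # issue-to-result cycles
READ_ADVANCE = {"vmulps": 6, "vpermilps": 0}  # cycles by which the XMM14 read is delayed

def issue_cycles(program, needs_known_writeback):
    """Return the issue cycle of each instruction in a single XMM14 chain."""
    issued = []
    prev_issue = None  # cycle at which the previous writer of XMM14 issued
    prev_ready = None  # cycle at which its result becomes available
    for name in program:
        cycle = 1      # everything is dispatched at cycle 0
        if prev_ready is not None:
            # Read-advance lets the consumer start before the result is ready.
            cycle = max(cycle, prev_ready - READ_ADVANCE[name])
            if needs_known_writeback:
                # The write-back cycle is only known once the producer issued.
                cycle = max(cycle, prev_issue)
        issued.append(cycle)
        prev_issue = cycle
        prev_ready = cycle + LATENCY[name]
    return issued

program = ["vmulps", "vpermilps"] * 2
print(issue_cycles(program, needs_known_writeback=True))   # [1, 11, 11, 21]
print(issue_cycles(program, needs_known_writeback=False))  # [1, 11, 6, 16]
```

With the extra constraint, the second vmulps issues at cycle 11, as in the
reported timeline. Without it, the second vmulps issues at cycle 6, as in the
expected timeline.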

There might be ways to partially work around this issue in mca. However, I am
afraid that a proper solution would require changes to the scheduling model
and to how read-advance for memory load operands is defined.

Depending on how we decide to address this issue, this bug could potentially
have an impact on <a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - [llvm-mca] LSUnit: Consider using field `LoadLatency` from MCSchedModel to simulate the latency of load instructions."
   href="show_bug.cgi?id=39829">bug 39829</a> and <a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - [llvm-mca] Investigate how to improve the load/store queue usage simulation in LSUnit."
   href="show_bug.cgi?id=39830">bug 39830</a>.</pre>
        </div>
      </p>


    </body>
</html>