<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/96838>96838</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Sinking load instructions results in worse performance and increased dynamic instruction counts
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          kazutakahirata
      </td>
    </tr>
</table>

<pre>
SimplifyCFG recently gained the change in https://github.com/llvm/llvm-project/commit/ede27d8d391e3917a5aa25be7903cabde4303a66 by @nikic. It seems to backfire in some cases:

Compile the attached [bcmp.ll](https://github.com/user-attachments/files/16006666/bcmp.txt) (generated from `llvm-project/libc/src/string/bcmp.cpp`) like so:

```
$ clang -O3 -S bcmp.ll -o bcmp.s
```

Then I get:

```
Without the patch:

# %bb.26:
        movdqu  -16(%rdi,%rdx), %xmm0
        movdqu  -16(%rsi,%rdx), %xmm1
        jmp .LBB1_34
:
:
:
# %bb.33:
        movdqu  (%rdi,%rdx), %xmm0
        movdqu  (%rsi,%rdx), %xmm1
.LBB1_34: # %.loopexit
```

```
With the patch:

# %bb.25:
        addq    %rdx, %rdi
        addq    $-16, %rdi
        addq    %rdx, %rsi
        addq    $-16, %rsi
        jmp .LBB1_33
:
:
:
# %bb.32:
        addq    %rdx, %rdi
        addq    %rdx, %rsi
.LBB1_33:                               # %.loopexit.sink.split
        movdqu  (%rdi), %xmm0
        movdqu  (%rsi), %xmm1
```

Notice that the two load instructions sink just below the join point while the address calculation is left behind.  This seems to result in worse performance and increased dynamic instruction counts.
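
For illustration only, here is a rough source-level sketch of the two shapes. It is not taken from `bcmp.cpp`: `Block16`, `load16`, and both function names are hypothetical, and a compiler is free to turn either form into the other. In the first function each branch performs its own load, so the offset can fold into the load's addressing mode. In the second, each branch only computes an address and the single load sits below the join point, which matches the "with the patch" shape above.

```cpp
// Hypothetical 16-byte value standing in for an xmm register.
struct Block16 {
  unsigned char bytes[16];
};

// Hypothetical helper standing in for an unaligned 16-byte load (movdqu).
static Block16 load16(const unsigned char *p) {
  Block16 b;
  for (int i = 0; i != 16; ++i)
    b.bytes[i] = p[i];
  return b;
}

// Shape without sinking: one load per branch, each with its own address
// expression that the backend can fold into the load.
Block16 load_in_each_branch(const unsigned char *p, unsigned long n, bool tail) {
  Block16 v;
  if (tail)
    v = load16(p + n - 16);
  else
    v = load16(p + n);
  return v;
}

// Shape with sinking: the branches only compute the address, and the one
// remaining load executes after the join point.
Block16 load_after_join(const unsigned char *p, unsigned long n, bool tail) {
  const unsigned char *addr;
  if (tail)
    addr = p + n - 16;
  else
    addr = p + n;
  return load16(addr);
}
```

Both functions return the same bytes. The difference is only in where the load and the address arithmetic end up relative to the join.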

</pre>