<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/58415>58415</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Missed SSE optimizations: properties of math functions and more
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          geometrian
      </td>
    </tr>
</table>

<pre>
    Consider the following (compile with `-std=c++2b -O3 -fno-math-errno`; all tests run with Clang 16.0.0 [from trunk](https://github.com/llvm/llvm-project/tree/5ea3155565fb724966d0e38c09ff7b784225018f)):
```cpp
#include <cmath>

float test(float radicand, float sign) {
        float radical = std::sqrt(radicand);
        return std::copysign(radical, sign);
}
```
This produces, in substance:
```asm
sqrtss  xmm0, xmm0
andps   xmm1, xmmword ptr [rip + .LCPI0_0]   #0x80000000, signbit mask
andps   xmm0, xmmword ptr [rip + .LCPI0_1]   #0x7fffffff, rest-of-float mask
orps    xmm0, xmm1
```
As we know, the real-valued square root requires, and produces, a non-negative number. Yet the result is still `andps`ed to clear the sign bit, which appears to be pointless.
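
To make the claim concrete, here is a rough sketch of the two bit-level sequences side by side. The helper names (`bits_of`, `float_of`, `test_masked`, `test_unmasked`) are hypothetical and exist only for illustration; `std::memcpy` stands in for `std::bit_cast` so the sketch also builds before C++20, and a 32-bit IEEE 754 `float` is assumed.
```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helpers (not from the report): reinterpret float <-> bits.
std::uint32_t bits_of(float f) {
        std::uint32_t u;
        std::memcpy(&u, &f, sizeof u);
        return u;
}
float float_of(std::uint32_t u) {
        float f;
        std::memcpy(&f, &u, sizeof f);
        return f;
}

// Mirrors the emitted code: clear the root's sign bit, then OR in the
// sign bit taken from `sign`.
float test_masked(float radicand, float sign) {
        std::uint32_t r = bits_of(std::sqrt(radicand)) & 0x7FFFFFFFu;
        std::uint32_t s = bits_of(sign) & 0x80000000u;
        return float_of(r | s);
}

// The shorter sequence: no 0x7FFFFFFF mask on the root.
// Assumption: radicand > 0, so the root's sign bit is already clear.
float test_unmasked(float radicand, float sign) {
        assert(radicand > 0.0f);
        std::uint32_t r = bits_of(std::sqrt(radicand));
        std::uint32_t s = bits_of(sign) & 0x80000000u;
        return float_of(r | s);
}
```
For any `radicand > 0`, the two should agree bit for bit, e.g. when comparing `bits_of(test_masked(4.0f, -1.0f))` with `bits_of(test_unmasked(4.0f, -1.0f))`.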

---

Now, for negative numbers, `sqrtss` returns a NaN, in fact a -NaN. Empirically, negative inputs produce 0xFFC00000 regardless of magnitude, while NaN arguments and `-0.0f` are passed through unchanged. Perhaps Clang is trying to preserve such behavior?
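
The empirical observations above can be reproduced with something like the following sketch. The names `sqrt_bits` and `sign_bit_set` are hypothetical, and a 32-bit IEEE 754 `float` is assumed. The `volatile` read is there to discourage the compiler from folding the call at compile time, so that the value comes from the actual square-root instruction (or library call).
```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: bit pattern of sqrt(x).
std::uint32_t sqrt_bits(float x) {
        volatile float input = x;      // discourage compile-time folding
        float r = std::sqrt(input);
        std::uint32_t u;
        std::memcpy(&u, &r, sizeof u);
        return u;
}

// Hypothetical helper: true if bit 31 (the sign bit) is set.
bool sign_bit_set(std::uint32_t bits) {
        return (bits & 0x80000000u) != 0u;
}
```
Printing `sqrt_bits(-1.0f)` and `sqrt_bits(-100.0f)` in hex on an x86-64 build gives values to compare against the 0xFFC00000 reported above; the sign of the NaN may differ on other targets. `sign_bit_set(sqrt_bits(-0.0f))` shows the `-0.0f` pass-through.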

In the likely event that we don't care about those cases, we can try adding assume statements. However, none of the following, for example, appears to have any effect:
```cpp
// Note: this also ensures that radicand is not NaN.
// Note: `>` is used instead of `>=` to show that -0 is not what trips this up.
__builtin_assume(radicand>0.0f);
```
```cpp
__builtin_assume(radical>0.0f); //(similar note)
```
```cpp
__builtin_assume(!std::isnan(radical));
```
The following don't work either (while also producing an erroneous `-Wassume` warning `the argument to '__builtin_assume' has side effects that will be discarded [-Wassume]`; that's probably something that should be looked at):
```cpp
__builtin_assume( (std::bit_cast<unsigned>(radical)&0x7FFFFFFFu) == std::bit_cast<unsigned>(radical) );
```
```cpp
__builtin_assume( (std::bit_cast<unsigned>(radical)&0x80000000u) == 0u );
```
In fact, all of these together still have no effect.
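
For reference, here is a sketch of how the float-based assumptions above might be combined in the function. The `ASSUME` macro and the name `test_with_assumes` are hypothetical; the macro exists only because `__builtin_assume` is a Clang builtin, so it expands to nothing on other compilers. The `std::bit_cast` variants are left out so that the sketch also builds before C++20.
```cpp
#include <cmath>

// Hypothetical wrapper macro (not from the report): use the Clang builtin
// where available, otherwise do nothing.
#if defined(__clang__)
#define ASSUME(cond) __builtin_assume(cond)
#else
#define ASSUME(cond) ((void)0)
#endif

float test_with_assumes(float radicand, float sign) {
        ASSUME(radicand > 0.0f);        // excludes negatives, -0 and NaN
        float radical = std::sqrt(radicand);
        ASSUME(radical > 0.0f);         // same claim, stated on the result
        ASSUME(!std::isnan(radical));
        return std::copysign(radical, sign);
}
```
Compiling this with the flags given at the top and inspecting the assembly is one way to check whether the second `andps` is still emitted.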

---

The impact in this particular case might appear low—a single useless bitop, adding a cycle if they can't execute superscalar (e.g. if the other `andps` can be moved up) and the cost of storing the mask.

The reason I'm reporting this is that I think this *class* of missed opportunity is likely widespread throughout the implementation in analogous cases. Many functions in the math library and the broader standard library assume bounded inputs, and probably even more produce bounded outputs. Small optimizations can combine nonlinearly, so that together they exceed the sum of their individual contributions; considered in that light, the missed opportunity is probably quite significant, especially for math-heavy code (such as I am interested in).

The language-anointed solution to such problems will likely be contracts, but those don't exist yet, and it's not like language design changes what math is. At the end of the day, the compiler changes are the same anyway: enforcing contracts on math functions and pumping that information into the optimizer.
</pre>