<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/80528>80528</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Missed optimization with inline assembly (chooses memory over register)
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          geometrian
      </td>
    </tr>
</table>

<pre>
    Consider the following simple example:
```cpp
//Compile with "clang -O3 -masm=intel"

#include <stdint.h>

union A_sRGB { struct { uint8_t a, r,g,b; }; uint32_t packed; };
union sRGB_A { struct { uint8_t r,g,b, a; }; uint32_t packed; };

A_sRGB rgba_to_argb( sRGB_A srgba ) {
        // R, G, B, A   ->   A, R, G, B
        A_sRGB ret;
        asm(
                "rorx   %0,%1, 24\n"
                : "=r"(ret.packed)
                : "rm"(srgba.packed)
                :
        );
        return ret;
}
```
As written, this will produce:
```x86asm
rgba_to_argb(sRGB_A):                # @rgba_to_argb(sRGB_A)
        mov     dword ptr [rsp - 4], edi
        rorx    eax, dword ptr [rsp - 4], 24
        ret
```
I.e., there is a pointless store of `edi` to the stack, immediately followed by a reload from it.

It comes down to the constraint on the input, `"rm"(srgba.packed)`, which says that either a register or a memory location is allowed. It seems that whenever "m" is present, clang chooses it over the register, even when the value already lives in a register and has to be spilled to memory first. This is a weird pessimization.

FWIW, GCC does not make this mistake and emits the expected single instruction:
```x86asm
rgba_to_argb(sRGB_A):
        rorx   eax,edi, 24
        ret
```
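
A possible workaround, sketched under some assumptions: dropping the "m" alternative so that the input constraint is just `"r"` should leave clang no memory option to pick. The helper name `rotr24_reg_only` is hypothetical, and the `{AT&T|Intel}` dialect alternatives and the `__BMI2__` guard with a portable fallback are additions that are not part of the original example; they are only there so the snippet builds with or without `-masm=intel` and `-mbmi2`.
```cpp
#include <stdint.h>

// Hypothetical helper (not from the original example): rotate right by 24
// using a register-only input constraint.
static inline uint32_t rotr24_reg_only( uint32_t x ) {
#if defined(__x86_64__) && defined(__BMI2__)
        uint32_t ret;
        asm(
                // {AT&T|Intel} dialect alternatives, so the template
                // assembles with or without -masm=intel.
                "rorx   {$24, %1, %0|%0, %1, 24}\n"
                : "=r"(ret)
                : "r"(x)   // "r" only: no memory alternative to choose
                :
        );
        return ret;
#else
        // Portable fallback when BMI2 is not enabled at compile time.
        return ( x >> 24 ) | ( x << 8 );
#endif
}
```
The trade-off is that a value which really does live in memory now has to be loaded into a register before the `rorx`, rather than being used as a memory operand directly.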
</pre>