<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/90985">90985</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [X86] Worse runtime performance on Zen CPU when optimizing for Zen
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          Systemcluster
      </td>
    </tr>
</table>

<pre>
    The following code, compiled with `-O3 -march=znver4` (or any other `znver` target), runs noticeably slower on Zen hardware (roughly 16% in the run shown below) than when compiled with `-O3 -march=x86-64-v4` or the baseline `x86-64`.

```c
bool check_prime(int64_t n) {
    if (n < 2) {
        return true;
    }
    int64_t lim = (int64_t)ceil((double)n / 2.0);
    for (int64_t i = 2; i < lim; i++) {
        if (n % i == 0) {
            return false;
        }
    }
    return true;
}
```
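
For reference, a minimal sketch of what this function reports for small inputs. The names `check_prime_int` and `count_reported` are hypothetical and not part of the benchmark. The sketch swaps `ceil((double)n / 2.0)` for the integer expression `(n + 1) / 2`, which I assume gives the same bound for the non-negative inputs used here, so that it builds without the math library. Since values below 2 return `true` and the loop body never runs for `n == 4`, the inputs 0, 1 and 4 are counted as well, which is why the count printed below is 78501 rather than the 78498 primes below 1,000,000.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical integer-only variant of check_prime.
 * Assumption: for 2 <= n (well within the range exactly representable
 * as a double), (n + 1) / 2 equals (int64_t)ceil((double)n / 2.0). */
bool check_prime_int(int64_t n) {
    if (n < 2) {
        return true; /* same early exit as the original: 0 and 1 pass */
    }
    int64_t lim = (n + 1) / 2;
    for (int64_t i = 2; i < lim; i++) {
        if (n % i == 0) {
            return false;
        }
    }
    return true;
}

/* Hypothetical helper: how many values in [0, limit) pass the check. */
int count_reported(int64_t limit) {
    int count = 0;
    for (int64_t n = 0; n < limit; n++) {
        if (check_prime_int(n)) {
            count++;
        }
    }
    return count;
}
```

Calling `count_reported(10)` counts 0, 1, 2, 3, 4, 5 and 7. This only illustrates the result value; the performance difference reported here concerns the original function as written above.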

<details>
<summary>Full code</summary>

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <math.h>
#include <time.h>

bool check_prime(int64_t n) {
    if (n < 2) {
        return true;
    }
    int64_t lim = (int64_t)ceil((double)n / 2.0);
    for (int64_t i = 2; i < lim; i++) {
        if (n % i == 0) {
            return false;
        }
    }
    return true;
}

int main() {
    clock_t now = clock();
    int sum = 0;
    for (int i = 0; i < 1000000; i++) {
        if (check_prime(i)) {
            sum += 1;
        }
    }
    printf("%f seconds, %d\n", (double)(clock() - now) / CLOCKS_PER_SEC, sum);
    return 0;
}
```

</details>

Running on a Ryzen 7950X:

```cmd
> clang.exe -std=c11 -O3 -march=znver4 ./src/perf.c && ./a.exe
24.225000 seconds, 78501

> clang.exe -std=c11 -O3 -march=x86-64-v4 ./src/perf.c && ./a.exe
20.866000 seconds, 78501

> clang.exe -std=c11 -O3 ./src/perf.c && ./a.exe
20.819000 seconds, 78501
```
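
The relative slowdown implied by these timings can be checked with a small helper. This is only a sketch: `slowdown_percent` is a hypothetical name, and the inputs are the times copied from the output above (a single run each, so some noise is to be expected).

```c
/* Hypothetical helper: relative slowdown of `slow` versus `fast`, in percent.
 * Both arguments are run times in seconds; `fast` must be non-zero. */
double slowdown_percent(double slow, double fast) {
    return (slow - fast) / fast * 100.0;
}
```

With the numbers above, `slowdown_percent(24.225, 20.866)` and `slowdown_percent(24.225, 20.819)` both come out at roughly 16%.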

```cmd
> clang.exe --version
clang version 18.1.4
Target: x86_64-pc-windows-msvc
Thread model: posix
InstalledDir: C:\Program Files\LLVM\bin
```

Disassembly here: https://godbolt.org/z/orssnKP74

I originally noticed the issue with Rust: https://godbolt.org/z/Kh1v3G74K
</pre>