<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/100383">100383</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            `int64` modulo-by-constant (`x % 3`) and divide-by-constant (`x / 3`) compile to 80 instructions.
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            backend:AMDGPU,
            llvm:codegen,
            performance
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          bjacob
      </td>
    </tr>
</table>

<pre>
This is observed with `-x hip` targeting AMD MI300 (`gfx942`).

Compiler Explorer link: https://godbolt.org/z/xrfhhaaeY. For completeness, the clang flags are `-O3 --cuda-device-only -x hip  -nogpuinc -nogpulib --offload-arch=gfx942`.

Testcase:

```c++
__attribute__((device))
int64_t a(int64_t i) {
    return i % 3;
}
```

This compiles to 80 instructions.

By contrast, the same testcase with `int64_t` replaced by `int32_t` compiles to just 8 instructions.

I was expecting the `int64` variant to generate slightly over 2x more instructions than the `int32` variant (since the target requires rewriting `int64` ops into pairs of `int32` ops), not 10x more.

The above Compiler Explorer link shows the same happening with `i / 3` instead of `i % 3`.
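
For comparison, here is a minimal host-side sketch of the multiply-by-magic-constant lowering commonly used for signed division by 3, with the 64-bit high multiply built from 32-bit partial products. It only illustrates why a short sequence seems plausible; it is not what the AMDGPU backend emits, and the names `mulhi_u64`, `div3_i64` and `rem3_i64` are hypothetical helpers made up for this example.

```c++
// Host-side sketch only; all helper names here are hypothetical.
using u64 = unsigned long long;
using i64 = long long;
static_assert(sizeof(u64) == 8, "sketch assumes a 64-bit long long");

// High 64 bits of the unsigned 64x64 product, using four 32x32 -> 64 multiplies.
constexpr u64 mulhi_u64(u64 a, u64 b) {
    const u64 mask = 0xffffffffull;
    u64 a_lo = a & mask, a_hi = a >> 32;
    u64 b_lo = b & mask, b_hi = b >> 32;
    u64 lo_lo = a_lo * b_lo;
    u64 hi_lo = a_hi * b_lo;
    u64 lo_hi = a_lo * b_hi;
    u64 hi_hi = a_hi * b_hi;
    // Sum of the middle column; this cannot overflow 64 bits.
    u64 cross = (lo_lo >> 32) + (hi_lo & mask) + lo_hi;
    return hi_hi + (hi_lo >> 32) + (cross >> 32);
}

// Truncating signed division by 3 via the magic constant ceil(2^64 / 3).
constexpr i64 div3_i64(i64 x) {
    const u64 magic = 0x5555555555555556ull;
    u64 ux = (u64)x;
    u64 hi = mulhi_u64(ux, magic);
    if (x < 0) hi -= magic;  // turn the unsigned high half into the signed one
    hi += ux >> 63;          // add 1 for negative inputs to round toward zero
    return (i64)hi;
}

// Remainder recovered as x - 3 * (x / 3), computed with wrapping arithmetic.
constexpr i64 rem3_i64(i64 x) {
    return (i64)((u64)x - (u64)div3_i64(x) * 3u);
}

static_assert(rem3_i64(-7) == -1, "matches C++ truncating %");
```

The sketch needs four 32-bit multiplies plus a handful of adds, shifts and one select, which is the rough scale I had in mind for the `int64` case.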
</pre>