<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/71223>71223</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [CUDA][HIP] fails to compile __int128 division
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          yxsamliu
      </td>
    </tr>
</table>

<pre>
    Currently, clang can lower `__int128` add/subtract/multiply operations for nvptx and amdgpu. However, it lowers `__int128` division to the compiler-rt library call `__divti3`. compiler-rt does not currently support the nvptx or amdgpu targets. Even if it did, the amdgpu backend does not support ISA-level linking, so it could not link compiler-rt after LLVM codegen.

failure on amdgpu: https://godbolt.org/z/4oqPoYGG9

failure on nvptx: https://godbolt.org/z/411M3x4Eh

`__int128` division on x86_64, showing lowering to `__divti3`: https://godbolt.org/z/b793fE7E5

`__int128` division with nvcc: https://godbolt.org/z/7WaM7vG9j

compiler-rt implementation of 128 bit integer division: https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/builtins/int_div_impl.inc
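
For readers who do not want to open the links, the problem can be pictured with a host-side function of roughly the following shape. This is a minimal sketch, not the exact code behind the godbolt links; the name `div128` is made up for illustration. In device code the same function would additionally carry a `__device__` attribute.

```cpp
// Hypothetical reproducer shape: a single signed 128-bit division.
// On x86_64 this is lowered to a call to the compiler-rt routine __divti3,
// which is the call that cannot be resolved for nvptx/amdgpu.
__int128 div128(__int128 a, __int128 b) {
    return a / b;
}
```

Replacing `/` with `+`, `-`, or `*` gives the cases that already compile for both GPU targets.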

Ideally, the nvptx and amdgpu backends should support ISA-level linking and compiler-rt. However, that might take some time.

Another option is to let LLVM lower `__int128` division to instructions instead of a libcall (https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp#L4398). However, this may not be worth the effort.

Another option is to implement `__divti3` as an inline function in the default clang header for CUDA/HIP. If `__int128` division is found in device code, mark it as used. This seems to be a feasible solution.
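
A rough idea of what such an inline function could look like, written as plain host C++ so it can be checked against the native operator. This is only a sketch under assumptions: the names `sketch_udivti3` and `sketch_divti3` are hypothetical, the algorithm is a simple bit-by-bit shift-subtract loop rather than compiler-rt's implementation, and a real header version would need the `__device__` attribute and the actual `__divti3` symbol name.

```cpp
typedef unsigned __int128 u128;
typedef __int128 i128;

// Unsigned 128-bit division by restoring shift-subtract.
// Precondition: b != 0. Uses only shifts, compares, and subtraction on u128.
static inline u128 sketch_udivti3(u128 a, u128 b) {
    u128 q = 0;
    u128 r = 0;
    for (int i = 127; i >= 0; --i) {
        // Remember the top bit of r before it is shifted out.
        u128 carry = r >> 127;
        r = (r << 1) | ((a >> i) & 1);
        // If the shift overflowed, the true remainder is >= b;
        // the wrapped subtraction still yields the correct value.
        if (carry || r >= b) {
            r -= b;
            q |= (u128)1 << i;
        }
    }
    return q;
}

// Signed wrapper: divide magnitudes, then restore the sign
// (truncation toward zero, like the built-in operator).
static inline i128 sketch_divti3(i128 a, i128 b) {
    u128 ua = a < 0 ? -(u128)a : (u128)a;
    u128 ub = b < 0 ? -(u128)b : (u128)b;
    u128 q = sketch_udivti3(ua, ub);
    return ((a < 0) != (b < 0)) ? (i128)(-q) : (i128)q;
}
```

The loop runs 128 iterations regardless of the operands, so it favors simplicity over speed; a production version would likely take fast paths when both operands fit in 64 bits.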

@Artem-B 



</pre>