<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/67781>67781</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            AMDGPU: support folding 64-bit immediates
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            enhancement,
            backend:AMDGPU
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          jayfoad
      </td>
    </tr>
</table>

<pre>
SIFoldOperands should support folding 64-bit immediates into literal operands.

Before GFX10 this was rarely a problem because most 64-bit instructions are VOP3-only and VOP3 did not allow a literal operand. One exception is V_FMAC_F64 on gfx90a+. Here is an example where we generate `v_fmac_f64_e32 v[0:1], s[4:5], v[2:3]`, which could be folded to `v_fmac_f64_e32 v[0:1], 0x41200000, v[2:3]`: https://godbolt.org/z/EMTjfj6jb

On GFX10+, VOP3 can take a literal operand, so there are many more 64-bit instructions whose immediate operands could be folded.
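
To make the folding condition concrete, here is a rough sketch of the check a fold would need for a 64-bit floating-point immediate. It assumes that a 32-bit literal used as a 64-bit floating-point operand supplies the high 32 bits of the value, with the low 32 bits taken as zero. That assumption should be checked against the ISA documentation for the target. `fp64ImmAs32BitLiteral` is a hypothetical helper name, not an existing LLVM function, and integer operands are deliberately left out.

```cpp
#include <cstdint>
#include <cstring>
#include <optional>

// Hypothetical helper (not an LLVM API): returns the 32-bit literal that would
// encode `value` as a 64-bit floating-point operand, or std::nullopt if the
// value cannot be expressed that way.
//
// ASSUMPTION: the hardware places the 32-bit literal in the high half of the
// 64-bit operand and fills the low half with zeros.
std::optional<uint32_t> fp64ImmAs32BitLiteral(double value) {
    uint64_t bits = 0;
    // Reinterpret the double as its raw IEEE-754 bit pattern.
    std::memcpy(&bits, &value, sizeof(bits));

    // Under the assumption above, any nonzero bit in the low half is lost,
    // so such a value cannot be folded into a literal.
    if ((bits & 0xffffffffull) != 0)
        return std::nullopt;

    // The literal is just the high 32 bits of the bit pattern.
    return static_cast<uint32_t>(bits >> 32);
}
```

Under this assumption, a value such as 10.0 can be folded because its low 32 bits are zero, while a value such as 0.1 cannot and would still need to be materialized in a register pair.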
</pre>