<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/155769>155769</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            AMDGPU does not use write2 with AGPR inputs
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            backend:AMDGPU,
            missed-optimization
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          arsenm
      </td>
    </tr>
</table>

<pre>
https://github.com/llvm/llvm-project/pull/155765 adds some tests that demonstrate suboptimal ds_write2* formation. Both write2 data operands must be in the same register subclass, either both AGPR or both VGPR; they cannot be controlled individually.

Additionally, the pass does not try to constrain AV input registers to VGPR, which would also enable more cases. For example:

```
define void @ds_write2_b32_av_v(ptr addrspace(3) %lds) #0 {
  %gep.0 = getelementptr inbounds [512 x float], ptr addrspace(3) %lds, i32 0, i32 10
  %gep.1 = getelementptr inbounds [512 x float], ptr addrspace(3) %lds, i32 0, i32 24
  %av0 = call i32 asm "; def $0", "=^VA"()
  %v0 = call i32 asm "; def $0", "=v"()
  store i32 %av0, ptr addrspace(3) %gep.0
  store i32 %v0, ptr addrspace(3) %gep.1
  ret void
}
```

We can try to use constrainRegClass to constrain %av0's register (the AV input) to VGPR in order to enable the fold.

To fix this, we should:
1. Try to constrain the AV classes to VGPRs with the existing instructions.
2. Define new pseudoinstruction variants of DS_WRITE2* with AGPR data operands.
3. Handle ds_write2* instructions in AMDGPURewriteAGPRCopyMFMA. This has the same problem as MFMAs.
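
As a rough illustration of the first two steps, here is a self-contained toy model of the class-selection decision. It is not LLVM code: `RegClass`, `constrainTo`, and `pickWrite2DataClass` are hypothetical names invented for this sketch, and a register class is reduced to a bitmask of the register files it allows. In the backend the narrowing would presumably go through constrainRegClass on the virtual registers instead.

```cpp
#include <cstdint>
#include <initializer_list>
#include <optional>

// Toy model only (hypothetical, not the LLVM API): a register class is a
// bitmask of the register files it allows. AV allows either file.
enum RegClass : std::uint8_t { None = 0, VGPR = 1, AGPR = 2, AV = VGPR | AGPR };

// Narrow `cur` to the part that `want` also allows.
// Returns std::nullopt when the two classes have nothing in common.
std::optional<RegClass> constrainTo(RegClass cur, RegClass want) {
  auto common = static_cast<RegClass>(cur & want);
  if (common == None)
    return std::nullopt;
  return common;
}

// Pick the single class that both write2 data operands could share.
// VGPR is tried first (step 1, works with the existing instructions),
// then AGPR (step 2, which would need the new pseudo variants).
std::optional<RegClass> pickWrite2DataClass(RegClass data0, RegClass data1) {
  for (RegClass want : {VGPR, AGPR}) {
    if (constrainTo(data0, want) && constrainTo(data1, want))
      return want;
  }
  return std::nullopt; // e.g. one AGPR-only and one VGPR-only operand
}
```

In this model the example above corresponds to `pickWrite2DataClass(AV, VGPR)`, which selects VGPR, while a pure AGPR input paired with a pure VGPR input has no common class and cannot be merged without a copy.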


</pre>