<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/141931>141931</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            AMDGPU should not scalarize v2f16 / v2bf16 copysign
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            backend:AMDGPU,
            missed-optimization
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          arsenm
      </td>
    </tr>
</table>

<pre>
    Currently, copysign on vectors of half elements is scalarized and produces this ugly expansion:

```
; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 < %s

; s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; s_movk_i32 s4, 0x7fff
; v_bfi_b32 v2, s4, v0, v1
; v_lshrrev_b32_e32 v1, 16, v1
; v_lshrrev_b32_e32 v0, 16, v0
; v_bfi_b32 v0, s4, v0, v1
; s_mov_b32 s4, 0x5040100
; v_perm_b32 v0, v0, v2, s4
; s_setpc_b64 s[30:31]
define <2 x half> @copysign_v2f16(<2 x half> %a, <2 x half> %b) {
  %result = call <2 x half> @llvm.copysign.v2f16(<2 x half> %a, <2 x half> %b)
  ret <2 x half> %result
}

```

If I hack up the vector legalizer's logic, the default expansion finds a vector BFI:

With gfx803:

```
        s_mov_b32 s4, 0x7fff7fff
        v_bfi_b32 v0, s4, v0, v1
```

With gfx9+, it does worse:
```
        v_and_b32_e32 v1, 0x80008000, v1
        s_mov_b32 s4, 0x7fff7fff
        v_and_or_b32 v0, v0, s4, v1
```


We can trivially extend the existing legal f16 copysign pattern to handle the 2-element case, as in the gfx8 output. Supporting the cases where the sign source is a different FP type takes a little more work.
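
For illustration, here is a minimal Python sketch of the bit arithmetic behind the single-BFI form. It assumes `v_bfi_b32` computes `(mask & x) | (~mask & y)` and `v_and_or_b32` computes `(a & b) | c` on 32-bit values, with two f16 lanes packed into one 32-bit register. The helper names are hypothetical and are not part of LLVM.

```python
# Hypothetical model of the packed v2f16 copysign bit patterns (not LLVM code).
U32 = 0xFFFFFFFF
MAGNITUDE_MASK = 0x7FFF7FFF  # low 15 bits of each 16-bit lane
SIGN_MASK = 0x80008000       # sign bit of each 16-bit lane


def bfi_b32(mask, x, y):
    """Assumed v_bfi_b32 semantics: take x where mask is set, y elsewhere."""
    return ((mask & x) | (~mask & y)) & U32


def copysign_v2f16_bfi(mag, sign):
    """gfx8-style output: one BFI with a packed magnitude mask."""
    return bfi_b32(MAGNITUDE_MASK, mag, sign)


def copysign_v2f16_and_or(mag, sign):
    """gfx9+ output from the hacked legalizer: v_and_b32, then v_and_or_b32."""
    sign_bits = sign & SIGN_MASK
    return ((mag & MAGNITUDE_MASK) | sign_bits) & U32


def copysign_v2f16_scalarized(mag, sign):
    """Current expansion: one BFI per lane, then repack the two halves."""
    lo = bfi_b32(0x7FFF, mag & 0xFFFF, sign & 0xFFFF) & 0xFFFF
    hi = bfi_b32(0x7FFF, mag >> 16, sign >> 16) & 0xFFFF
    return (hi << 16) | lo
```

All three functions return the same bits for any pair of 32-bit inputs, which is why the packed mask can stand in for the per-lane expansion.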

</pre>