<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/112925">112925</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            LLVM fails to optimize right shift by constant+saturating narrow to single narrowing right shift on AArch64
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          johnplatts
      </td>
    </tr>
</table>

<pre>
    LLVM fails to fold the following right shift by constant + saturating narrow sequences into a single saturating narrowing right shift instruction on AArch64:
```
define dso_local noundef <4 x i16> @NarrowShrI32By5(<4 x i32> noundef %0) #0 {
  %2 = ashr <4 x i32> %0, <i32 5, i32 5, i32 5, i32 5>
  %3 = tail call noundef <4 x i16> @llvm.aarch64.neon.sqxtn.v4i16(<4 x i32> %2)
  ret <4 x i16> %3
}

declare <4 x i16> @llvm.aarch64.neon.sqxtn.v4i16(<4 x i32>) #1

define dso_local noundef <4 x i16> @NarrowShrU32By5(<4 x i32> noundef %0) #0 {
  %2 = lshr <4 x i32> %0, <i32 5, i32 5, i32 5, i32 5>
  %3 = tail call noundef <4 x i16> @llvm.aarch64.neon.uqxtn.v4i16(<4 x i32> %2)
  ret <4 x i16> %3
}

declare <4 x i16> @llvm.aarch64.neon.uqxtn.v4i16(<4 x i32>) #1

define dso_local noundef <4 x i16> @NarrowShrI32By5ToU16(<4 x i32> noundef %0) #0 {
  %2 = lshr <4 x i32> %0, <i32 5, i32 5, i32 5, i32 5>
  %3 = tail call noundef <4 x i16> @llvm.aarch64.neon.sqxtun.v4i16(<4 x i32> %2)
  ret <4 x i16> %3
}

declare <4 x i16> @llvm.aarch64.neon.sqxtun.v4i16(<4 x i32>) #1

attributes #0 = { mustprogress nofree norecurse nosync nounwind willreturn memory(none) uwtable "frame-pointer"="non-leaf" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+fp-armv8,+neon,+outline-atomics,+v8a,-fmv" }
attributes #1 = { mustprogress nocallback nofree nosync nounwind willreturn memory(none) }
```

Here is the assembly that is currently generated when the above code is compiled with llc:
```
NarrowShrI32By5:                        // @NarrowShrI32By5
        sshr    v0.4s, v0.4s, #5
        sqxtn   v0.4h, v0.4s
        ret
NarrowShrU32By5:                        // @NarrowShrU32By5
        ushr    v0.4s, v0.4s, #5
        uqxtn   v0.4h, v0.4s
        ret
NarrowShrI32By5ToU16:                   // @NarrowShrI32By5ToU16
        ushr    v0.4s, v0.4s, #5
        sqxtun  v0.4h, v0.4s
        ret
```

The snippet above can be found at https://godbolt.org/z/jq3zMKPz1.

Here is the more optimal code that LLVM could generate instead:
```
NarrowShrI32By5:
        sqshrn  v0.4h, v0.4s, #5
        ret
NarrowShrU32By5:
        uqshrn  v0.4h, v0.4s, #5
        ret
NarrowShrI32By5ToU16:
        sqshrun v0.4h, v0.4s, #5
        ret
```
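
For reference, here is a minimal scalar sketch in C of what one lane of the first two sequences computes. It assumes the usual per-lane semantics (shift right by the immediate, then saturate to the 16-bit range), which is also what a single SQSHRN or UQSHRN lane is expected to produce. The helper names (sat_s16, sat_u16, asr5, sshr5_then_sqxtn, ushr5_then_uqxtn) are hypothetical and exist only for this illustration; they are not part of LLVM.
```c
#include <stdint.h>

/* Hypothetical helper: saturate a signed 32-bit value to the int16_t range,
   modelling one lane of SQXTN. */
static int16_t sat_s16(int32_t x) {
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Hypothetical helper: saturate an unsigned 32-bit value to the uint16_t
   range, modelling one lane of UQXTN. */
static uint16_t sat_u16(uint32_t x) {
    return x > UINT16_MAX ? (uint16_t)UINT16_MAX : (uint16_t)x;
}

/* Arithmetic shift right by 5, written so that it does not rely on
   implementation-defined behaviour for negative operands in C. */
static int32_t asr5(int32_t x) {
    return x >= 0 ? (x >> 5) : ~((~x) >> 5);
}

/* One lane of "sshr #5" followed by "sqxtn" (NarrowShrI32By5). */
int16_t sshr5_then_sqxtn(int32_t x) {
    return sat_s16(asr5(x));
}

/* One lane of "ushr #5" followed by "uqxtn" (NarrowShrU32By5). */
uint16_t ushr5_then_uqxtn(uint32_t x) {
    return sat_u16(x >> 5);
}
```
Because the saturation is applied to the already shifted value in both forms, no lane value should distinguish the two-instruction sequence from the single narrowing shift in these two cases.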
</pre>