<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/81840">81840</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[AArch64][SVE] Missed opportunity to fold cmp(stepvector, splat) to whilelt
</td>
</tr>
<tr>
<th>Labels</th>
<td>
backend:AArch64,
SVE,
missed-optimization
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
MacDue
</td>
</tr>
</table>
<pre>
Currently, the platform-agnostic lowering of `vector.create_mask` in MLIR for scalable vectors (see https://mlir.llvm.org/docs/Dialects/Vector/#vectorcreate_mask-vectorcreatemaskop) compares a step vector against a splat of the upper bound, and the resulting AArch64 assembly does the same. For SVE, it should be possible to lower this to a single `whilelt` instruction instead.
```llvm
define <vscale x 4 x i1> @mlir_create_mask_lowering(i64 %0) {
  %2 = tail call <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
  %3 = trunc i64 %0 to i32
  %4 = insertelement <vscale x 4 x i32> undef, i32 %3, i64 0
  %5 = shufflevector <vscale x 4 x i32> %4, <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer
  %6 = icmp slt <vscale x 4 x i32> %2, %5
  ret <vscale x 4 x i1> %6
}
declare <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
```
Current lowering (-O3):
```asm
mlir_create_mask_lowering: // @mlir_create_mask_lowering
        ptrue   p0.s
        index   z0.s, #0, #1
        mov     z1.s, w0
        cmpgt   p0.s, p0/z, z1.s, z0.s
        ret
```
See: https://godbolt.org/z/r197eEbY9
Expected lowering, using a single `whilelt` instruction:
```asm
mlir_create_mask_lowering:
        whilelt p0.s, wzr, w0
        ret
```
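For reference, here is a minimal sketch of the same mask written directly with the SVE `whilelt` intrinsic. Lane `i` of `icmp slt(stepvector, splat(n))` is active exactly when `i < n` (signed), which matches `whilelt` starting from 0. The function name `@create_mask_whilelt` is made up for illustration, and I have not checked the codegen for this snippet, so treat it as the assumed target of the fold rather than a verified result.
```llvm
; Hypothetical hand-written equivalent of @mlir_create_mask_lowering.
define <vscale x 4 x i1> @create_mask_whilelt(i64 %0) {
  ; Same truncation as the original: the compare is done on 32-bit lanes.
  %2 = trunc i64 %0 to i32
  ; Lane i is active while (0 + i) < %2, using a signed comparison.
  %3 = call <vscale x 4 x i1> @llvm.aarch64.sve.whilelt.nxv4i1.i32(i32 0, i32 %2)
  ret <vscale x 4 x i1> %3
}
declare <vscale x 4 x i1> @llvm.aarch64.sve.whilelt.nxv4i1.i32(i32, i32)
```
A negative upper bound yields an all-false predicate in both forms, so the signed `slt` compare and `whilelt` should agree there as well.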
</pre>