[llvm] [AArch64] Add @llvm.experimental.vector.match (PR #101974)

Mon Sep 30 01:46:38 PDT 2024

david-arm wrote:

> if we wanted to remove them we could:
> 
> 1. Only implement the intrinsic for fixed-length vectors.
> 2. Restrict the second argument (the search array) to fixed-length, but let the other vector arguments be fixed or scalable.

Yeah, the second option sounds good to me. If you do it this way you can remove the segsize argument too, because the size of the segment is defined by the number of elements in the second vector argument. I agree that puts more pressure on the backend to lower this efficiently, but even with the previous version of the intrinsic we still have to do work to splat the segments across the whole vector at the IR level. I'd hope that the MachineLICM pass would hoist any loop-invariant code out of the loop!

I do think it's friendlier to other targets and makes it easier to write generic lowering code for all of them too, i.e. something like:

```
%search.lane.0 = extractelement <4 x i32> %search.vec, i32 0
%search.dup.0 = ... broadcast %search.lane.0 ...
%res.0 = icmp eq <vscale x 4 x i32> %vec, %search.dup.0
...
%res.1 = icmp eq <vscale x 4 x i32> %vec, %search.dup.1
...
%res.pt1 = or <vscale x 4 x i1> %res.0, %res.1
%res.pt2 = or <vscale x 4 x i1> %res.2, %res.3
%res = or <vscale x 4 x i1> %res.pt1, %res.pt2
```
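To make the intended result of that expansion concrete, here is a minimal scalar model in C++. It is only a sketch of the semantics as I read them from the IR above (a result lane is set when the corresponding element of the data vector equals any element of the search vector); the function name `match_any` and the use of `std::vector` to stand in for both the scalable and the fixed-length vector are assumptions for illustration, not part of the PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model of the generic expansion sketched in the IR above.
// `vec` stands in for the (possibly scalable) data vector and `search`
// for the fixed-length search vector. Both names are hypothetical.
std::vector<bool> match_any(const std::vector<int32_t> &vec,
                            const std::vector<int32_t> &search) {
  std::vector<bool> res(vec.size(), false);
  // One iteration per search lane: extractelement + broadcast.
  for (int32_t needle : search) {
    // Lane-wise icmp eq against the broadcast value, OR-ed into the result.
    for (std::size_t i = 0; i < vec.size(); ++i) {
      res[i] = res[i] || (vec[i] == needle);
    }
  }
  return res;
}
```

The outer loop runs once per element of the search vector, which is why a fixed-length second argument keeps the expansion bounded even when the data vector is scalable.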

For SVE I suppose you'd have to be clever and use a combination of 64-bit DUPs and ZIPs or something like that to generate the repeating 128-bit pattern!
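One possible reading of the DUP/ZIP idea, modelled in plain C++ rather than with real SVE intrinsics: broadcast each 64-bit half of the 128-bit search segment across its own vector, then interleave the two vectors with a ZIP1 on 64-bit lanes. The helper names (`dup64`, `zip1_64`, `repeat_segment`) are hypothetical, and this is only a sketch of the data movement under the assumption of 32-bit elements, not a claim about what the backend does or should do.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// A 64-bit lane viewed as two 32-bit elements.
using Pair = std::array<int32_t, 2>;

// Models a DUP of a 64-bit value: every 64-bit lane holds the same pair.
std::vector<Pair> dup64(Pair p, std::size_t lanes) {
  return std::vector<Pair>(lanes, p);
}

// Models ZIP1 on 64-bit lanes: interleave the low halves of a and b.
// Assumes a and b have the same, even, number of lanes.
std::vector<Pair> zip1_64(const std::vector<Pair> &a,
                          const std::vector<Pair> &b) {
  std::vector<Pair> out;
  out.reserve(a.size());
  for (std::size_t i = 0; i < a.size() / 2; ++i) {
    out.push_back(a[i]);
    out.push_back(b[i]);
  }
  return out;
}

// Builds the repeating 128-bit pattern seg[0..3], seg[0..3], ...
// across a vector with `lanes64` 64-bit lanes (lanes64 must be even).
std::vector<int32_t> repeat_segment(const std::array<int32_t, 4> &seg,
                                    std::size_t lanes64) {
  std::vector<Pair> lo = dup64({seg[0], seg[1]}, lanes64);
  std::vector<Pair> hi = dup64({seg[2], seg[3]}, lanes64);
  std::vector<int32_t> flat;
  for (const Pair &p : zip1_64(lo, hi)) {
    flat.push_back(p[0]);
    flat.push_back(p[1]);
  }
  return flat;
}
```

Since the search vector is loop-invariant in the typical use, this setup would ideally be done once outside the loop.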

https://github.com/llvm/llvm-project/pull/101974