[llvm] VectorWiden pass to widen already vectorized instructions (PR #67029)

Fri Sep 29 03:48:53 PDT 2023

sdesmalen-arm wrote:

To add a bit of context: the main motivation is to make use of the multi-vector instructions added in AArch64 SME2/SVE2p1, which can operate on 2 or 4 vectors at a time. For example, loop-vectorized code with UF=2 or UF=4 would be a natural input to this pass. However, there are also other use-cases to consider:
* Manually written vector code using intrinsics, where the compiler could further optimise by bundling together single-vector instructions into multi-vector instructions.
* SLP-vectorized code where certain vector operations can be bundled together into a single wider vector operation (this is not specific to SVE or scalable vectors).

The point is that the instructions don't necessarily need to be part of the same expression. For SME2, the multi-vector instructions can just pair up two independent operations with the same opcode. A non-SVE use-case that comes to mind is the NEON load/store-pair (ldp/stp) instructions, which are currently paired up only after ISel.
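
To make the "two independent operations with the same opcode" idea concrete, here is a rough toy model in Python. It is only a sketch and not what the patch implements: `Inst`, `depends_on` and `find_widening_pairs` are hypothetical names made up for illustration, and the model ignores types, memory dependencies, cost and where the widened instruction would be placed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inst:
    """Hypothetical stand-in for one vector instruction in a basic block."""
    name: str        # name of the value this instruction defines
    opcode: str      # e.g. "fadd"
    operands: tuple  # names of the values this instruction reads


def depends_on(inst, other, defs):
    """Return True if `inst` transitively reads the result of `other`."""
    worklist = list(inst.operands)
    seen = set()
    while worklist:
        name = worklist.pop()
        if name == other.name:
            return True
        if name in seen or name not in defs:
            continue  # already visited, or a block input such as an argument
        seen.add(name)
        worklist.extend(defs[name].operands)
    return False


def find_widening_pairs(block):
    """Greedily pair independent instructions that share an opcode."""
    defs = {inst.name: inst for inst in block}
    paired = set()
    pairs = []
    for i, first in enumerate(block):
        if first.name in paired:
            continue
        for second in block[i + 1:]:
            if second.name in paired or second.opcode != first.opcode:
                continue
            # `second` comes later in the block, so only it can depend on `first`.
            if depends_on(second, first, defs):
                continue
            pairs.append((first.name, second.name))
            paired.update((first.name, second.name))
            break
    return pairs


# Two independent fadd/fmul chains (think UF=2), then a reduction-style fadd.
block = [
    Inst("a0", "fadd", ("x0", "y0")),
    Inst("a1", "fadd", ("x1", "y1")),
    Inst("m0", "fmul", ("a0", "z0")),
    Inst("m1", "fmul", ("a1", "z1")),
    Inst("s", "fadd", ("m0", "m1")),
]
pairs = find_widening_pairs(block)
print(pairs)
```

In this toy block `a0`/`a1` and `m0`/`m1` get paired even though they belong to separate expressions, while the final `fadd` stays single because it depends on everything before it and has no later partner.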

https://github.com/llvm/llvm-project/pull/67029