[llvm] [LoadStoreVectorizer] Postprocess and merge equivalence classes (PR #114501)

Wed Nov 13 15:12:23 PST 2024

arsenm wrote:

> Regarding the case itself. The original workload was way more complex, the LIT only shows the idea of it, it is a huge simplification. For the 1st case/pattern I saw `7x` + `1x` vectors (for loads and stores), which ruined performance - `7x` vectors had to be lowered/legalized to `4x`+`2x`+`1x` producing 4 mem-ops + extra reg-to-reg moves and swizzles instead of a single `8x` mem-operation.
> 

Can you add a case that's more representative? You can try llvm-reduce to get a sample.
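
For reference, the arithmetic in the quoted description can be sketched as below. This is only an illustration, not the vectorizer's or the legalizer's actual logic: the set of legal widths (8, 4, 2, 1) and the helper names `split_into_legal_widths` and `count_mem_ops` are assumptions made up for this sketch.

```python
# Hypothetical sketch: count memory operations after legalizing vector widths.
# ASSUMPTION: the target supports vector accesses of 8, 4, 2 and 1 elements.
ASSUMED_LEGAL_WIDTHS = (8, 4, 2, 1)


def split_into_legal_widths(num_elements, legal=ASSUMED_LEGAL_WIDTHS):
    """Greedily split one vector access into legal-width pieces."""
    pieces = []
    for width in sorted(legal, reverse=True):
        while num_elements >= width:
            pieces.append(width)
            num_elements -= width
    return pieces


def count_mem_ops(vector_widths, legal=ASSUMED_LEGAL_WIDTHS):
    """Total number of memory operations for a list of vector accesses."""
    return sum(len(split_into_legal_widths(w, legal)) for w in vector_widths)


# The pattern from the quote: a 7x vector plus a leftover 1x access.
split_case = count_mem_ops([7, 1])   # 7 -> 4 + 2 + 1, plus the separate 1x
# The merged form: a single 8x access.
merged_case = count_mem_ops([8])
```

Under these assumptions, `split_case` is 4 and `merged_case` is 1, which matches the "4 mem-ops instead of a single `8x` mem-operation" figure in the quote. The sketch does not model the extra reg-to-reg moves and swizzles.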

https://github.com/llvm/llvm-project/pull/114501