[llvm] [LoadStoreVectorizer] Postprocess and merge equivalence classes (PR #114501)

Wed Nov 13 16:02:51 PST 2024

v-klochkov wrote:

> > Regarding the case itself. The original workload was way more complex, the LIT only shows the idea of it, it is a huge simplification. For the 1st case/pattern I saw `7x` + `1x` vectors (for loads and stores), which ruined performance - `7x` vectors had to be lowered/legalized to `4x`+`2x`+`1x` producing 4 mem-ops + extra reg-to-reg moves and swizzles instead of a single `8x` mem-operation.
> 
> Can you add a case that's more representative? You can try llvm-reduce to get a sample

That may be challenging for a couple of reasons. First, there is a legal consideration: running llvm-reduce on non-disclosed code does not necessarily make the result shareable. Second, the original input code is for a target that is not published yet, and manually removing the target-specific details may again result in a fairly synthetic test.

The existing LIT test is designed to be easily understandable. Adding more variations to the address computation (different offsets in getelementptr, or slightly different operations rather than only getelementptr) would keep the idea of the test the same.

Perhaps we can both agree that the search for the underlying object, which is limited to a depth of 6 steps, can easily lead to situations where equivalence classes have underlying objects that are only one step (one level) apart.
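
To illustrate the point about the depth limit, here is a minimal standalone sketch. It is a toy model, not the vectorizer's code: `Addr`, `walkToBase`, and `makeChain` are hypothetical names, and each `Addr` stands in for one address operation (for example a getelementptr) derived from its `base`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for an address computation (hypothetical, not an LLVM type).
// `base == nullptr` marks the real underlying object.
struct Addr {
  const Addr *base;
};

// Walks toward the underlying object, but gives up after `maxLookup` steps
// and returns whatever node it has reached by then.
const Addr *walkToBase(const Addr *a, unsigned maxLookup) {
  for (unsigned i = 0; i < maxLookup && a->base != nullptr; ++i)
    a = a->base;
  return a;
}

// Builds a chain of `length` derived addresses; chain[0] is the root object
// and chain[i] is derived from chain[i - 1].
std::vector<Addr> makeChain(std::size_t length) {
  std::vector<Addr> chain(length + 1); // all bases start as nullptr
  for (std::size_t i = 1; i <= length; ++i)
    chain[i].base = &chain[i - 1];
  return chain;
}
```

With `auto chain = makeChain(7);`, a limit of 6 lets `walkToBase(&chain[6], 6)` reach the root `chain[0]`, while `walkToBase(&chain[7], 6)` stops at `chain[1]`. The two accesses share the same real object, yet a key based on the walk result would place them in different classes whose "underlying objects" are exactly one step apart.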

@arsenm - I am open to adding more variations in address operations if you believe it would improve the LIT test. Please let me know your thoughts first.

https://github.com/llvm/llvm-project/pull/114501