[llvm] [LoadStoreVectorizer] Postprocess and merge equivalence classes (PR #114501)
Vyacheslav Klochkov via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 8 14:17:14 PST 2024
v-klochkov wrote:
> > Using any fixed lookup depth can result in the creation of multiple equivalence classes that differ only by 1-level indirection bases.
>
> The test IR doesn't look canonical / optimal. Other passes are expected to have rewritten the addressing expressions to be a simpler form. If I run your examples through -O3, it mostly vectorizes with some quirks:
>
> 1. have to hackily avoid a memset that forms
> 2. Ends up with a scalar store + a 7x vector store
@arsenm : Matt - thank you for the review.
I added an incremental commit https://github.com/llvm/llvm-project/pull/114501/commits/427983a9378f8612c054ad219023564ecf803643 to address all of your comments.
Regarding the case itself: the original workload was far more complex. The LIT test only shows the idea of it and is a huge simplification.
For the 1st pattern I saw `7x` + `1x` vectors (for both loads and stores), which ruined performance: each `7x` vector had to be lowered/legalized to `4x` + `2x` + `1x`, producing 4 mem-ops plus extra reg-to-reg moves and swizzles instead of a single `8x` mem-operation.
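To make the mem-op count concrete, here is a rough sketch of the arithmetic only. It assumes a target that legalizes odd vector widths into power-of-two pieces, as in the case I observed; real legalization is target-specific, and the helper names `split_pow2` and `mem_ops` are made up for illustration.

```python
def split_pow2(width):
    """Split a vector width into descending power-of-two pieces."""
    pieces = []
    # Start from the highest set bit of the width and walk down.
    bit = 1 << (width.bit_length() - 1) if width > 0 else 0
    while bit:
        if width & bit:
            pieces.append(bit)
        bit >>= 1
    return pieces


def mem_ops(widths):
    """Count memory operations after splitting every vector (assumed model)."""
    return sum(len(split_pow2(w)) for w in widths)


print(split_pow2(7))    # [4, 2, 1]
print(mem_ops([7, 1]))  # 4: three pieces of the 7x plus the scalar
print(mem_ops([8]))     # 1: a single 8x operation
```

The count ignores the extra reg-to-reg moves and swizzles, which come on top of the 4 mem-ops.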
The 2nd pattern in the LIT test shows a potentially worse situation: it gives 8 equivalence classes with 1 scalar mem-operation in each.
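The effect can be reproduced in a toy model. This is only a sketch under simplifying assumptions, not the LoadStoreVectorizer's actual data structures or the algorithm in the PR: pointers are named strings, each defined as its parent plus a constant byte offset, and `DEFS`, `base_of`, `classify` and `merge_classes` are hypothetical names. With a lookup depth of 1, every access in a chain of 8 pointers gets its own base; re-keying each class by its fully resolved base folds them back into one class with consecutive offsets.

```python
# Toy model: each pointer is defined as (parent, byte_offset); None marks a root.
DEFS = {"p0": None}
for i in range(1, 9):
    DEFS[f"p{i}"] = (f"p{i - 1}", 4)  # p_i = p_{i-1} + 4 bytes


def base_of(ptr, max_depth):
    """Walk up at most max_depth definitions; return (base, accumulated offset)."""
    offset = 0
    for _ in range(max_depth):
        if DEFS[ptr] is None:
            break
        parent, delta = DEFS[ptr]
        ptr, offset = parent, offset + delta
    return ptr, offset


def classify(accesses, max_depth):
    """Group accesses by the base found within the fixed lookup depth."""
    classes = {}
    for ptr in accesses:
        base, offset = base_of(ptr, max_depth)
        classes.setdefault(base, []).append((ptr, offset))
    return classes


def merge_classes(classes):
    """Post-process: re-key every class by the fully resolved base."""
    merged = {}
    for base, members in classes.items():
        root, base_offset = base_of(base, len(DEFS))
        for ptr, offset in members:
            merged.setdefault(root, []).append((ptr, base_offset + offset))
    return merged


accesses = [f"p{i}" for i in range(1, 9)]
shallow = classify(accesses, max_depth=1)
merged = merge_classes(shallow)

print(len(shallow))                     # 8 classes, one access in each
print(len(merged))                      # 1 class after merging
print([off for _, off in merged["p0"]]) # [4, 8, 12, 16, 20, 24, 28, 32]
```

In the merged class the 8 accesses sit at consecutive 4-byte offsets from a common base, which is the shape needed for a single `8x` operation.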
https://github.com/llvm/llvm-project/pull/114501