[PATCH] D68667: [SLP] respect target register width for GEP vectorization (PR43578)

Tue Jun 23 16:11:50 PDT 2020

fhahn added a comment.

In D68667#2107167 <https://reviews.llvm.org/D68667#2107167>, @spatel wrote:

> In D68667#2107079 <https://reviews.llvm.org/D68667#2107079>, @fhahn wrote:
>
> > In D68667#2106962 <https://reviews.llvm.org/D68667#2106962>, @fhahn wrote:
> >
> > > I tracked down a 7% regression in h264 on AArch64 (-O3 with LTO & PGO) to this commit. The regressions in the aarch64 tests seem a bit suspicious, and from the description the changes seem unintentional (4 x i32 vectors should be perfectly legal on AArch64). I'll take a look to see what's going on.
> >
> >
> > Oh, now I see what's going on. The actual computation is done on 4 x i64.
>
>
> But not in the getelementptr_4x32() test, right? Maybe we need to refine getVectorElementSize() in some way.

Initially I was looking at `@test2`, but the more interesting one is indeed `getelementptr_4x32`. I think the issue might be that we use the width of the pointer to limit the list size, rather than the width of the index computations. IIUC we only vectorize the index computations, so I think it would make sense to limit the width based on the GEP index width, rather than the width of the GEP's pointer result. I put up D82418 <https://reviews.llvm.org/D82418>, which restores `getelementptr_4x32` and also catches the important h264 pattern on AArch64, while not regressing the test case on X86.
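
To make the arithmetic concrete, here is a rough sketch of why the choice of element width matters. It is not the SLP vectorizer's code: `maxLanes` and the constants are hypothetical names, and the widths (128-bit vector registers, 64-bit pointers, 32-bit GEP indices) are assumptions chosen to match the `getelementptr_4x32` case on a target like AArch64.

```cpp
#include <cassert>

// Hypothetical helper (not an LLVM API): number of elements of width
// ElementBits that fit in one vector register of width RegisterBits.
constexpr unsigned maxLanes(unsigned RegisterBits, unsigned ElementBits) {
  return ElementBits == 0 ? 0 : RegisterBits / ElementBits;
}

// Assumed widths, for illustration only.
constexpr unsigned RegisterBits = 128; // e.g. a 128-bit vector register
constexpr unsigned PointerBits = 64;   // width of the GEP's pointer result
constexpr unsigned IndexBits = 32;     // width of the i32 index computation

// Limiting by the pointer width allows only 2 lanes per register.
constexpr unsigned lanesFromPointerWidth() {
  return maxLanes(RegisterBits, PointerBits);
}

// Limiting by the index width allows 4 lanes per register.
constexpr unsigned lanesFromIndexWidth() {
  return maxLanes(RegisterBits, IndexBits);
}
```

Under these assumptions, limiting the list by the pointer width caps it at 2 elements, while limiting it by the index width allows the full 4 x i32, which is what is actually being vectorized.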

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68667/new/

https://reviews.llvm.org/D68667