[PATCH] D148388: [LV] Model stride in VPWidenMemoryInstructionRecipe [nfc]

Mon May 1 08:15:02 PDT 2023

reames added a comment.

In D148388#4287713 <https://reviews.llvm.org/D148388#4287713>, @fhahn wrote:

> Do you have any particular examples in mind? At the moment, strides > 1 would be handled as interleave group I think and maybe that would also work.

Constant strides greater than 8 currently end up as masked loads or stores, and a masked load is addressed differently from a strided load: the strided load uses only the first lane of the pointer vector.
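As an illustration of the addressing difference, here is a minimal scalar model. It is a sketch only: `VF`, `loadPerLane`, and `loadStrided` are hypothetical names, not LLVM or VPlan APIs. The first helper consumes a full vector of per-lane pointers. The second needs only the lane-0 pointer plus a stride, which is why the rest of the pointer vector is dead for a strided access.

```cpp
#include <array>
#include <cstddef>

// Hypothetical vectorization factor, chosen only for this illustration.
constexpr std::size_t VF = 4;

// Per-lane addressing: every lane supplies its own pointer, so the whole
// pointer vector has to be materialized.
template <typename T>
std::array<T, VF> loadPerLane(const std::array<const T *, VF> &ptrs) {
  std::array<T, VF> out{};
  for (std::size_t lane = 0; lane < VF; ++lane)
    out[lane] = *ptrs[lane];
  return out;
}

// Strided addressing: only the lane-0 pointer and a stride (in elements)
// are needed; the address of lane k is lane0 + k * strideElems.
template <typename T>
std::array<T, VF> loadStrided(const T *lane0, std::ptrdiff_t strideElems) {
  std::array<T, VF> out{};
  for (std::size_t lane = 0; lane < VF; ++lane)
    out[lane] = lane0[static_cast<std::ptrdiff_t>(lane) * strideElems];
  return out;
}
```

Both helpers return the same values when the per-lane pointers happen to form an arithmetic sequence; the difference is in how much address computation each form requires.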

We could also use it for strided stores (even with strides less than 8).  On a target with wide loads but no masking, this would enable strided patterns in the first place.  On a target with both wide memory ops and masking, there can still be a performance difference between strided access and masking.

(For context, RISC-V has native strided load and store instructions.)

And all of the above is really just a building block toward removing the stride==1 speculation and handling runtime strides.  This is the case I actually care about; I just need to get the codegen part into acceptable shape first.  (This shows up in SPEC2017 x264 in several cases.)
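For readers unfamiliar with the runtime-stride case, a loop of roughly this shape is what is meant. This is a hypothetical sketch, not code taken from x264; `sumStrided` and its parameters are made up for illustration. The stride is an ordinary function argument, so its value is unknown at compile time.

```cpp
#include <cstddef>

// Hypothetical kernel: walks src with a stride that is only known at run
// time (for example, a row pitch passed in by the caller).
int sumStrided(const int *src, std::ptrdiff_t stride, std::size_t n) {
  int sum = 0;
  for (std::size_t i = 0; i < n; ++i)
    sum += src[static_cast<std::ptrdiff_t>(i) * stride];
  return sum;
}
```

Under stride==1 speculation, the vector version of such a loop is guarded by a runtime check that the stride equals 1, so a call with any other stride falls back to the scalar loop. A strided memory operation would, in principle, let the same vector body serve every stride value.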

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148388/new/

https://reviews.llvm.org/D148388