[PATCH] D147336: [IVDescriptors] Add pointer InductionDescriptors with non-constant strides (try 2)

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 3 01:05:07 PDT 2023


dmgreen added a comment.

In D147336#4237233 <https://reviews.llvm.org/D147336#4237233>, @reames wrote:

> I took a bit of a look here, and noticed something interesting.  The LLVM IR after optimization for the path when stride!=1 is nearly identical to the prior version.  However, the runtime check in the assembly appears to have tripped some kind of hoisting optimization in the backend, and as a result, the assembly path for the stride!=1 path is a bit different.
>
> I can see a couple ways of tackling this:
>
> Option 1 - Be more restrictive on stride==1 speculation.  I'd meant to do this anyways, but was expecting that to be the piece that had interesting perf swings.
>
> Option 2 - Investigate the hoisting bit.  (I don't really have the context to do this)

I was looking at another profile that ran with different architecture features and with LTO, where the differences were more pronounced. The main body was vectorized with scalable vectors and the remainder was no longer unrolled. That might be why the performance differences looked higher than I expected, but that function is relatively hot and called many times, so even just the extra compares can slow things down a bit.

Do you think it would be possible to come up with a heuristic to prevent the stride==1 speculation in this case at least? The overlapping load in https://godbolt.org/z/ToT4W74Yf, with a stride of 1 in the same loop at the point of vectorization, looks like something that is unlikely to be helpful in many cases.

(I have looked into single stride before in https://reviews.llvm.org/D71919, where I was running into places where it is not profitable compared to gather/scatter. From what I remember there were a few cases in the llvm-test-suite where it was helping, and I didn't push that patch forward as I had no strong motivating example. Base AArch64 doesn't have gather/scatter, so this case is a little different.)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147336/new/

https://reviews.llvm.org/D147336


