[PATCH] D138585: [SLPVectorizer] Do Not Move Loads/Stores Beyond Stacksave/Stackrestore Boundaries

Mon Nov 28 17:49:14 PST 2022

qiongsiwu1 added a comment.

In D138585#3953869 <https://reviews.llvm.org/D138585#3953869>, @reames wrote:

> Can you explain why memory dependence wasn't enough to prevent the problematic reordering?  I'm a bit fuzzy on the details - it's been a while - but I though that stacksave and stackrestore were modeled as modifying all memory to prevent this reordering?

I took a closer look at the new test case <https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/SLPVectorizer/X86/stackrestore-dependence.ll>, and it seems that the vectorizer never added the four loads <https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/SLPVectorizer/X86/stackrestore-dependence.ll#L25-L28> that concerns us to the memory dependency list of the stackrestore. It seemed that for some reason the `load`s memory locations were never treated as aliasing the stackrestore, hence the condition here <https://github.com/llvm/llvm-project/blob/a1066569b827b976bdbee5c92174f145e29e4731/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp#L10090> was never true for the loads when the `DepDest` was the stackrestore. If I hardcode the result of `isAliased` to true as long as `inst2` is a stackrestore (here <https://github.com/llvm/llvm-project/blob/a1066569b827b976bdbee5c92174f145e29e4731/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp#L2787>), the new test passed without the fix included in this patch. I have not looked at `BatchedAA` closely to see what exactly is going on. I will do so if fixing `BatchedAA` is more desirable.

In D138585#3953978 <https://reviews.llvm.org/D138585#3953978>, @ABataev wrote:

> Did a quick search - looks like it does not work for static allocas, only for dynamic

Ah thanks so much for the fast reply! Maybe it is the alias analysis that is causing static allocas to fail, as stated above? BTW, by static alloca, do we mean the allocas that are defined in the entry block of a function and of a fixed size, as documented here <https://discourse.llvm.org/t/how-to-do-a-short-lived-alloca-in-a-loop/63248/2>? In the original test case where this issue was observed, the alloca was not defined in the entry block, but was of a known and fixed size.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138585/new/

https://reviews.llvm.org/D138585