[PATCH] D155520: [LV] Complete load groups and release store groups in presence of dependency

Anna Thomas via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 19 09:10:10 PDT 2023


anna marked an inline comment as done.
anna added inline comments.


================
Comment at: llvm/lib/Analysis/VectorUtils.cpp:1194
       // the boundaries of the (2, 4) group.
-      if (!canReorderMemAccessesForInterleavedGroups(&*AI, &*BI)) {
+      //
+      //
----------------
Ayal wrote:
> nit: one empty line suffices?
> 
> Would it be clearer to do something like:
> 
> ```
>       auto GroupA = getInterleaveGroup(A);
>       if (GroupA && StoreGroups.contains(GroupA) &&
>           !canReorderMemAccessesForInterleavedGroups(&*AI, &*BI)) {
>           LLVM_DEBUG(dbgs() << "LV: Invalidated store group due to "
>                                "dependence between " << *A << " and "<< *B << '\n');
>         StoreGroups.remove(GroupA);
>         releaseGroup(GroupA);
>       }
>       if (A->mayWriteToMemory() && GroupB && LoadGroups.contains(GroupB)) {
>         bool CompleteGroupB = false;
>         for (uint32_t Index = 0; Index < GroupB->getFactor(); ++Index) {
>           Instruction *MemberOfGroupB = GroupB->getMember(Index);
>           if (MemberOfGroupB &&
>               !canReorderMemAccessesForInterleavedGroups(
>                   &*AI, &*AccessStrideInfo.find(MemberOfGroupB))) {
>             CompleteGroupB = true;
>             break;
>           }
>         }
>         if (CompleteGroupB) {
>           LLVM_DEBUG(dbgs() << "LV: Marking interleave group for " << *B
>                             << " as complete.\n");
>             CompletedLoadGroups.insert(GroupB);
>           break;
>         }
>       }
> 
> ```
> Would be nicer with an `if (std::any_of(...)) { ...; break; }`, with support from InterleaveGroup to Iterate over its members.
> 
> A test that requires both releasing GroupA and completing GroupB would help emphasize that the former must precede the latter, due to its `break`.
Actually, this wouldn't be enough AFAICT (it fixes the first test case added as a follow-on to the original review: pr63602_2). The test I added in `interleaved-accesses-sink-store-across-load.ll` won't be fixed with this. The reason is we need to "release store group" if there is a dependency between AI and *any load in the (load) GroupB*. 
So, this patch does two things:
1. retains the order that store group is released first and LoadGroup is marked completed second (we just needed to do both, the order doesn't matter since I left the break where it is: line 1248).
2. Make sure that we check all loads in GroupB for a dependency against `AI`, for both `GroupA` release and `GroupB` completion.

I've added more details in that test case comment below to show why #2 is needed. 

> Would be nicer with an if (std::any_of(...)) { ...; break; }, with support from InterleaveGroup to Iterate over its members.
I had that locally, but when recording a `dependentInst` didn't find a good need for it. 



================
Comment at: llvm/lib/Analysis/VectorUtils.cpp:1216
+      }
+      if (!DependentInst) {
         // If a dependence exists and A is already in a group, we know that A
----------------
Ayal wrote:
> Should be `if (DependentInst)`?
yes! thanks! I've updated the tests to corrected ones as well. 


================
Comment at: llvm/lib/Analysis/VectorUtils.cpp:1248
         // will only form a group with instructions that it precedes.
         break;
       }
----------------
Ayal wrote:
> Suffice to place this `break` only when indicating that (load) GroupB is completed; can continue to expand (store) GroupB even if (store) GroupA is released?
yes, good point. 

Btw, I also found a third miscompile (which was the one I was initially trying to root-cause before running into the two miscompiles I'm trying to fix here :)). 
I want to fix that separately from this review after adding a test case. It is caused when`A` is a store that has a dependency with `GroupB` but `A` is not yet part of an interleave group. 


================
Comment at: llvm/test/Transforms/LoopVectorize/X86/interleaved-accesses-sink-store-across-load.ll:8
 ; that store into the last store (by creating an interleaved store group). This
-; means the loaded %l2 has incorrect value.
-; We do not release this store group correctly because the next interleave group
-; chosen compares only the memory access of last load in program (%l3) against the dependent store location
-; (%gep.iv.1.plus.2) and they are different, thereby incorrectly assuming no
-; dependency. We need to compare against all loads in that interleaved group
-; (%l2 is part of it).
+; means the loaded %l2 will have incorrect value.
 define void @avoid_sinking_store_across_load(ptr %arr) {
----------------
Ayal wrote:
> This additional/distinct testcase of preventing the sinking of a store is fixed by the same patch that compares all member of a load-groupB with a storeA, right?
Unfortunately, no. That only marks `GroupB` for completion, thereby preventing hoisting of a load to an earlier load group. 

To show what happens in this test case: we reverse traverse the memory access. So, the first interleave group created is:
```
store i32 %add, ptr %gep.iv.2 <-- first access
store i32 %mul, ptr %gep.iv.1.plus.2 <-- added into the (store) groupB. 
```
Let's call this StoreGroup1.

We next come to B: `%l3 = load i32, ptr %gep.iv.2`
Interleave group created with that inst.
We create next interleave group:
```
%l3 = load i32, ptr %gep.iv.2
%l2 = load i32, ptr %gep.iv.1.plus.2
```
Call this `LoadGroup1`.

Wit that same `B`, we continue `A` accesses with A as `store i32 %mul, ptr %gep.iv.1.plus.2`.

With the patch I have we will do two things:
1. release `StoreGroup1` because we see the store has a dependency with `%l2`
2. Mark `LoadGroup1` as complete (and break from this inner loop traversing `AI`).

If we were to *only* compare `BI` when deciding for store release:
```
if (GroupA && StoreGroups.contains(GroupA) &&
    !canReorderMemAccessesForInterleavedGroups(&*AI, &*BI))
```
then, we will not release StoreGroup1 since BI (%l3) has no dependency. 

Which means even though the load group is correctly completed (and we will not hoist %l2 to location of %l1), we will sink the store (store i32 %mul, ptr %gep.iv.1.plus.2) below the dependent load (%l2). 



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155520/new/

https://reviews.llvm.org/D155520



More information about the llvm-commits mailing list