[llvm] [OpenMPOpt] Fix incorrect end-of-kernel barrier removal (PR #65670)

Johannes Doerfert via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 8 14:29:51 PDT 2023


================
@@ -2676,17 +2676,19 @@ struct AAExecutionDomainFunction : public AAExecutionDomain {
       if (!ED.EncounteredAssumes.empty() && !A.isModulePass())
         return;
 
-      // We can remove this barrier, if it is one, or all aligned barriers
-      // reaching the kernel end. In the latter case we can transitively work
-      // our way back until we find a barrier that guards a side-effect if we
-      // are dealing with the kernel end here.
+      // We can remove this barrier, if it is one, or aligned barriers reaching
+      // the kernel end (if CB is nullptr). Aligned barriers reaching the kernel
+      // end may have other successors besides the kernel end (especially if
+      // they're in loops) with non-local side-effects, so those barriers can
+      // only be removed if they also only reach the kernel end. If those
+      // barriers have other barriers reaching them, those can be transitively
+      // removed as well.
       if (CB) {
         DeletedBarriers.insert(CB);
         A.deleteAfterManifest(*CB);
         ++NumBarriersEliminated;
         Changed = ChangeStatus::CHANGED;
       } else if (!ED.AlignedBarriers.empty()) {
----------------
jdoerfert wrote:

First, can we have a few more tests though (and pre-commit the tests please). 
```
│ define void @loop_barrier2() "kernel" {
│ entry:
│   br label %loop
│
│ loop:
│   %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
│   call void @unknown()
│   call void @aligned_barrier()
│   %i.next = add nuw nsw i32 %i, 1
│   %cond = icmp ne i32 %i.next, 128
│   br i1 %cond, label %loop, label %exit
│
│ exit:
│   call void @aligned_barrier()
│   call void @aligned_barrier()
│   call void @aligned_barrier()
│   call void @aligned_barrier()
│   ret void
│ }
│
```
Also add an unknown call in between the 4 aligned barriers.
Then, replace `unknown` with a store to a generic global.
Now the fix you proposed fails and we race on that global, at least if we put an explicit barrier at the end, e.g.,:
```
│ define void @loop_barrier3() "kernel" {

│ entry:
│   br label %loop
│
│ loop:
│   %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
│   store i32 %i, ptr @G1
│   call void @aligned_barrier()
│   %i.next = add nuw nsw i32 %i, 1
│   %cond = icmp ne i32 %i.next, 128
│   br i1 %cond, label %loop, label %exit
│
│ exit:
│   call void @aligned_barrier()
│   ret void
│ }
```

---

I'm still thinking about this.

https://github.com/llvm/llvm-project/pull/65670


More information about the llvm-commits mailing list