[PATCH] D115302: [AA] Teach AA about convergent intrinsics that affect loads/stores.

Artem Belevich via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Dec 7 16:03:51 PST 2021


tra created this revision.
tra added reviewers: efriedma, nikic.
Herald added subscribers: jeroen.dobbelaere, bixia, hiraditya.
tra requested review of this revision.
Herald added a project: LLVM.

We do not have a good way to represent memory accesses that may happen from
other threads on a GPU. Convergent intrinsics that may modify unspecified memory
are the best proxy we have at the moment. Treat such intrinsics as ModRef.

This fixes a miscompile in NVPTX that was exposed by
https://github.com/llvm/llvm-project/commit/fa5d31f825699b0fe858d4f432bd3fbbbec523c8
and that resulted in shared-memory stores/loads being incorrectly eliminated.
https://lists.llvm.org/pipermail/llvm-dev/2021-November/154060.html
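The new check can be sketched in isolation as follows. This is a simplified, hypothetical model for illustration only: `IntrinsicDesc`, `mustAssumeModRef`, and the boolean flags are stand-ins for `llvm::CallBase` and the IR attributes (`readonly`, `readnone`, `argmemonly`), not the real LLVM API.

```cpp
// Sketch of the logic added to AAResults::getModRefInfo. The flags mirror
// IR function attributes; everything here is a stand-in for illustration.
struct IntrinsicDesc {
  bool Convergent = false;            // 'convergent' attribute
  bool OnlyReadsMemory = false;       // 'readonly'
  bool DoesNotAccessMemory = false;   // 'readnone'
  bool OnlyAccessesArgMemory = false; // 'argmemonly'
};

// A convergent intrinsic without explicit memory-access attributes may be a
// cross-thread synchronization point (e.g. a GPU barrier), so the only safe
// answer for any memory location is the conservative ModRef.
bool mustAssumeModRef(const IntrinsicDesc &D) {
  if (!D.Convergent)
    return false; // Non-convergent calls are handled by the normal AA chain.
  return !(D.OnlyReadsMemory || D.DoesNotAccessMemory ||
           D.OnlyAccessesArgMemory);
}
```

With this model, `llvm.nvvm.bar.sync` (convergent, no memory attributes) forces ModRef, while a convergent intrinsic marked `readnone` would still fall through to the regular AA providers.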


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D115302

Files:
  llvm/lib/Analysis/AliasAnalysis.cpp
  llvm/test/Analysis/GlobalsModRef/intrinsic_convergent.ll


Index: llvm/test/Analysis/GlobalsModRef/intrinsic_convergent.ll
===================================================================
--- /dev/null
+++ llvm/test/Analysis/GlobalsModRef/intrinsic_convergent.ll
@@ -0,0 +1,33 @@
+; RUN: opt -globals-aa -gvn -S < %s | FileCheck %s
+target datalayout = "e-i64:64-i128:128-v16:16-v32:32-n16:32:64"
+target triple = "nvptx64-nvidia-cuda"
+
+@s = internal local_unnamed_addr addrspace(3) global i32 undef, align 4
+
+; CHECK-LABEL: @bar_sync
+; CHECK: store
+; CHECK: tail call void @llvm.nvvm.bar.sync(i32 0)
+; CHECK: load
+define dso_local i32 @bar_sync(i32 %0) local_unnamed_addr {
+  store i32 %0, i32* addrspacecast (i32 addrspace(3)* @s to i32*), align 4
+  tail call void @llvm.nvvm.bar.sync(i32 0)
+  %2 = load i32, i32* addrspacecast (i32 addrspace(3)* @s to i32*), align 4
+  ret i32 %2
+}
+
+declare void @llvm.nvvm.bar.sync(i32) #0
+
+; CHECK-LABEL: @barrier0
+; CHECK: store
+; CHECK: tail call void @llvm.nvvm.barrier0()
+; CHECK: load
+define dso_local i32 @barrier0(i32 %0) local_unnamed_addr  {
+  store i32 %0, i32* addrspacecast (i32 addrspace(3)* @s to i32*), align 4
+  tail call void @llvm.nvvm.barrier0()
+  %2 = load i32, i32* addrspacecast (i32 addrspace(3)* @s to i32*), align 4
+  ret i32 %2
+}
+
+declare void @llvm.nvvm.barrier0() #0
+
+attributes #0 = { convergent nounwind }
Index: llvm/lib/Analysis/AliasAnalysis.cpp
===================================================================
--- llvm/lib/Analysis/AliasAnalysis.cpp
+++ llvm/lib/Analysis/AliasAnalysis.cpp
@@ -42,6 +42,7 @@
 #include "llvm/IR/BasicBlock.h"
 #include "llvm/IR/Instruction.h"
 #include "llvm/IR/Instructions.h"
+#include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/Module.h"
 #include "llvm/IR/Type.h"
 #include "llvm/IR/Value.h"
@@ -226,6 +227,18 @@
                                     AAQueryInfo &AAQI) {
   ModRefInfo Result = ModRefInfo::ModRef;
 
+  // Unless a convergent intrinsic carries explicit memory access attributes,
+  // assume that it may touch anything.
+  // TODO: This is overly conservative. We may need to query TTI whether a
+  // particular intrinsic really affects the given memory location.
+  // E.g. some intrinsics may affect only specific address spaces.
+  if (Call->isConvergent() && isa<IntrinsicInst>(Call)) {
+    Function *F = Call->getCalledFunction();
+    if (!(F->onlyReadsMemory() || F->doesNotAccessMemory() ||
+          F->onlyAccessesArgMemory()))
+      return ModRefInfo::ModRef;
+  }
+
   for (const auto &AA : AAs) {
     Result = intersectModRef(Result, AA->getModRefInfo(Call, Loc, AAQI));
 



