[PATCH] D118059: [OpenMP][Fix] Properly inherit calling convention
Joseph Huber via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 24 11:09:43 PST 2022
jhuber6 created this revision.
jhuber6 added reviewers: tianshilei1992, jdoerfert.
Herald added subscribers: ormris, guansong, hiraditya, yaxunl.
jhuber6 requested review of this revision.
Herald added subscribers: llvm-commits, sstefan1.
Herald added a project: LLVM.
Previously in OpenMPOpt we did not correctly inherit the calling
convention of the callee when creating new OpenMP runtime calls. This
created issues when the calling convention was changed during
`GlobalOpt` but a new call was created without the correct calling
convention. This led to the call being replaced with a poison value in
`InstCombine` due to undefined behaviour, causing large portions of
the program to be incorrectly eliminated. This patch correctly inherits
the existing calling convention from the callee.
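
To illustrate the failure mode described above, here is a minimal toy model in standard C++. It does not use the LLVM API: `ToyFunction`, `ToyCall`, `ToyCallingConv`, `isWellDefined`, `createCallNaive` and `createCallInheritingCC` are all hypothetical names invented for this sketch. It assumes only what the summary states, namely that a call whose calling convention differs from its callee's is undefined behaviour and may be folded away.

```cpp
#include <string>

// Toy stand-ins for the concepts in the summary. These are NOT LLVM classes.
enum class ToyCallingConv { C, Fast };

struct ToyFunction {
  std::string Name;
  // A GlobalOpt-like pass may have switched this from C to Fast.
  ToyCallingConv CC = ToyCallingConv::C;
};

struct ToyCall {
  const ToyFunction *Callee;
  // A freshly built call starts with the default convention.
  ToyCallingConv CC = ToyCallingConv::C;
};

// Models the rule an InstCombine-like pass relies on: a call is only
// well defined if its convention matches the callee's convention.
bool isWellDefined(const ToyCall &Call) {
  return Call.CC == Call.Callee->CC;
}

// Old behaviour: build the call and leave the default convention in place.
ToyCall createCallNaive(const ToyFunction &F) {
  return ToyCall{&F};
}

// Patched behaviour: copy the convention from the callee onto the call.
ToyCall createCallInheritingCC(const ToyFunction &F) {
  ToyCall Call{&F};
  Call.CC = F.CC;
  return Call;
}
```

In this model, a call built with `createCallNaive` against a function whose convention is `Fast` fails `isWellDefined`, while one built with `createCallInheritingCC` passes. That mirrors the `fastcc` change in the test expectations of this patch.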
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D118059
Files:
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
llvm/test/Transforms/OpenMP/spmdization.ll
Index: llvm/test/Transforms/OpenMP/spmdization.ll
===================================================================
--- llvm/test/Transforms/OpenMP/spmdization.ll
+++ llvm/test/Transforms/OpenMP/spmdization.ll
@@ -1430,7 +1430,7 @@
; AMDGPU-NEXT: [[X_ON_STACK:%.*]] = bitcast i8* addrspacecast (i8 addrspace(3)* getelementptr inbounds ([4 x i8], [4 x i8] addrspace(3)* @x.1, i32 0, i32 0) to i8*) to i32*
; AMDGPU-NEXT: br label [[REGION_CHECK_TID:%.*]]
; AMDGPU: region.check.tid:
-; AMDGPU-NEXT: [[TMP0:%.*]] = call i32 @__kmpc_get_hardware_thread_id_in_block()
+; AMDGPU-NEXT: [[TMP0:%.*]] = call fastcc i32 @__kmpc_get_hardware_thread_id_in_block()
; AMDGPU-NEXT: [[TMP1:%.*]] = icmp eq i32 [[TMP0]], 0
; AMDGPU-NEXT: br i1 [[TMP1]], label [[REGION_GUARDED:%.*]], label [[REGION_BARRIER:%.*]]
; AMDGPU: region.guarded:
@@ -1466,7 +1466,7 @@
; NVPTX-NEXT: [[X_ON_STACK:%.*]] = bitcast i8* addrspacecast (i8 addrspace(3)* getelementptr inbounds ([4 x i8], [4 x i8] addrspace(3)* @x1, i32 0, i32 0) to i8*) to i32*
; NVPTX-NEXT: br label [[REGION_CHECK_TID:%.*]]
; NVPTX: region.check.tid:
-; NVPTX-NEXT: [[TMP0:%.*]] = call i32 @__kmpc_get_hardware_thread_id_in_block()
+; NVPTX-NEXT: [[TMP0:%.*]] = call fastcc i32 @__kmpc_get_hardware_thread_id_in_block()
; NVPTX-NEXT: [[TMP1:%.*]] = icmp eq i32 [[TMP0]], 0
; NVPTX-NEXT: br i1 [[TMP1]], label [[REGION_GUARDED:%.*]], label [[REGION_BARRIER:%.*]]
; NVPTX: region.guarded:
@@ -2328,6 +2328,8 @@
ret void
}
+declare fastcc i32 @__kmpc_get_hardware_thread_id_in_block();
+
attributes #0 = { alwaysinline convergent norecurse nounwind }
attributes #1 = { argmemonly mustprogress nofree nosync nounwind willreturn }
attributes #2 = { convergent }
Index: llvm/lib/Transforms/IPO/OpenMPOpt.cpp
===================================================================
--- llvm/lib/Transforms/IPO/OpenMPOpt.cpp
+++ llvm/lib/Transforms/IPO/OpenMPOpt.cpp
@@ -3241,8 +3241,10 @@
       FunctionCallee HardwareTidFn =
           OMPInfoCache.OMPBuilder.getOrCreateRuntimeFunction(
               M, OMPRTL___kmpc_get_hardware_thread_id_in_block);
-      Value *Tid =
+      CallInst *Tid =
           OMPInfoCache.OMPBuilder.Builder.CreateCall(HardwareTidFn, {});
+      if (Function *Fn = dyn_cast<Function>(HardwareTidFn.getCallee()))
+        Tid->setCallingConv(Fn->getCallingConv());
       Value *TidCheck = OMPInfoCache.OMPBuilder.Builder.CreateIsNull(Tid);
       OMPInfoCache.OMPBuilder.Builder
           .CreateCondBr(TidCheck, RegionStartBB, RegionBarrierBB)
@@ -3255,8 +3257,11 @@
               M, OMPRTL___kmpc_barrier_simple_spmd);
       OMPInfoCache.OMPBuilder.updateToLocation(InsertPointTy(
           RegionBarrierBB, RegionBarrierBB->getFirstInsertionPt()));
-      OMPInfoCache.OMPBuilder.Builder.CreateCall(BarrierFn, {Ident, Tid})
-          ->setDebugLoc(DL);
+      CallInst *Barrier =
+          OMPInfoCache.OMPBuilder.Builder.CreateCall(BarrierFn, {Ident, Tid});
+      Barrier->setDebugLoc(DL);
+      if (Function *Fn = dyn_cast<Function>(BarrierFn.getCallee()))
+        Barrier->setCallingConv(Fn->getCallingConv());
 
       // Second barrier ensures workers have read broadcast values.
       if (HasBroadcastValues)