[clang] [llvm] [mlir] [MLIR][OpenMP] Add codegen for teams reductions (PR #133310)
Sergio Afonso via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 31 06:44:27 PDT 2025
================
@@ -4434,10 +4497,24 @@ getKmpcForStaticLoopForType(Type *Ty, OpenMPIRBuilder *OMPBuilder,
static void createTargetLoopWorkshareCall(
OpenMPIRBuilder *OMPBuilder, WorksharingLoopType LoopType,
BasicBlock *InsertBlock, Value *Ident, Value *LoopBodyArg,
- Type *ParallelTaskPtr, Value *TripCount, Function &LoopBodyFn) {
- Type *TripCountTy = TripCount->getType();
+ Type *ParallelTaskPtr, Value *TripCountOrig, Function &LoopBodyFn) {
Module &M = OMPBuilder->M;
IRBuilder<> &Builder = OMPBuilder->Builder;
+ Value *TripCount = TripCountOrig;
+ // The trip count is 1 larger than it should be for GPU, this is because
+ // of how the deviceRTL functions work with clang. TODO: make the trip
+ // count consistent between both so we don't have to subtract one here.
+ if (OMPBuilder->Config.isGPU()) {
+ Builder.restoreIP({InsertBlock, std::prev(InsertBlock->end())});
+ LLVMContext &Ctx = M.getContext();
+ Type *IVTy = TripCountOrig->getType();
+ Type *InternalIVTy = IVTy->getIntegerBitWidth() <= 32
+ ? Type::getInt32Ty(Ctx)
+ : Type::getInt64Ty(Ctx);
+ Constant *One = ConstantInt::get(InternalIVTy, 1);
+ TripCount = Builder.CreateSub(TripCountOrig, One, "modified_trip_count");
+ }
+ Type *TripCountTy = TripCount->getType();
----------------
skatrak wrote:
Either way, I think the comment you added is slightly misleading. The only code path in clang that would produce calls to the `__kmpc_*_loop` DeviceRTL functions also executes this code, so every call to these DeviceRTL functions will subtract one, regardless of whether it is triggered by clang or flang. (That clang code path currently seems to be broken in several spots, so I have not been able to get it to emit these calls successfully.)
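
To illustrate the point, here is a minimal standalone sketch, not LLVM code. It assumes the runtime entry point adds one to the count it receives, so the caller has to pass the trip count minus one no matter which frontend generated the call. All names (`runtimeLoop`, `adjustTripCountForRuntime`, `recordIteration`, `runLoop`, `selfCheck`) are hypothetical and are not DeviceRTL or OpenMPIRBuilder APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a DeviceRTL-style loop entry point.
// ASSUMPTION: the runtime adds one to the value it is given to obtain the
// number of iterations to execute.
template <typename Ty>
void runtimeLoop(void (*Body)(Ty, void *), void *Arg, Ty NumItersMinusOne) {
  Ty NumIters = NumItersMinusOne + 1;
  for (Ty IV = 0; IV < NumIters; ++IV)
    Body(IV, Arg);
}

// Caller-side adjustment, analogous to the subtraction in the patch. It is
// applied on every call, independent of the frontend that requested it.
template <typename Ty>
Ty adjustTripCountForRuntime(Ty TripCount) {
  return TripCount - 1;
}

// Loop body: records each induction variable value it is called with.
static void recordIteration(uint32_t IV, void *Arg) {
  static_cast<std::vector<uint32_t> *>(Arg)->push_back(IV);
}

// Runs the loop and returns the induction variable values that were visited.
std::vector<uint32_t> runLoop(uint32_t TripCount, bool Adjust) {
  std::vector<uint32_t> Seen;
  uint32_t Passed = Adjust ? adjustTripCountForRuntime(TripCount) : TripCount;
  runtimeLoop<uint32_t>(recordIteration, &Seen, Passed);
  return Seen;
}

// Usage check: with the adjustment the body runs exactly TripCount times;
// without it, one extra iteration is executed.
void selfCheck() {
  assert(runLoop(4, true).size() == 4);
  assert(runLoop(4, false).size() == 5);
}
```

Under that assumption the off-by-one is a property of the runtime interface, which is why a comment attributing it to clang alone could mislead.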
https://github.com/llvm/llvm-project/pull/133310