[clang] [flang] [llvm] [mlir] [OpenMP][flang] Add initial support for by-ref reductions on the GPU (PR #165714)
Kareem Ergawy via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 20 23:34:36 PST 2025
================
@@ -2591,8 +2597,49 @@ void OpenMPIRBuilder::emitReductionListCopy(
// Now that all active lanes have read the element in the
// Reduce list, shuffle over the value from the remote lane.
if (ShuffleInElement) {
- shuffleAndStore(AllocaIP, SrcElementAddr, DestElementAddr, RI.ElementType,
- RemoteLaneOffset, ReductionArrayTy);
+ Type *ShuffleType = RI.ElementType;
+ Value *ShuffleSrcAddr = SrcElementAddr;
+ Value *ShuffleDestAddr = DestElementAddr;
+ Value *Zero = ConstantInt::get(Builder.getInt32Ty(), 0);
+ AllocaInst *LocalStorage = nullptr;
+
+ if (IsByRefElem) {
+ assert(RI.ByRefElementType && "Expected by-ref element type to be set");
+ assert(RI.ByRefAllocatedType &&
+ "Expected by-ref allocated type to be set");
+ // For by-ref reductions, we need to copy from the remote lane the
+ // actual value of the partial reduction computed by that remote lane;
+ // rather than, for example, a pointer to that data or, even worse, a
+ // pointer to the descriptor of the by-ref reduction element.
+ ShuffleType = RI.ByRefElementType;
+
+ ShuffleSrcAddr = Builder.CreateGEP(RI.ByRefAllocatedType,
----------------
ergawy wrote:
> Btw, what would the default implementation be for the data_addr region, just a `yield %arg0`?

I think it should be empty. For by-value reductions, the region won't be printed and won't be used by any part of codegen.
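
To illustrate the by-ref comment in the hunk above (shuffle the remote lane's actual value, not a pointer to it or to its descriptor), here is a minimal host-side sketch. It is not the `OpenMPIRBuilder` code: `Descriptor`, `shuffleValueDown`, `reduceSum`, and `NumLanes` are hypothetical names, and the lanes are simulated with a plain array rather than real GPU shuffles.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a by-ref reduction element: a descriptor that
// points at the lane's partial result instead of holding it directly.
struct Descriptor {
  int *data; // address of the lane's partial reduction value
};

constexpr std::size_t NumLanes = 8; // assumed lane count for the simulation

// Simulated shuffle-down: lane `Lane` reads the *value* owned by lane
// `Lane + Offset` by dereferencing that lane's descriptor. Copying the
// descriptor (or its pointer) instead would be meaningless on a real GPU,
// because the pointer refers to the remote lane's private storage.
// Out-of-range source lanes fall back to the reading lane itself.
int shuffleValueDown(const std::array<Descriptor, NumLanes> &Descs,
                     std::size_t Lane, std::size_t Offset) {
  std::size_t Src = Lane + Offset < NumLanes ? Lane + Offset : Lane;
  return *Descs[Src].data;
}

// Tree reduction (sum) over the simulated lanes; returns lane 0's result.
int reduceSum(std::array<int, NumLanes> Values) {
  std::array<Descriptor, NumLanes> Descs;
  for (std::size_t I = 0; I < NumLanes; ++I)
    Descs[I] = Descriptor{&Values[I]};

  for (std::size_t Offset = NumLanes / 2; Offset > 0; Offset /= 2) {
    // First every lane reads its remote value, then every lane combines,
    // so no lane observes a partially updated neighbour.
    std::array<int, NumLanes> Remote;
    for (std::size_t Lane = 0; Lane < NumLanes; ++Lane)
      Remote[Lane] = shuffleValueDown(Descs, Lane, Offset);
    for (std::size_t Lane = 0; Lane < NumLanes; ++Lane)
      *Descs[Lane].data += Remote[Lane];
  }
  return *Descs[0].data;
}
```

Calling `reduceSum` on the values 1 through 8 leaves the full sum in lane 0; the upper lanes hold garbage partials, which is fine because only lane 0's result is consumed.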
https://github.com/llvm/llvm-project/pull/165714