[PATCH] D29879: [OpenMP] Teams reduction on the NVPTX device.
Alexey Bataev via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Feb 13 01:14:14 PST 2017
ABataev added a comment.
General note:
Be more strict with the types. For indeces it's better to use `SizeTy` or `UintPtrTy`, if `Int32Ty` is specified as type of parameter, use `Int32y`, not `IntTy`
================
Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:725
+ /*isVarArg=*/false);
+ llvm::Type *InterWarpCopyTypeParams[] = {CGM.VoidPtrTy, CGM.Int32Ty};
+ auto *InterWarpCopyFnTy =
----------------
You should use `CGM.IntTy` instead of `CGM.Int32Ty` if `(*kmp_InterWarpCopyFctPtr)` really uses `int` type as the second type.
================
Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:1121
+ auto *CurrentOffset =
+ Bld.CreateMul(Bld.getInt64(ElementSizeInChars), ScratchpadIndex);
+ auto *ScratchPadElemAbsolutePtrVal =
----------------
Maybe it's better to use `SizeTy` rather than to rely on the support of `Int64Ty`?
================
Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:1140
+ auto *CurrentOffset =
+ Bld.CreateMul(Bld.getInt64(ElementSizeInChars), ScratchpadIndex);
+ auto *ScratchPadElemAbsolutePtrVal =
----------------
Again, `SizeTy` instead of `Int64Ty`? Or `Intptr`?
================
Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:1201-1209
+ Bld.CreateMul(ScratchpadWidth, Bld.getInt64(ElementSizeInChars)));
+
+ // Take care of global memory alignment for performance
+ ScratchpadBasePtr = Bld.CreateSub(ScratchpadBasePtr, Bld.getInt64(1));
+ ScratchpadBasePtr = Bld.CreateSDiv(ScratchpadBasePtr,
+ Bld.getInt64(GlobalMemoryAlignment));
+ ScratchpadBasePtr = Bld.CreateAdd(ScratchpadBasePtr, Bld.getInt64(1));
----------------
It is better to deduce type from `ScratchpadBasePtr` rather than to rely on support of `Int64Ty`.
================
Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:1247-1256
+ /*Id=*/nullptr, C.IntTy);
+ // Row width of an element in the scratchpad array, typically
+ // the number of teams.
+ ImplicitParamDecl WidthArg(C, /*DC=*/nullptr, SourceLocation(),
+ /*Id=*/nullptr, C.IntTy);
+ // If should_reduce == 1, then it's load AND reduce,
+ // If should_reduce == 0 (or otherwise), then it only loads (+ copy).
----------------
I don't think you should use `IntTy` here, you should use `Int32Ty`
================
Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:1295-1301
+ CGF.Int64Ty);
+
+ Address AddrWidthArg = CGF.GetAddrOfLocalVar(&WidthArg);
+ llvm::Value *WidthVal =
+ Bld.CreateSExt(CGF.EmitLoadOfScalar(AddrWidthArg, /*Volatile=*/false,
+ C.IntTy, SourceLocation()),
+ CGF.Int64Ty);
----------------
Again, better to use `SizeTy`
================
Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:1309
+ llvm::Value *CumulativeElemBasePtr =
+ Bld.CreatePtrToInt(ScratchPadBase, CGM.Int64Ty);
+ Address SrcDataAddr(CumulativeElemBasePtr, CGF.getPointerAlign());
----------------
`IntPtr`
================
Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:1374-1378
+ /*Id=*/nullptr, C.IntTy);
+ // Row width of an element in the scratchpad array, typically
+ // the number of teams.
+ ImplicitParamDecl WidthArg(C, /*DC=*/nullptr, SourceLocation(),
+ /*Id=*/nullptr, C.IntTy);
----------------
`IntTy` or `Int32Ty`?
https://reviews.llvm.org/D29879
More information about the cfe-commits
mailing list