[llvm] [NVPTX] Add TMA Bulk Copy intrinsics (PR #122344)
Durgadoss R via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 10 01:47:38 PST 2025
================
@@ -3024,13 +3024,90 @@ void NVPTXDAGToDAGISel::SelectCpAsyncBulkTensorReduceCommon(SDNode *N,
ReplaceNode(N, CurDAG->getMachineNode(Opcode, DL, N->getVTList(), Ops));
}
+void NVPTXDAGToDAGISel::SelectCpAsyncBulkS2G(SDNode *N) {
+ // We have {Chain, Intrinsic-ID} followed by the actual intrinsic args:
+ // dst, src, size, cache_hint, cache_hint_flag
+ // NumOperands = {Chain, IID} + {Actual intrinsic args}
+ // = {2} + {5}
+ size_t NumOps = N->getNumOperands();
+ bool IsCacheHint = N->getConstantOperandVal(NumOps - 1) == 1;
+ size_t NumArgs = IsCacheHint ? 4 : 3; // dst, src, size, cache_hint
+
+ SDLoc DL(N);
+ SmallVector<SDValue, 8> Ops(N->ops().slice(2, NumArgs));
+ Ops.push_back(N->getOperand(0)); // Chain operand
+
+ unsigned Opcode;
+ bool IsShared32 =
+ CurDAG->getDataLayout().getPointerSizeInBits(ADDRESS_SPACE_SHARED) == 32;
+ if (IsCacheHint) {
----------------
durga4github wrote:
Resolving this
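
For readers following the operand bookkeeping in the hunk above, here is a minimal standalone sketch of the same slicing logic. It is not the code from the PR: it uses plain strings in place of SDValue, and the names Operands and buildMachineOps are hypothetical, chosen only for illustration. It assumes the node carries exactly {Chain, IID} plus the five intrinsic args listed in the comment.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an SDNode operand list (strings instead of SDValue).
using Operands = std::vector<std::string>;

// Mirrors the slicing in the hunk: skip {Chain, IID}, keep 3 or 4 of the
// intrinsic args depending on the trailing cache_hint_flag, then append the
// chain as the last operand.
Operands buildMachineOps(const Operands &nodeOps) {
  assert(nodeOps.size() == 7 && "expected {Chain, IID} + 5 intrinsic args");
  const std::size_t numOps = nodeOps.size();
  const bool isCacheHint = nodeOps[numOps - 1] == "1"; // flag is the last operand
  const std::size_t numArgs = isCacheHint ? 4 : 3;     // dst, src, size[, cache_hint]
  Operands ops(nodeOps.begin() + 2, nodeOps.begin() + 2 + numArgs);
  ops.push_back(nodeOps[0]); // chain goes last
  return ops;
}
```

With the flag set to 1 the result is {dst, src, size, cache_hint, chain}; with the flag set to 0 the cache_hint operand is dropped and the result is {dst, src, size, chain}.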
https://github.com/llvm/llvm-project/pull/122344