[llvm] [NVPTX] Add TMA Bulk Copy Intrinsics (PR #138679)
Artem Belevich via llvm-commits
llvm-commits at lists.llvm.org
Thu May 8 11:59:00 PDT 2025
================
@@ -2720,28 +2720,46 @@ void NVPTXDAGToDAGISel::SelectCpAsyncBulkTensorReduceCommon(SDNode *N,
ReplaceNode(N, CurDAG->getMachineNode(Opcode, DL, N->getVTList(), Ops));
}
-void NVPTXDAGToDAGISel::SelectCpAsyncBulkS2G(SDNode *N) {
+void NVPTXDAGToDAGISel::SelectCpAsyncBulkS2GCommon(SDNode *N, bool HasMask) {
----------------
Artem-B wrote:
> I can't seem to figure out if there is a way to just ignore the cache-hint operand when the final operand is 0.
You should be able to match a call with a constant argument. I think you can write the constant literally in the pattern, for example:
https://github.com/llvm/llvm-project/blob/ae6e1276233ca541fdb2be1dde3074eb78277859/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td#L1672
https://github.com/llvm/llvm-project/blob/ae6e1276233ca541fdb2be1dde3074eb78277859/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td#L964
In the worst case you could use `PatLeaf`, like we do for matching FP constants: https://github.com/llvm/llvm-project/blob/ae6e1276233ca541fdb2be1dde3074eb78277859/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td#L14
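
To illustrate the idea outside of TableGen, here is a rough, self-contained C++ toy model of "match the literal constant first, fall back to the general form". It is only a sketch: it does not use any LLVM API, and `Operand`, `BulkCopyCall`, `selectInstruction`, and the instruction names `BULK_COPY_NO_HINT` / `BULK_COPY_WITH_HINT` are all hypothetical names made up for this example. The assumption is that the final operand is the flag that says whether the cache hint is meaningful.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-in for an intrinsic operand: it either carries a
// compile-time constant or is a value only known at run time.
struct Operand {
  std::optional<std::int64_t> Constant; // set only for immediates
};

// Helpers to build the two kinds of operand.
Operand imm(std::int64_t V) { return Operand{V}; }
Operand runtimeValue() { return Operand{std::nullopt}; }

// Hypothetical stand-in for the call: a cache-hint value followed by a
// final flag operand saying whether the hint should be used.
struct BulkCopyCall {
  Operand CacheHint;
  Operand UseCacheHint;
};

// One selection pattern: a predicate plus the instruction it selects.
struct Pattern {
  bool (*Matches)(const BulkCopyCall &);
  const char *Instruction;
};

// Matches only when the final operand is literally the constant 0.
static bool flagIsLiteralZero(const BulkCopyCall &C) {
  return C.UseCacheHint.Constant && *C.UseCacheHint.Constant == 0;
}

// General fallback: matches any call.
static bool matchesAnything(const BulkCopyCall &) { return true; }

// Try the most specific pattern first, so the literal-0 form wins and the
// cache-hint operand is simply never looked at in that case.
std::string selectInstruction(const BulkCopyCall &Call) {
  static const Pattern Patterns[] = {
      {flagIsLiteralZero, "BULK_COPY_NO_HINT"},
      {matchesAnything, "BULK_COPY_WITH_HINT"},
  };
  for (const Pattern &P : Patterns)
    if (P.Matches(Call))
      return P.Instruction;
  return "";
}
```

With this shape, a call whose flag is the immediate 0 selects the no-hint form regardless of what the cache-hint operand holds; anything else falls through to the form that takes the hint.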
https://github.com/llvm/llvm-project/pull/138679