[llvm] [NVPTX] Add TMA Bulk Copy Intrinsics (PR #138679)

Artem Belevich via llvm-commits llvm-commits at lists.llvm.org
Thu May 8 11:59:00 PDT 2025


================
@@ -2720,28 +2720,46 @@ void NVPTXDAGToDAGISel::SelectCpAsyncBulkTensorReduceCommon(SDNode *N,
   ReplaceNode(N, CurDAG->getMachineNode(Opcode, DL, N->getVTList(), Ops));
 }
 
-void NVPTXDAGToDAGISel::SelectCpAsyncBulkS2G(SDNode *N) {
+void NVPTXDAGToDAGISel::SelectCpAsyncBulkS2GCommon(SDNode *N, bool HasMask) {
----------------
Artem-B wrote:

> I can't seem to figure out if there is a way to just ignore the cache-hint operand when the final operand is 0.

You should be able to match a call with a constant argument. I think you can write the constant literally in the pattern:
https://github.com/llvm/llvm-project/blob/ae6e1276233ca541fdb2be1dde3074eb78277859/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td#L1672
https://github.com/llvm/llvm-project/blob/ae6e1276233ca541fdb2be1dde3074eb78277859/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td#L964

In the worst case you could use `PatLeaf`, as we do for matching FP constants: https://github.com/llvm/llvm-project/blob/ae6e1276233ca541fdb2be1dde3074eb78277859/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td#L14
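
A rough sketch of the two options above, in case it helps. This is untested, and every name in it (`int_example_bulk_s2g`, `EXAMPLE_BULK_S2G`, `immZero`) plus the operand list is a hypothetical placeholder, not something defined in this PR or in `NVPTXIntrinsics.td`:

```tablegen
// Illustrative fragment only; all intrinsic/instruction names are hypothetical.
// Assumed intrinsic operands: (dst, src, size, cache_hint, i1 use_cache_hint).

// Option 1: match the constant literally. When the trailing flag is the
// constant 0, select the variant without a cache hint; $ch is matched on
// the input side but simply not used in the output.
def : Pat<(int_example_bulk_s2g i64:$dst, i32:$src, i32:$size, i64:$ch, (i1 0)),
          (EXAMPLE_BULK_S2G $dst, $src, $size)>;

// Option 2: a named leaf that accepts only a zero integer immediate,
// analogous in spirit to the FP leaf linked above.
def immZero : PatLeaf<(imm), [{ return N->isZero(); }]>;
```

If the flag is declared `immarg`, the leaf would probably need to be built on `timm` rather than `imm`; I have not checked that for this intrinsic.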



https://github.com/llvm/llvm-project/pull/138679

