[all-commits] [llvm/llvm-project] fa7f0e: [NVPTX] Add Bulk Copy Prefetch Intrinsics (#123226)
Abhilash Majumder via All-commits
all-commits at lists.llvm.org
Thu Jan 23 03:20:06 PST 2025
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: fa7f0e582bc25a91d89dab7c488a1619060f9bef
https://github.com/llvm/llvm-project/commit/fa7f0e582bc25a91d89dab7c488a1619060f9bef
Author: Abhilash Majumder <30946547+abhilash1910 at users.noreply.github.com>
Date: 2025-01-23 (Thu, 23 Jan 2025)
Changed paths:
M llvm/docs/NVPTXUsage.rst
M llvm/include/llvm/IR/IntrinsicsNVVM.td
M llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp
M llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.h
M llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
M llvm/test/CodeGen/NVPTX/cp-async-bulk.ll
Log Message:
-----------
[NVPTX] Add Bulk Copy Prefetch Intrinsics (#123226)
This patch adds NVVM intrinsics and NVPTX codegen for the
cp.async.bulk.prefetch.L2.* variants.
- These intrinsics optionally take a cache hint; a boolean flag
  argument indicates whether the hint is valid.
- Lit tests covering all combinations of these intrinsics are added in
  cp-async-bulk.ll.
- The generated PTX is verified with a 12.3 ptxas executable.
- Docs for these intrinsics are added to NVPTXUsage.rst.
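As a rough illustration of how the boolean flag relates to the emitted PTX, the sketch below builds the instruction text for both cases. It is not the code from this patch: `prefetchPtx` is a hypothetical helper, the register names are placeholders, and the instruction shape is assumed from the PTX spec syntax `cp.async.bulk.prefetch.L2.global{.L2::cache_hint} [srcMem], size{, cache-policy}`.

```cpp
#include <string>

// Hypothetical helper, not part of LLVM: returns the PTX text that a bulk
// prefetch would correspond to, depending on whether a cache hint is used.
// %rd1 (source address), %r1 (size) and %rd2 (cache policy) are placeholder
// register names chosen for illustration only.
std::string prefetchPtx(bool hasCacheHint) {
  std::string ptx = "cp.async.bulk.prefetch.L2.global";
  if (hasCacheHint)
    ptx += ".L2::cache_hint";  // qualifier present only with a valid hint
  ptx += " [%rd1], %r1";       // source address and size operands
  if (hasCacheHint)
    ptx += ", %rd2";           // cache-policy operand only when flag is set
  return ptx + ";";
}
```

With the flag cleared, the sketch yields the plain `.global` form with two operands; with the flag set, it appends the `.L2::cache_hint` qualifier and the extra cache-policy operand.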
PTX Spec reference:
https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cp-async-bulk-prefetch
Co-authored-by: abmajumder <abmajumder at nvidia.com>