[all-commits] [llvm/llvm-project] 735209: [NVPTX] Unify and extend barrier{.cta} intrinsic s...
Alex MacLean via All-commits
all-commits at lists.llvm.org
Wed May 21 08:14:37 PDT 2025
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 735209c0688b10a66c24750422b35d8c2ad01bb5
https://github.com/llvm/llvm-project/commit/735209c0688b10a66c24750422b35d8c2ad01bb5
Author: Alex MacLean <amaclean at nvidia.com>
Date: 2025-05-21 (Wed, 21 May 2025)
Changed paths:
M clang/lib/CodeGen/TargetBuiltins/NVPTX.cpp
M clang/test/CodeGen/builtins-nvptx-ptx60.cu
M clang/test/CodeGen/builtins-nvptx.c
M clang/test/Headers/gpuintrin.c
M llvm/docs/NVPTXUsage.rst
M llvm/include/llvm/IR/IntrinsicsNVVM.td
M llvm/lib/IR/AutoUpgrade.cpp
M llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
M llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
M llvm/lib/Transforms/IPO/AttributorAttributes.cpp
M llvm/test/Analysis/GlobalsModRef/functions_without_nosync.ll
M llvm/test/Assembler/auto_upgrade_nvvm_intrinsics.ll
M llvm/test/CodeGen/NVPTX/barrier.ll
R llvm/test/CodeGen/NVPTX/named-barriers.ll
M llvm/test/CodeGen/NVPTX/noduplicate-syncthreads.ll
M llvm/test/Feature/intrinsic-noduplicate.ll
M llvm/test/Transforms/FunctionAttrs/convergent.ll
M llvm/test/Transforms/JumpThreading/thread-two-bbs-cuda.ll
M llvm/test/Transforms/OpenMP/barrier_removal.ll
M mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
M mlir/test/Target/LLVMIR/Import/nvvmir.ll
M mlir/test/Target/LLVMIR/nvvmir.mlir
Log Message:
-----------
[NVPTX] Unify and extend barrier{.cta} intrinsic support (#140615)
Our current support for barrier intrinsics is confusing and incomplete:
multiple intrinsics map to the same instruction, and the intrinsic names
do not clearly convey their semantics. We also lack support for some
variants. This change unifies the IR representation into a single,
consistently named set of intrinsics:
- llvm.nvvm.barrier.cta.sync.aligned.all(i32)
- llvm.nvvm.barrier.cta.sync.aligned(i32, i32)
- llvm.nvvm.barrier.cta.arrive.aligned(i32, i32)
- llvm.nvvm.barrier.cta.sync.all(i32)
- llvm.nvvm.barrier.cta.sync(i32, i32)
- llvm.nvvm.barrier.cta.arrive(i32, i32)
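
The six names above follow one pattern: an operation (sync or arrive), an optional ".aligned" suffix, and an optional ".all" suffix that appears only on the sync forms. A minimal sketch of that pattern follows. The enum and function are hypothetical helpers written for illustration and are not part of LLVM. Reading the first i32 as the barrier id and the second as a thread count is an assumption based on PTX barrier semantics; the commit message does not state it.

```cpp
#include <optional>
#include <string>

// Hypothetical helper, not LLVM code: composes an intrinsic name from the
// three choices visible in the list above.
enum class BarrierOp { Sync, Arrive };

std::optional<std::string> barrierIntrinsicName(BarrierOp op, bool aligned,
                                                bool allThreads) {
  // The list contains no ".all" form for arrive, so reject that combination.
  if (op == BarrierOp::Arrive && allThreads)
    return std::nullopt;

  std::string name = "llvm.nvvm.barrier.cta.";
  name += (op == BarrierOp::Sync) ? "sync" : "arrive";
  if (aligned)
    name += ".aligned";
  // Assumption: ".all" forms take one operand (i32), the others take two
  // (i32, i32), matching the signatures listed above.
  if (allThreads)
    name += ".all";
  return name;
}
```

Under these assumptions, every valid combination of the three choices yields exactly one of the six listed names, and the arrive/all combination yields none.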
The following Auto-Upgrade rules are used to maintain compatibility with
IR using the legacy intrinsics:
* llvm.nvvm.barrier0 --> llvm.nvvm.barrier.cta.sync.aligned.all(0)
* llvm.nvvm.barrier.n --> llvm.nvvm.barrier.cta.sync.aligned.all(x)
* llvm.nvvm.bar.sync --> llvm.nvvm.barrier.cta.sync.aligned.all(x)
* llvm.nvvm.barrier --> llvm.nvvm.barrier.cta.sync.aligned(x, y)
* llvm.nvvm.barrier.sync --> llvm.nvvm.barrier.cta.sync.all(x)
* llvm.nvvm.barrier.sync.cnt --> llvm.nvvm.barrier.cta.sync(x, y)
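
The rules above amount to a small lookup table from legacy names to unified names. The sketch below restates that table as data. It is not the code in llvm/lib/IR/AutoUpgrade.cpp: the struct, array, and function names are hypothetical, and the only operand change it records is the constant 0 that the barrier0 rule supplies.

```cpp
#include <optional>
#include <string_view>

// Hypothetical table entry, not LLVM code.
struct BarrierUpgrade {
  std::string_view legacy;       // legacy intrinsic name
  std::string_view replacement;  // unified intrinsic name
  bool addsZeroOperand;          // true only for barrier0, which becomes (0)
};

// One entry per Auto-Upgrade rule listed in the commit message.
constexpr BarrierUpgrade kBarrierUpgrades[] = {
    {"llvm.nvvm.barrier0", "llvm.nvvm.barrier.cta.sync.aligned.all", true},
    {"llvm.nvvm.barrier.n", "llvm.nvvm.barrier.cta.sync.aligned.all", false},
    {"llvm.nvvm.bar.sync", "llvm.nvvm.barrier.cta.sync.aligned.all", false},
    {"llvm.nvvm.barrier", "llvm.nvvm.barrier.cta.sync.aligned", false},
    {"llvm.nvvm.barrier.sync", "llvm.nvvm.barrier.cta.sync.all", false},
    {"llvm.nvvm.barrier.sync.cnt", "llvm.nvvm.barrier.cta.sync", false},
};

// Returns the matching rule, or std::nullopt if the name is not a legacy
// barrier intrinsic covered by the list above.
std::optional<BarrierUpgrade> lookupBarrierUpgrade(std::string_view name) {
  for (const BarrierUpgrade &rule : kBarrierUpgrades)
    if (rule.legacy == name)
      return rule;
  return std::nullopt;
}
```

The table shows the redundancy the change removes: three legacy names (barrier0, barrier.n, bar.sync) collapse into the single sync.aligned.all form.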