[clang] [llvm] [NVPTX] Revamp NVVMIntrRange pass (PR #94422)
Alex MacLean via cfe-commits
cfe-commits at lists.llvm.org
Wed Jun 5 15:07:09 PDT 2024
================
@@ -0,0 +1,60 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-attributes --version 5
+; RUN: opt < %s -S -mtriple=nvptx-nvidia-cuda -mcpu=sm_20 -passes=nvvm-intr-range | FileCheck %s
+
+define i32 @test_maxntid() {
+; CHECK-LABEL: define i32 @test_maxntid(
+; CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT: [[TMP1:%.*]] = call range(i32 0, 96) i32 @llvm.nvvm.read.ptx.sreg.tid.x()
+; CHECK-NEXT: [[TMP2:%.*]] = call range(i32 0, 64) i32 @llvm.nvvm.read.ptx.sreg.tid.z()
+; CHECK-NEXT: [[TMP4:%.*]] = call range(i32 1, 97) i32 @llvm.nvvm.read.ptx.sreg.ntid.y()
+; CHECK-NEXT: [[TMP3:%.*]] = add i32 [[TMP1]], [[TMP2]]
+; CHECK-NEXT: [[TMP5:%.*]] = add i32 [[TMP3]], [[TMP4]]
+; CHECK-NEXT: ret i32 [[TMP5]]
+;
+ %1 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
+ %2 = call i32 @llvm.nvvm.read.ptx.sreg.tid.z()
+ %3 = call i32 @llvm.nvvm.read.ptx.sreg.ntid.y()
----------------
AlexMaclean wrote:
Added all the variants.
I've removed SM logic so I'm not sure if there is anything else you'd like me to change?
https://github.com/llvm/llvm-project/pull/94422
More information about the cfe-commits
mailing list