[clang] [llvm] [NVPTX] Revamp NVVMIntrRange pass (PR #94422)

Alex MacLean via cfe-commits cfe-commits at lists.llvm.org
Wed Jun 5 15:07:09 PDT 2024


================
@@ -0,0 +1,60 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-attributes --version 5
+; RUN: opt < %s -S -mtriple=nvptx-nvidia-cuda -mcpu=sm_20 -passes=nvvm-intr-range | FileCheck %s
+
+define i32 @test_maxntid() {
+; CHECK-LABEL: define i32 @test_maxntid(
+; CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:    [[TMP1:%.*]] = call range(i32 0, 96) i32 @llvm.nvvm.read.ptx.sreg.tid.x()
+; CHECK-NEXT:    [[TMP2:%.*]] = call range(i32 0, 64) i32 @llvm.nvvm.read.ptx.sreg.tid.z()
+; CHECK-NEXT:    [[TMP4:%.*]] = call range(i32 1, 97) i32 @llvm.nvvm.read.ptx.sreg.ntid.y()
+; CHECK-NEXT:    [[TMP3:%.*]] = add i32 [[TMP1]], [[TMP2]]
+; CHECK-NEXT:    [[TMP5:%.*]] = add i32 [[TMP3]], [[TMP4]]
+; CHECK-NEXT:    ret i32 [[TMP5]]
+;
+  %1 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
+  %2 = call i32 @llvm.nvvm.read.ptx.sreg.tid.z()
+  %3 = call i32 @llvm.nvvm.read.ptx.sreg.ntid.y()
----------------
AlexMaclean wrote:

Added all the variants. 

I've removed SM logic so I'm not sure if there is anything else you'd like me to change?

https://github.com/llvm/llvm-project/pull/94422


More information about the cfe-commits mailing list