[clang] [llvm] [NVPTX] Add intrinsics for redux.sync f32 instructions (PR #126664)

Tue Feb 11 01:00:40 PST 2025

================
@@ -1,11 +1,13 @@
-// RUN: %clang_cc1 "-triple" "nvptx-nvidia-cuda" "-target-feature" "+ptx70" "-target-cpu" "sm_80" -emit-llvm -fcuda-is-device -o - %s | FileCheck %s
-// RUN: %clang_cc1 "-triple" "nvptx64-nvidia-cuda" "-target-feature" "+ptx70" "-target-cpu" "sm_80" -emit-llvm -fcuda-is-device -o - %s | FileCheck %s
+// RUN: %clang_cc1 "-triple" "nvptx-nvidia-cuda" "-target-feature" "+ptx86" "-target-cpu" "sm_100a" -emit-llvm -fcuda-is-device -o - %s | FileCheck %s
+// RUN: %clang_cc1 "-triple" "nvptx64-nvidia-cuda" "-target-feature" "+ptx86" "-target-cpu" "sm_100a" -emit-llvm -fcuda-is-device -o - %s | FileCheck %s
----------------
durga4github wrote:

Let us keep the existing file and the tests intact. We need them for ptx70/sm_80.

Can we add a separate redux-f32-builtins.cu file containing only the new additions from this change?
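
A rough skeleton of what such a separate file could look like, as a sketch only. The RUN lines are copied from the `+` lines in the hunk above. The kernel name `kernel_f32`, its placeholder body, and the single CHECK-LABEL line are hypothetical stand-ins, not taken from this PR; the calls to the new f32 builtins and their CHECK lines would replace the placeholder.

```cuda
// Hypothetical skeleton for redux-f32-builtins.cu (not the actual file from the PR).
// RUN: %clang_cc1 "-triple" "nvptx-nvidia-cuda" "-target-feature" "+ptx86" "-target-cpu" "sm_100a" -emit-llvm -fcuda-is-device -o - %s | FileCheck %s
// RUN: %clang_cc1 "-triple" "nvptx64-nvidia-cuda" "-target-feature" "+ptx86" "-target-cpu" "sm_100a" -emit-llvm -fcuda-is-device -o - %s | FileCheck %s

// Placeholder kernel: the name and body are assumptions for illustration.
// CHECK-LABEL: define{{.*}} void @_Z10kernel_f32Pf(
__attribute__((global)) void kernel_f32(float *out) {
  // Calls to the new redux.sync f32 builtins and their matching CHECK lines
  // would go here; the existing ptx70/sm_80 test file stays untouched.
  out[0] = 0.0f;
}
```

Splitting the tests this way would keep the ptx70/sm_80 RUN lines exercising the existing integer builtins, while only the new file requires ptx86/sm_100a.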

https://github.com/llvm/llvm-project/pull/126664