[clang] [AMDGPU] Add clang builtin for generic AMDGPU shuffle (PR #185302)
via cfe-commits
cfe-commits at lists.llvm.org
Sun Mar 8 09:17:55 PDT 2026
llvmbot wrote:
@llvm/pr-subscribers-backend-amdgpu
Author: Joseph Huber (jhuber6)
<details>
<summary>Changes</summary>
Summary:
AMDGPU introduced a high-level intrinsic for wave shuffles,
llvm.amdgcn.wave.shuffle. Its main advantage over the ds_bpermute path
is that it is lowered correctly for both wave32 and wave64, and it takes
a lane index directly instead of a byte offset (lane index times four).
This PR adds '__builtin_amdgcn_wave_shuffle' to expose it in Clang.
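
To illustrate the indexing difference mentioned above, here is a minimal host-side sketch that models both conventions on a plain array. It is not device code and not part of this PR: `Wave`, `model_bpermute`, and `model_wave_shuffle` are hypothetical names, the (value, index) argument order for the shuffle is an assumption based on the test in the diff, and the modulo wrap only keeps the model in bounds rather than describing hardware behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side model only: a "wave" is one 32-bit value per lane.
using Wave = std::vector<std::int32_t>;

// Models the ds_bpermute convention: each lane supplies a byte address,
// that is, the source lane index multiplied by four.
Wave model_bpermute(const Wave &byte_addr, const Wave &data) {
  const std::size_t n = data.size();
  Wave out(n);
  for (std::size_t lane = 0; lane < n; ++lane) {
    // Convert the byte address back to a lane index. The modulo is an
    // assumption of this model to stay in bounds.
    const std::size_t src =
        (static_cast<std::uint32_t>(byte_addr[lane]) / 4u) % n;
    out[lane] = data[src];
  }
  return out;
}

// Models the shuffle convention: each lane supplies the source lane
// index directly, with no scaling by four.
Wave model_wave_shuffle(const Wave &data, const Wave &index) {
  const std::size_t n = data.size();
  Wave out(n);
  for (std::size_t lane = 0; lane < n; ++lane) {
    // Same in-bounds assumption as above.
    const std::size_t src = static_cast<std::uint32_t>(index[lane]) % n;
    out[lane] = data[src];
  }
  return out;
}
```

In this model, reversing a four-lane wave needs indices `{3, 2, 1, 0}` for the shuffle but byte addresses `{12, 8, 4, 0}` for the bpermute form. On device, the corresponding calls would presumably be `__builtin_amdgcn_wave_shuffle(value, lane)` and `__builtin_amdgcn_ds_bpermute(lane * 4, value)`.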
---
Full diff: https://github.com/llvm/llvm-project/pull/185302.diff
4 Files Affected:
- (modified) clang/include/clang/Basic/BuiltinsAMDGPU.td (+1)
- (modified) clang/lib/CIR/CodeGen/CIRGenBuiltinAMDGPU.cpp (+2-1)
- (modified) clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp (+3)
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn.cl (+7)
``````````diff
diff --git a/clang/include/clang/Basic/BuiltinsAMDGPU.td b/clang/include/clang/Basic/BuiltinsAMDGPU.td
index acd0a34a79253..285533a6b8fb8 100644
--- a/clang/include/clang/Basic/BuiltinsAMDGPU.td
+++ b/clang/include/clang/Basic/BuiltinsAMDGPU.td
@@ -213,6 +213,7 @@ def __builtin_amdgcn_ds_permute : AMDGPUBuiltin<"int(int, int)", [Const]>;
def __builtin_amdgcn_ds_bpermute : AMDGPUBuiltin<"int(int, int)", [Const]>;
def __builtin_amdgcn_readfirstlane : AMDGPUBuiltin<"int(int)", [Const]>;
def __builtin_amdgcn_readlane : AMDGPUBuiltin<"int(int, int)", [Const]>;
+def __builtin_amdgcn_wave_shuffle : AMDGPUBuiltin<"int(int, int)", [Const]>;
def __builtin_amdgcn_fmed3f : AMDGPUBuiltin<"float(float, float, float)", [Const]>;
def __builtin_amdgcn_ds_faddf : AMDGPUBuiltin<"float(float address_space<3> *, float, _Constant int, _Constant int, _Constant bool)">;
def __builtin_amdgcn_ds_fminf : AMDGPUBuiltin<"float(float address_space<3> *, float, _Constant int, _Constant int, _Constant bool)">;
diff --git a/clang/lib/CIR/CodeGen/CIRGenBuiltinAMDGPU.cpp b/clang/lib/CIR/CodeGen/CIRGenBuiltinAMDGPU.cpp
index b4b0c455904fc..ffbfe669510a8 100644
--- a/clang/lib/CIR/CodeGen/CIRGenBuiltinAMDGPU.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenBuiltinAMDGPU.cpp
@@ -78,7 +78,8 @@ CIRGenFunction::emitAMDGPUBuiltinExpr(unsigned builtinId,
return mlir::Value{};
}
case AMDGPU::BI__builtin_amdgcn_readlane:
- case AMDGPU::BI__builtin_amdgcn_readfirstlane: {
+ case AMDGPU::BI__builtin_amdgcn_readfirstlane:
+ case AMDGPU::BI__builtin_amdgcn_wave_shuffle: {
cgm.errorNYI(expr->getSourceRange(),
std::string("unimplemented AMDGPU builtin call: ") +
getContext().BuiltinInfo.getName(builtinId));
diff --git a/clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp b/clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
index 72d5cb8040119..f4eaece58faa7 100644
--- a/clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+++ b/clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
@@ -554,6 +554,9 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID,
case AMDGPU::BI__builtin_amdgcn_readlane:
return emitBuiltinWithOneOverloadedType<2>(*this, E,
Intrinsic::amdgcn_readlane);
+ case AMDGPU::BI__builtin_amdgcn_wave_shuffle:
+ return emitBuiltinWithOneOverloadedType<2>(*this, E,
+ Intrinsic::amdgcn_wave_shuffle);
case AMDGPU::BI__builtin_amdgcn_readfirstlane:
return emitBuiltinWithOneOverloadedType<1>(*this, E,
Intrinsic::amdgcn_readfirstlane);
diff --git a/clang/test/CodeGenOpenCL/builtins-amdgcn.cl b/clang/test/CodeGenOpenCL/builtins-amdgcn.cl
index 376105cb6594c..ea29657aaf623 100644
--- a/clang/test/CodeGenOpenCL/builtins-amdgcn.cl
+++ b/clang/test/CodeGenOpenCL/builtins-amdgcn.cl
@@ -314,6 +314,13 @@ void test_readlane(global int* out, int a, int b)
*out = __builtin_amdgcn_readlane(a, b);
}
+// CHECK-LABEL: @test_wave_shuffle
+// CHECK: {{.*}}call{{.*}} i32 @llvm.amdgcn.wave.shuffle.i32(i32 %a, i32 %b)
+void test_wave_shuffle(global int* out, int a, int b)
+{
+ *out = __builtin_amdgcn_wave_shuffle(a, b);
+}
+
// CHECK-LABEL: @test_fcmp_f32
// CHECK: {{.*}}call{{.*}} i64 @llvm.amdgcn.fcmp.i64.f32(float %a, float %b, i32 5)
void test_fcmp_f32(global ulong* out, float a, float b)
``````````
</details>
https://github.com/llvm/llvm-project/pull/185302