[clang] [libc] [llvm] [libc] Implement (v|f)printf on the GPU (PR #96369)
Jon Chesterfield via cfe-commits
cfe-commits at lists.llvm.org
Mon Jul 1 05:57:22 PDT 2024
================
@@ -0,0 +1,77 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5
+// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -emit-llvm -o - %s | FileCheck %s
+
+extern void varargs_simple(int, ...);
+
+// CHECK-LABEL: define dso_local void @foo(
+// CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT: [[ENTRY:.*:]]
+// CHECK-NEXT: [[C:%.*]] = alloca i8, align 1
+// CHECK-NEXT: [[S:%.*]] = alloca i16, align 2
+// CHECK-NEXT: [[I:%.*]] = alloca i32, align 4
+// CHECK-NEXT: [[L:%.*]] = alloca i64, align 8
+// CHECK-NEXT: [[F:%.*]] = alloca float, align 4
+// CHECK-NEXT: [[D:%.*]] = alloca double, align 8
+// CHECK-NEXT: [[A:%.*]] = alloca [[STRUCT_ANON:%.*]], align 4
+// CHECK-NEXT: [[V:%.*]] = alloca <4 x i32>, align 16
+// CHECK-NEXT: store i8 1, ptr [[C]], align 1
+// CHECK-NEXT: store i16 1, ptr [[S]], align 2
+// CHECK-NEXT: store i32 1, ptr [[I]], align 4
+// CHECK-NEXT: store i64 1, ptr [[L]], align 8
+// CHECK-NEXT: store float 1.000000e+00, ptr [[F]], align 4
+// CHECK-NEXT: store double 1.000000e+00, ptr [[D]], align 8
+// CHECK-NEXT: [[TMP0:%.*]] = load i8, ptr [[C]], align 1
+// CHECK-NEXT: [[CONV:%.*]] = sext i8 [[TMP0]] to i32
----------------
JonChesterfield wrote:
C promotes them to i32: under the default argument promotions, char and short values passed through the ellipsis are widened to int. C has a lot of rules around vararg type promotion that have not aged brilliantly.
If you want an i8 or i16, put it in a struct. C doesn't say anything about promoting struct members, and amdgpu will pass the value inline as part of the struct, i.e. with no overhead. I believe nvptx will do likewise.
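
To illustrate the point above, here is a minimal sketch in plain host C (not GPU-specific and not taken from the PR). The names sum_promoted, sum_structs and struct narrow are hypothetical, chosen only for this example. The first function has to read char and short arguments back as int because of the default argument promotions; the second receives them unpromoted inside a struct passed by value.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical wrapper: the members keep their narrow types when the
 * struct is passed by value through the ellipsis. */
struct narrow {
    char c;
    short s;
};

/* Sums `count` variadic integer arguments. Any char or short the caller
 * passes has already been promoted to int, so int is the only valid
 * type to request here; va_arg(ap, char) would be undefined behaviour. */
int sum_promoted(int count, ...) {
    assert(count >= 0);
    va_list ap;
    va_start(ap, count);
    int total = 0;
    for (int i = 0; i < count; ++i)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* Sums the members of `count` structs passed by value. The default
 * argument promotions do not apply to struct members. */
int sum_structs(int count, ...) {
    assert(count >= 0);
    va_list ap;
    va_start(ap, count);
    int total = 0;
    for (int i = 0; i < count; ++i) {
        struct narrow n = va_arg(ap, struct narrow);
        total += n.c + n.s;
    }
    va_end(ap);
    return total;
}
```

A call such as sum_promoted(2, c, s) with a char c and a short s passes two ints, whereas sum_structs(1, n) passes one struct narrow. How that struct is laid out in the variadic argument area is up to the target ABI; the sketch only shows the language-level difference.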
https://github.com/llvm/llvm-project/pull/96369