[clang] [clang] Always pass fp128 arguments indirectly on Windows (PR #115052)
Trevor Gross via cfe-commits
cfe-commits at lists.llvm.org
Tue Nov 5 11:24:00 PST 2024
https://github.com/tgross35 created https://github.com/llvm/llvm-project/pull/115052
Clang currently passes and returns `__float128` in vector registers on MinGW targets. However, the Windows x86-64 calling convention [1] states the following:
> __m128 types, arrays, and strings are never passed by immediate value. Instead, a pointer is passed to memory allocated by the caller. Structs and unions of size 8, 16, 32, or 64 bits, and __m64 types, are passed as if they were integers of the same size. Structs or unions of other sizes are passed as a pointer to memory allocated by the caller. For these aggregate types passed as a pointer, including __m128, the caller-allocated temporary memory must be 16-byte aligned.
Based on the above it sounds like `__float128` should be passed indirectly; this is what MinGW GCC already does, so change Clang to match. Passing by value causes problems with varargs. E.g. the below completes successfully when built with GCC but has a runtime crash when built with Clang:
```c
void va_f128(int count, ...) {
va_list args;
va_start(args, count);
__float128 val = va_arg(args, __float128);
va_end(args);
}
int main() {
va_f128(0, 0.0);
}
```
This patch fixes the above. It also resolves crashes when calling GCC-built f128 libcalls.
Regarding return values, the documentation states:
> A scalar return value that can fit into 64 bits, including the __m64 type, is returned through RAX. Non-scalar types including floats, doubles, and vector types such as __m128, __m128i, __m128d are returned in XMM0.
This makes it sound like it should be acceptable to return `__float128` in XMM0. However, GCC returns `__float128` on the stack, so do the same here to be consistent.
Clang's MSVC targets do not support `__float128` or `_Float128`, but these changes would also apply there if it is eventually enabled.
[1]: https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170
>From 432e8a66156f08d45ad691017255364cfb0fd947 Mon Sep 17 00:00:00 2001
From: Trevor Gross <tmgross at umich.edu>
Date: Tue, 5 Nov 2024 07:00:35 -0500
Subject: [PATCH 1/2] [clang] Add fp128 ABI tests for MinGW (NFC)
Duplicate `win64-i128.c` to `win64-fp128.c` and update with the current
behavior of `__float128`.
---
clang/test/CodeGen/win64-fp128.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
create mode 100644 clang/test/CodeGen/win64-fp128.c
diff --git a/clang/test/CodeGen/win64-fp128.c b/clang/test/CodeGen/win64-fp128.c
new file mode 100644
index 00000000000000..33e2441ddf3146
--- /dev/null
+++ b/clang/test/CodeGen/win64-fp128.c
@@ -0,0 +1,20 @@
+// RUN: %clang_cc1 -triple x86_64-windows-gnu -emit-llvm -o - %s \
+// RUN: | FileCheck %s --check-prefix=GNU64
+// __float128 is unsupported on MSVC
+
+__float128 fp128_ret(void) { return 0; }
+// GNU64: define dso_local fp128 @fp128_ret()
+
+__float128 fp128_args(__float128 a, __float128 b) { return a * b; }
+// GNU64: define dso_local fp128 @fp128_args(fp128 noundef %a, fp128 noundef %b)
+
+void fp128_vararg(int a, ...) {
+ // GNU64-LABEL: define dso_local void @fp128_vararg
+ __builtin_va_list ap;
+ __builtin_va_start(ap, a);
+ __float128 i = __builtin_va_arg(ap, __float128);
+ // movaps xmm0, xmmword ptr [rax]
+ // GNU64: load ptr, ptr
+ // GNU64: load fp128, ptr
+ __builtin_va_end(ap);
+}
>From bacafd9466c37109d059f8de356b697934e493a3 Mon Sep 17 00:00:00 2001
From: Trevor Gross <tmgross at umich.edu>
Date: Tue, 5 Nov 2024 05:53:10 -0500
Subject: [PATCH 2/2] [clang] Always pass `fp128` arguments indirectly on
Windows
Clang currently passes and returns `__float128` in vector registers on
MinGW targets. However, the Windows x86-64 calling convention [1] states
the following:
__m128 types, arrays, and strings are never passed by immediate
value. Instead, a pointer is passed to memory allocated by the
caller. Structs and unions of size 8, 16, 32, or 64 bits, and __m64
types, are passed as if they were integers of the same size. Structs
or unions of other sizes are passed as a pointer to memory allocated
by the caller. For these aggregate types passed as a pointer,
including __m128, the caller-allocated temporary memory must be
16-byte aligned.
Based on the above it sounds like `__float128` should be passed
indirectly; this is what MinGW GCC already does, so change Clang to
match. Passing by value causes problems with varargs. E.g. the below
completes successfully when built with GCC but has a runtime crash when
built with Clang:
void va_f128(int count, ...) {
va_list args;
va_start(args, count);
__float128 val = va_arg(args, __float128);
va_end(args);
}
int main() {
va_f128(0, 0.0);
}
This patch fixes the above. It also resolves crashes when calling
GCC-built f128 libcalls.
Regarding return values, the documentation states:
A scalar return value that can fit into 64 bits, including the __m64
type, is returned through RAX. Non-scalar types including floats,
doubles, and vector types such as __m128, __m128i, __m128d are
returned in XMM0.
This makes it sound like it should be acceptable to return `__float128`
in XMM0. However, GCC returns `__float128` on the stack, so do the same
here to be consistent.
Clang's MSVC targets do not support `__float128` or `_Float128`, but
these changes would also apply there if it is eventually enabled.
[1]: https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170
---
clang/lib/CodeGen/Targets/X86.cpp | 5 +++++
clang/test/CodeGen/win64-fp128.c | 4 ++--
2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/clang/lib/CodeGen/Targets/X86.cpp b/clang/lib/CodeGen/Targets/X86.cpp
index 7f73bf2a65266e..16656be14d8353 100644
--- a/clang/lib/CodeGen/Targets/X86.cpp
+++ b/clang/lib/CodeGen/Targets/X86.cpp
@@ -3367,6 +3367,11 @@ ABIArgInfo WinX86_64ABIInfo::classify(QualType Ty, unsigned &FreeSSERegs,
return ABIArgInfo::getDirect(llvm::FixedVectorType::get(
llvm::Type::getInt64Ty(getVMContext()), 2));
+ case BuiltinType::Float128:
+ // f128 is too large to fit in integer registers so the Windows ABI
+ // require it be passed on the stack. GCC does the same.
+ return ABIArgInfo::getIndirect(Align, /*ByVal=*/false);
+
default:
break;
}
diff --git a/clang/test/CodeGen/win64-fp128.c b/clang/test/CodeGen/win64-fp128.c
index 33e2441ddf3146..bfb903709397c3 100644
--- a/clang/test/CodeGen/win64-fp128.c
+++ b/clang/test/CodeGen/win64-fp128.c
@@ -3,10 +3,10 @@
// __float128 is unsupported on MSVC
__float128 fp128_ret(void) { return 0; }
-// GNU64: define dso_local fp128 @fp128_ret()
+// GNU64: define dso_local void @fp128_ret(ptr dead_on_unwind noalias writable sret(fp128) align 16 %agg.result)
__float128 fp128_args(__float128 a, __float128 b) { return a * b; }
-// GNU64: define dso_local fp128 @fp128_args(fp128 noundef %a, fp128 noundef %b)
+// GNU64: define dso_local void @fp128_args(ptr dead_on_unwind noalias writable sret(fp128) align 16 %agg.result, ptr noundef %0, ptr noundef %1)
void fp128_vararg(int a, ...) {
// GNU64-LABEL: define dso_local void @fp128_vararg
More information about the cfe-commits
mailing list