[clang] [Clang] Remove 3-element vector load and store special handling (PR #104661)
Eli Friedman via cfe-commits
cfe-commits at lists.llvm.org
Tue Sep 3 11:03:41 PDT 2024
================
@@ -45,7 +45,7 @@ void test3(packedfloat3 *p) {
*p = (packedfloat3) { 3.2f, 2.3f, 0.1f };
}
// CHECK: @test3(
-// CHECK: store <4 x float> {{.*}}, align 4
+// CHECK: store <3 x float> {{.*}}, align 4
----------------
efriedma-quic wrote:
On targets with SIMD vectors (basically any modern CPU target), a 12-byte store is going to be slower than a 16-byte store, so we don't want to generate a 12-byte store. Clang therefore has an ABI rule that says we can generate a 16-byte store instead.
If you have some target that actually wants 12-byte stores for some reason, that needs to be a target-specific ABI rule in clang: on that target, clang directly generates 12-byte stores instead of 16-byte stores.
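
To make the size argument concrete, here is a minimal sketch in portable C. It is not clang's implementation: the type padded_float3 and the helpers store_12 and store_16 are hypothetical names invented for illustration, and the sketch assumes a 4-byte float. It models a 3-element vector whose storage is padded to four lanes, so writing all 16 bytes stays in bounds and only touches padding.

```c
#include <string.h>

/* Hypothetical stand-in for a 3-element float vector whose storage is
   padded to four lanes (16 bytes with a 4-byte float). This is an
   illustration only, not how clang represents vector types. */
typedef struct {
    float lane[4];
} padded_float3;

/* Narrow store: writes only the 12 bytes holding x, y and z.
   The padding lane keeps whatever value it had before. */
static void store_12(padded_float3 *dst, const float src[3]) {
    memcpy(dst->lane, src, 3 * sizeof(float));
}

/* Wide store: writes all 16 bytes in one copy. The fourth lane is
   padding, so its value is a don't-care; 0.0f is used here. */
static void store_16(padded_float3 *dst, const float src[3]) {
    float widened[4] = { src[0], src[1], src[2], 0.0f };
    memcpy(dst->lane, widened, sizeof widened);
}
```

Both helpers leave the same values in the first three lanes; they differ only in whether the padding lane is written. The wide store is valid only when those extra 4 bytes really are padding owned by the object, which is why a layout without that padding would need the narrow store.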
https://github.com/llvm/llvm-project/pull/104661