[clang] [Clang] Remove 3-element vector load and store special handling (PR #104661)

Tue Sep 3 11:03:41 PDT 2024

================
@@ -45,7 +45,7 @@ void test3(packedfloat3 *p) {
   *p = (packedfloat3) { 3.2f, 2.3f, 0.1f };
 }
 // CHECK: @test3(
-// CHECK: store <4 x float> {{.*}}, align 4
+// CHECK: store <3 x float> {{.*}}, align 4
----------------
efriedma-quic wrote:

On targets with SIMD vectors (basically any modern CPU target), a 12-byte store is going to be slower than a 16-byte store, so we don't want to generate one. That's why clang has an ABI rule that says we can generate a 16-byte store instead.

If you have some target that actually wants 12-byte stores for some reason, that needs to be a target-specific ABI rule in clang: on that target, we directly generate 12-byte stores instead of 16-byte stores.
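
As a rough illustration of why the wider store is safe, here is a minimal portable C sketch. It is not clang's codegen and does not use clang's vector extensions. The type `padded_float3` and the helper `store_padded_float3` are hypothetical names invented for this example. The sketch assumes what the ABI rule relies on: the 3-element vector occupies 16 bytes of storage, so writing all 16 bytes only touches padding that belongs to the same object.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdalign.h> /* alignas */
#include <string.h>   /* memcpy */

/* Hypothetical stand-in for a 3-element float vector: 12 bytes of
   payload held in 16 bytes of storage. lane[3] is padding. */
typedef struct {
    alignas(16) float lane[4];
} padded_float3;

static_assert(sizeof(padded_float3) == 16, "storage is padded to 16 bytes");

/* Store x, y, z with a single 16-byte copy. Overwriting the padding
   lane is harmless because it is part of the same object. */
static void store_padded_float3(padded_float3 *dst, float x, float y, float z) {
    const float wide[4] = { x, y, z, 0.0f };
    memcpy(dst->lane, wide, sizeof wide);
}
```

A compiler is free to lower the `memcpy` above to one 16-byte vector store on a SIMD target. A target that prefers 12-byte stores would instead copy only `3 * sizeof(float)` bytes, which is the target-specific behavior described in the comment.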

https://github.com/llvm/llvm-project/pull/104661