[llvm] [X86] Align f128 and i128 to 16 bytes when passing on x86-32 (PR #138092)

Fri Jul 11 05:29:52 PDT 2025

================
@@ -237,9 +237,18 @@ EVT X86TargetLowering::getSetCCResultType(const DataLayout &DL,
 bool X86TargetLowering::functionArgumentNeedsConsecutiveRegisters(
     Type *Ty, CallingConv::ID CallConv, bool isVarArg,
     const DataLayout &DL) const {
-  // i128 split into i64 needs to be allocated to two consecutive registers,
-  // or spilled to the stack as a whole.
-  return Ty->isIntegerTy(128);
+  // On x86-64 i128 is split into two i64s and needs to be allocated to two
+  // consecutive registers, or spilled to the stack as a whole. On x86-32 i128
+  // is split to four i32s and never actually passed in registers, but we use
+  // the consecutive register mark to match it in TableGen.
+  if (Ty->isIntegerTy(128))
+    return true;
+
+  // On x86-32, fp128 acts the same as i128.
+  if (Subtarget.is32Bit() && Ty->isFP128Ty())
+    return true;
+
+  return false;
----------------
tgross35 wrote:

This should probably also match vector types somehow, because `__m64`, `__m128`, `__m256`, and `__m512` are specified to have alignments of 8, 16, 32, and 64 bytes, respectively.

https://github.com/llvm/llvm-project/pull/138092