[PATCH] D114536: [X86][MS] Fix the wrong alignment of vector variable arguments on Win32

Wed Nov 24 13:50:20 PST 2021

rnk added inline comments.

================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:3464
+    SelectionDAG &DAG, const CCValAssign &VA, MachineFrameInfo &MFI, unsigned i,
+    bool IsVarArgWin32) const {
   // Create the nodes corresponding to a load from this parameter slot.
----------------
I would prefer to see if we can avoid changing the prototype here. The target checks (is win32) can be calculated internally by looking at the subtarget. The only thing that varies per call site is the `IsVarArg` part. If we have to change the prototype, please just pass `IsVarArg`.

================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:4085
+                                            ISD::ArgFlagsTy Flags, bool isByVal,
+                                            bool IsVarArgWin32) const {
   unsigned LocMemOffset = VA.getLocMemOffset();
----------------
Ditto. In this case, we are accumulating consecutive boolean parameters which can reduce readability as well. An alternative solution would be nicer.

================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:4095
+  if (IsVarArgWin32 && Arg.getSimpleValueType().isVector())
+    Alignment = MaybeAlign(4);
   return DAG.getStore(
----------------
Surely there are other kinds of parameters that are clamped to 4 byte alignment on win32. How are doubles handled? Can we handle them the same way?

================
Comment at: llvm/test/CodeGen/X86/vaargs-win32.ll:41
 ; MSVC-NEXT:    pushl %eax
-; MSVC-NEXT:    movaps 8(%esp), %xmm0
+; MSVC-NEXT:    movups 8(%esp), %xmm0
 ; MSVC-NEXT:    movups 24(%esp), %xmm1
----------------
As I understand it, clang will not generate this IR. It will either mark the vector with `inreg`, or it will pass it indirectly (`<4 x float>*`).

================
Comment at: llvm/test/CodeGen/X86/vaargs-win32.ll:97
 ; MINGW-NEXT:    retl
   %1 = tail call <4 x i32> (<4 x float>, ...) @foo(<4 x float> <float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00>, <4 x float> <float 5.000000e+00, float 6.000000e+00, float 7.000000e+00, float 8.000000e+00>)
   ret <4 x i32> %1
----------------
I would like to see a test case where we set up a call that has intentionally misaligned parameters, so a C prototype that looks like `void f(v4f, int, v4f, ...)`. This really underscores the need to use `movups`, because the ABI requires the data to be unaligned.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114536/new/

https://reviews.llvm.org/D114536