[PATCH] D134797: [X86][vectorcall] Make floating-type passed by value to match with MSVC
Phoebe Wang via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Wed Oct 5 07:32:03 PDT 2022
pengfei added inline comments.
================
Comment at: clang/lib/CodeGen/TargetInfo.cpp:1858-1859
}
- return getIndirectResult(Ty, /*ByVal=*/false, State);
+ bool ByVal = IsVectorCall && Ty->isFloatingType();
+ return getIndirectResult(Ty, ByVal, State);
}
----------------
rnk wrote:
> pengfei wrote:
> > rnk wrote:
> > > I would try to refactor this so that the vectorcall HFA that can't be passed in SSE regs falls through to the following logic. I suspect that it correctly handles each case that we care about:
> > > - double: direct
> > > - vector: indirect for alignment
> > > - aggregate: indirect for alignment, any HFA will presumably be aligned to more than 32bits
> > >
> > Not sure if I understand it correctly, the HFA is not a floating type, so it won't be affected. Add a test case for it.
> > MSVC passes it indirectly too. https://gcc.godbolt.org/z/3qf4dTYfv
> Thanks, I see what you mean. I thought the code for handling overaligned aggregates would trigger, passing any HFA indirectly, but it does not for plain FP HFAs. You can observe the difference by replacing `double` in HFA2 with `__int64`, and see that HFA2 is passed underaligned on the stack:
> https://gcc.godbolt.org/z/jqx4xcnjq
>
> I still think this code would benefit from separating the regcall and vectorcall cases, something like:
> bool IsInReg = State.FreeSSERegs >= NumElts;
> if (IsInReg)
> State.FreeSSERegs -= NumElts;
> if (IsRegCall) {
> // handle regcall
> if (IsInReg)
> ...
> } else {
> // handle vectorcall
> if (IsInReg)
> ...
> }
>
> They seem to have pretty different rules both when SSE regs are available and when not.
I can see the reason is non plain FP HFAs are not `HomogeneousAggregate` (possible be passed by SSE) on X86.
Anyway, I got your point now. Refactor the code seems better, thanks!
================
Comment at: clang/test/CodeGen/vectorcall.c:157
+// X86-SAME: <4 x float> inreg noundef %xmm5,
+// X86-SAME: double* noundef byval(double) align 4 %0)
#endif
----------------
rnk wrote:
> pengfei wrote:
> > rnk wrote:
> > > Why not pass the double directly? That should be ABI compatible:
> > > https://gcc.godbolt.org/z/W4rjn63b5
> > Sorry, I'm not sure what's your mean here. Do you mean I should use your example as the test case? Here the case mocked `vectorcall_indirect_vec`, which I think is intended to check if the type, `inreg` and `byval` etc. are generated correctly.
> I mean that these two LLVM prototypes are ABI compatible at the binary level for i686, but the second is much easier to optimize:
> double @byval(double* byval(double) %p) {
> %v = load double, double* %p
> ret double %v
> }
> double @direct(double %v) {
> ret double %v
> }
> https://gcc.godbolt.org/z/MjEvdEKbT
>
> Clang should generate the prototype.
I see your point now, sounds good to me, thanks!
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D134797/new/
https://reviews.llvm.org/D134797
More information about the cfe-commits
mailing list