[PATCH] D27392: Vectorcall Calling Convention - Adding CodeGen Complete Support
Reid Kleckner via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 13 09:47:13 PST 2016
rnk added inline comments.
================
Comment at: include/llvm/CodeGen/CallingConvLower.h:326
+ /// A shadow allocated register is a register that was allcoated
+ /// but wasn't added to the location list (Locs).
----------------
typo on "allocated"
================
Comment at: lib/CodeGen/CallingConvLower.cpp:74
+
+ for (unsigned i = 0, e = Locs.size(); i != e; ++i) {
+ for (MCRegAliasIterator AI(Locs[i].getLocReg(), &TRI, true); AI.isValid();
----------------
majnemer wrote:
> Range-based for loop?
This creates a copy of a CCValAssign, which is unnecessary.
================
Comment at: lib/Target/X86/X86CallingConv.cpp:130
+ // "A vector type is either a floating-point type庸or example,
+ // a float or double熔r an SIMD vector type庸or example, __m128 or __m256."
+ if (!ValVT.isFloatingPoint() &&
----------------
rnk wrote:
> mojibake
I still see "type庸or example" with an Asian character in there.
================
Comment at: lib/Target/X86/X86CallingConv.td:631
-
- // 256-bit vectors use YMM registers.
- CCIfType<[v32i8, v16i16, v8i32, v4i64, v8f32, v4f64],
----------------
oren_ben_simhon wrote:
> rnk wrote:
> > Ouch. I locally confirmed this is correct, but why design a new calling convention that doesn't handle the latest vector types... =P
> If you refer to 512 bit vector types, Microsoft probably forgot to mention that it should be assigned to ZMM [[ https://msdn.microsoft.com/en-us/library/dn375768.aspx | here ]].
Actually, I think I just misunderstood. It looks like you allocate things to AVX registers elsewhere. I thought with this change you made it so that AVX registers would never be used with vectorcall.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:2799
+ SmallVector<ISD::InputArg, 16> InsSec;
+ for (auto InArg : Ins) {
+ ISD::InputArg InArgSec(InArg);
----------------
Unnecessary copy
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:2810
+ // input arguments.
+ std::stable_sort(ArgLocs.begin(), ArgLocs.end(),
+ [](const CCValAssign &A, const CCValAssign &B) -> bool {
----------------
The stable_sort is only needed if the CC was vectorcall.
I think the code would actually be clearer if we used std::merge. After the first pass, ArgLocs should be all the non-HVA argument locations sorted by argument number. The second pass appends additional sorted HVA argument locations. Then you merge those two sorted lists. You'll need a temporary ArgLocs vector to make this work.
Please factor out this two pass vectorcall code into a template function that operates on a SmallVectorImpl<T>, where T can be InputArg or OutputArg.
================
Comment at: test/CodeGen/X86/vectorcall.ll:71
-; tablegen any other way.
-define x86_vectorcallcc {double, double, double, double, double} @test_fp_4() {
- ret {double, double, double, double, double}
----------------
oren_ben_simhon wrote:
> aaboud wrote:
> > Do you really want to remove this test? Cannot you just fix it?
> AFAIK, This test checks something that cannot happen.
> There is no case in which a function will return {double, double, double, double, double}.
> If it wanted to return such a structure it will be returned by pointer as the first argument to the function:
>
> > define x86_vectorcallcc void @test_fp_4(%struct.five_doubles* sret %a)
>
>
This was really just a whitebox test to show that LLVM does not crash when you return 5 doubles. We should keep the test if we don't fatal error. Any behavior is fine. For example, it might trigger LLVM's logic to demote return by value to sret.
Repository:
rL LLVM
https://reviews.llvm.org/D27392
More information about the llvm-commits
mailing list