[PATCH] D27392: Vectorcall Calling Convention - Adding CodeGen Complete Support

Reid Kleckner via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Dec 13 09:47:13 PST 2016

rnk added inline comments.

Comment at: include/llvm/CodeGen/CallingConvLower.h:326
+  /// A shadow allocated register is a register that was allcoated
+  /// but wasn't added to the location list (Locs).
typo on "allocated"

Comment at: lib/CodeGen/CallingConvLower.cpp:74
+  for (unsigned i = 0, e = Locs.size(); i != e; ++i) {
+    for (MCRegAliasIterator AI(Locs[i].getLocReg(), &TRI, true); AI.isValid();
majnemer wrote:
> Range-based for loop?
This creates a copy of a CCValAssign, which is unnecessary.

Comment at: lib/Target/X86/X86CallingConv.cpp:130
+  // "A vector type is either a floating-point type庸or example,
+  //  a float or double熔r an SIMD vector type庸or example, __m128 or __m256."
+  if (!ValVT.isFloatingPoint() &&
rnk wrote:
> mojibake
I still see "type庸or example" with an Asian character in there.

Comment at: lib/Target/X86/X86CallingConv.td:631
-  // 256-bit vectors use YMM registers.
-  CCIfType<[v32i8, v16i16, v8i32, v4i64, v8f32, v4f64],
oren_ben_simhon wrote:
> rnk wrote:
> > Ouch. I locally confirmed this is correct, but why design a new calling convention that doesn't handle the latest vector types... =P
> If you refer to 512 bit vector types, Microsoft probably forgot to mention that it should be assigned to ZMM [[ https://msdn.microsoft.com/en-us/library/dn375768.aspx | here ]].
Actually, I think I just misunderstood. It looks like you allocate things to AVX registers elsewhere. I thought with this change you made it so that AVX registers would never be used with vectorcall.

Comment at: lib/Target/X86/X86ISelLowering.cpp:2799
+    SmallVector<ISD::InputArg, 16> InsSec;
+    for (auto InArg : Ins) {
+      ISD::InputArg InArgSec(InArg);
Unnecessary copy

Comment at: lib/Target/X86/X86ISelLowering.cpp:2810
+  // input arguments.
+  std::stable_sort(ArgLocs.begin(), ArgLocs.end(),
+                   [](const CCValAssign &A, const CCValAssign &B) -> bool {
The stable_sort is only needed if the CC was vectorcall.

I think the code would actually be clearer if we used std::merge. After the first pass, ArgLocs should be all the non-HVA argument locations sorted by argument number. The second pass appends additional sorted HVA argument locations. Then you merge those two sorted lists. You'll need a temporary ArgLocs vector to make this work.

Please factor out this two pass vectorcall code into a template function that operates on a SmallVectorImpl<T>, where T can be InputArg or OutputArg.

Comment at: test/CodeGen/X86/vectorcall.ll:71
-; tablegen any other way.
-define x86_vectorcallcc {double, double, double, double, double} @test_fp_4() {
-  ret {double, double, double, double, double}
oren_ben_simhon wrote:
> aaboud wrote:
> > Do you really want to remove this test? Cannot you just fix it?
> AFAIK, This test checks something that cannot happen.
> There is no case in which a function will return {double, double, double, double, double}.
> If it wanted to return such a structure it will be returned by pointer as the first argument to the function:
> > define x86_vectorcallcc void @test_fp_4(%struct.five_doubles* sret %a)
This was really just a whitebox test to show that LLVM does not crash when you return 5 doubles. We should keep the test if we don't fatal error. Any behavior is fine. For example, it might trigger LLVM's logic to demote return by value to sret.



More information about the llvm-commits mailing list