[PATCH] D22792: VecClone Pass

Tue Oct 31 12:26:21 PDT 2017

mmasten added inline comments.

================
Comment at: lib/Target/X86/X86TargetTransformInfo.cpp:2289
+char X86TTIImpl::encodeISAClass(TTI::ISAClass IsaClass) const {
+  switch (IsaClass) {
+  case TTI::XMM:
----------------
fpetrogalli wrote:
> Section 2.6 "Vector Function Name Mangling" of [1]  maps the IsaClass to other values, "x', 'y'.  Are you basing this on [2] for gcc compatibility?
> Why do [1] and [2] differ?
> 
> Francesco
> 
> [1] https://software.intel.com/sites/default/files/managed/b4/c8/Intel-Vector-Function-ABI.pdf
> [2] https://sourceware.org/glibc/wiki/libmvec?action=AttachFile&do=view&target=VectorABI.txt
Currently, the implementation is based on gcc compatibility. However, it would be nice to extend to support both gcc and icc. The IsaClass values are different because of the calling convention differences. Intel icc uses regcall calling convention and gcc uses standard calling convention. I've added references to both ABI documents.

================
Comment at: lib/Transforms/Utils/VecClone.cpp:38
+/// float dowork(float *a, int k) {
+///   a[k] = sinf(a[k]) + 9.8f;
+/// }
----------------
hfinkel wrote:
> This function doesn't return anything, and I think that makes the example hard to understand.
I agree, this was a bad bug with the example. This has been fixed and updated with an example that demonstrates the behavior of all three types of parameters (uniform, linear, and vector). Please let me know if the bitcasts shown in the example are confusing. The purpose of the bitcasts is so that the loop can appear in scalar form to the loop vectorizer. i.e., we can use a .scalar gep and index to reference vectors in the loop.

================
Comment at: lib/Transforms/Utils/VecClone.cpp:41
+///
+/// attributes #0 = { nounwind uwtable "_ZGVbM4ul_", "ZGVbN4ul_", ... }
+///
----------------
fpetrogalli wrote:
> Hello Matt, thank you for working on this.
> 
> <nitpicking>Should the comment be updated with the `"vector-variants"="_ZGVbM4ul_, ZGVbN4ul_"` syntax? </nitpicking>
> 
> Cheers,
> 
> Francesco
Thanks Francesco. Fixed.

================
Comment at: lib/Transforms/Utils/VecClone.cpp:99
+//
+// When Mem2Reg has not been run (i.e., parameters have not been registerized),
+// we end up with an alloca, store to alloca, and load from alloca sequence.
----------------
hfinkel wrote:
> I don't understand what's going on here. Is it not possible to write the transformation such that it's semantically correct regardless of whether we've run mem2reg? This is not always an either/or situation.
Perhaps the comments were written very clearly. The pass has been written to handle parameters regardless of whether or not mem2reg has run. Hopefully, the updated comments are better.

https://reviews.llvm.org/D22792