[PATCH] D28118: AVX-512 cost calculation for interleave load/store patterns

Tue Dec 27 11:17:04 PST 2016

mkuper added inline comments.

================
Comment at: ../include/llvm/Analysis/TargetTransformInfo.h:467
+    SK_ExtractSubvector,///< ExtractSubvector Index indicates start offset.
+    SK_MergeTwoSrc,     ///< Merge two vectors into one.
+    SK_PermuteOneSrc    ///< Shuffle elements of one source vector with any
----------------
Can you make this clearer? It's not obvious what "merge" means. Does the order matter?

================
Comment at: ../include/llvm/Analysis/TargetTransformInfo.h:469
+    SK_PermuteOneSrc    ///< Shuffle elements of one source vector with any
+                        ///<  shuffle mask.
   };
----------------
Is the extra space intentional?
Also, maybe "one" -> " a single"?

================
Comment at: ../include/llvm/CodeGen/BasicTTIImpl.h:65
+  /// insert operations.
+  unsigned getAllPermutationsShuffleOverhead(Type *Ty) {
     assert(Ty->isVectorTy() && "Can only shuffle vectors");
----------------
Why "All Permutations"?
getPermuteShuffleOverhead(), maybe? Also, I'm not sure what this has to do with permutations, especially given the example below. (TBH, It didn't have anything to do with SK_Alternate, it was just used that way. That wasn't really good either).

================
Comment at: ../include/llvm/CodeGen/BasicTTIImpl.h:359
     }
     return 1;
   }
----------------
This code makes very little sense to me. Not your change, but the original code.
Why does this special-case "alt shuffle", of all things? And why should this special-case only these specific shuffles after the change?

I think this is backwards - there may be specific shuffle types that are cheap by default - e.g. broadcast makes sense. Reverse? Not so much.

================
Comment at: ../lib/Target/X86/X86TargetTransformInfo.cpp:804
+                                              ISD::VECTOR_SHUFFLE, LT.second))
+        return LT.first * Entry->Cost;
+
----------------
Are you sure this is correct? I mean, this is fine for SK_Reverse, but I don't think it works for general shuffles.

I mean, let's say you have a shuffle of two v256i8.
Legalization will give you two sets of 4 * v64i8, but you don't end up with 4 two-input shuffles, since each of the 4 output vectors may depend on any subset of the 8 input vectors, so you may need a lot more shuffles.

Am I missing something?

================
Comment at: ../lib/Target/X86/X86TargetTransformInfo.cpp:851
+                                              ISD::VECTOR_SHUFFLE, LT.second))
+        return LT.first * Entry->Cost;
+
----------------
Same as above, I don't think this works for PermuteOneSrc either.

================
Comment at: ../lib/Target/X86/X86TargetTransformInfo.cpp:886
+        return LT.first * Entry->Cost;
   }

----------------
Are you planning on adding SSE4, AVX and AVX2 costs as well?
This isn't a blocker for this patch, and should be a separate patch in any case, I'm just curious.

================
Comment at: ../lib/Target/X86/X86TargetTransformInfo.cpp:2131
+                                        LegalVT.getVectorNumElements());
+  unsigned MemOpCost = getMemoryOpCost(Opcode, SingleMemOpTy, Alignment, AddressSpace);
+
----------------
This line is > 80 chars.

Repository:
  rL LLVM

https://reviews.llvm.org/D28118