[PATCH] Disable load/store vectorization for types with padding bytes

Daisuke Takahashi dtakahashi42 at gmail.com
Mon Apr 22 20:39:43 PDT 2013


Hello,

Here is a patch that disables vectorization of memory instructions for types that need padding bytes. For example, on x86_64 Darwin, x86_fp80 has a store size of 10 bytes plus 6 bytes of padding. Load/store vectorization is performed by bitcasting to a packed vector type, whose memory layout is incompatible because it has no padding bytes between elements, so the current vectorizer produces inconsistent results for memory instructions on such types. The issue has already been reported with a test case [1].

This patch checks that the AllocSize of the scalar type equals the size allocated for each vector element, to ensure that there are no padding bytes and that the array can be read or written using vector operations.
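
As a rough illustration of the size check described above, here is a standalone sketch. It is not the patch code: TypeSizes, packedVectorStoreSize and canVectorizeMemoryAccess are hypothetical names, and the sizes are hard-coded assumptions rather than values queried from DataLayout.

```cpp
#include <cstdint>

// Hypothetical stand-in for the per-type sizes that DataLayout would report.
struct TypeSizes {
  uint64_t SizeInBits; // bits that actually carry data
  uint64_t AllocSize;  // bytes between consecutive array elements
};

// Bytes written by a store of VF elements packed back to back with no
// padding, rounded up to a whole byte (assumed model of a packed vector).
uint64_t packedVectorStoreSize(const TypeSizes &T, unsigned VF) {
  return (T.SizeInBits * VF + 7) / 8;
}

// Mirrors the comparison in the patch: vectorize only when one packed vector
// element occupies exactly as many bytes as one array element.
bool canVectorizeMemoryAccess(const TypeSizes &T, unsigned VF) {
  uint64_t VectorElementSize = packedVectorStoreSize(T, VF) / VF;
  return T.AllocSize == VectorElementSize;
}
```

With VF = 4, a 32-bit float (AllocSize 4) passes the check, while x86_fp80 (80 data bits, AllocSize 16 as in the Darwin example) fails it because each packed element covers only 10 bytes, so the access would be scalarized.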

Following the comment from Nadav Rotem, I changed both the vectorizer (InnerLoopVectorizer::vectorizeMemoryInstruction) and the cost model (LoopVectorizationCostModel::getInstructionCost).

Thank you very much.

[1] http://llvm.org/bugs/show_bug.cgi?id=15758

Daisuke

--- lib/Transforms/Vectorize/LoopVectorize.cpp	(revision 180000)
+++ lib/Transforms/Vectorize/LoopVectorize.cpp	(working copy)
@@ -951,6 +951,12 @@
   Value *Ptr = LI ? LI->getPointerOperand() : SI->getPointerOperand();
   unsigned Alignment = LI ? LI->getAlignment() : SI->getAlignment();
 
+  unsigned ScalarAllocatedSize = DL->getTypeAllocSize(ScalarDataTy);
+  unsigned VectorElementSize = DL->getTypeStoreSize(DataTy)/VF;
+
+  if (ScalarAllocatedSize != VectorElementSize)
+    return scalarizeInstruction(Instr);
+
   // If the pointer is loop invariant or if it is non consecutive,
   // scalarize the load.
   int Stride = Legal->isConsecutivePtr(Ptr);
@@ -3551,7 +3557,9 @@
     // Scalarized loads/stores.
     int Stride = Legal->isConsecutivePtr(Ptr);
     bool Reverse = Stride < 0;
-    if (0 == Stride) {
+    unsigned ScalarAllocatedSize = DL->getTypeAllocSize(ValTy);
+    unsigned VectorElementSize = DL->getTypeStoreSize(VectorTy)/VF;
+    if (0 == Stride || ScalarAllocatedSize != VectorElementSize) {
       unsigned Cost = 0;
       // The cost of extracting from the value vector and pointer vector.
       Type *PtrTy = ToVectorTy(Ptr->getType(), VF);





