[PATCH] [LoopVectorize]Teach Loop Vectorizer about interleaved memory access

Renato Golin renato.golin at linaro.org
Mon May 11 06:33:01 PDT 2015


================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:687
@@ +686,3 @@
+    bool IsLoad;
+    unsigned Align;
+    // To avoid breaking the dependence, the new vectorized interleaved access
----------------
Since this is a struct, there's no way to guarantee that all of these members are initialised before they are used. Please modify the constructors to initialise all members with their default values (0, 1, nullptr, etc.).
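
As a rough sketch of what I mean (not the patch's actual code): default member initialisers give every constructor a known starting point. Only IsLoad and Align come from the hunk above; the struct name and the default values are assumptions made up for illustration.

```cpp
// Hypothetical stand-in for the struct under review. Only IsLoad and Align
// are taken from the patch; the name and the defaults are assumptions.
struct InterleavedAccessInfoSketch {
  bool IsLoad = false; // assumed default
  unsigned Align = 0;  // assumed default, meaning "no alignment recorded yet"

  // Every member already has a value, so the default constructor is safe.
  InterleavedAccessInfoSketch() = default;

  // Explicit values override the defaults.
  InterleavedAccessInfoSketch(bool IsLoad, unsigned Align)
      : IsLoad(IsLoad), Align(Align) {}
};
```

With this shape, a member added later without touching the constructors still starts from a defined value.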

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:1749
@@ +1748,3 @@
+// Which is equal to
+//     <0, VF, VF*2, ..., VF*NumVec-VF, 1, VF+1, VF*2+1, ...>
+// E.g. For 2 interleaved vectors, if VF is 4, the mask is:
----------------
Write it as VF*(NumVec-1) to make it clearer:

    //     <0, VF, VF*2, ..., VF*(NumVec-1), 1, VF+1, VF*2+1, ...>
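
To make the indexing concrete, here is a small standalone sketch that produces the same sequence as plain integers. It is not the function from the patch; interleaveMaskSketch is a hypothetical name, and it returns indices rather than an LLVM constant.

```cpp
#include <vector>

// Builds <0, VF, VF*2, ..., VF*(NumVec-1), 1, VF+1, VF*2+1, ...>.
// Hypothetical helper for illustration; it works on plain integers.
std::vector<unsigned> interleaveMaskSketch(unsigned VF, unsigned NumVec) {
  std::vector<unsigned> Mask;
  Mask.reserve(VF * NumVec);
  for (unsigned I = 0; I < VF; ++I)       // lane within each input vector
    for (unsigned J = 0; J < NumVec; ++J) // which input vector
      Mask.push_back(J * VF + I);
  return Mask;
}
```

For VF = 4 and NumVec = 2 this yields <0, 4, 1, 5, 2, 6, 3, 7>, and the last index of the first group is VF*(NumVec-1) = 4.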

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:1798
@@ +1797,3 @@
+// If the 1st vector has more elements, extend the 2nd vector with UNDEFs.
+static Value *ConcatenateTwoVectors(IRBuilder<> &Builder, Value *V1,
+                                    Value *V2) {
----------------
I think this function could be a lot simpler if you just extended Vec2 first, if smaller, then concatenated both at the end, keeping all the asserts to make sure it's safe.
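
Roughly the shape I have in mind, modelled on plain std::vector rather than the IRBuilder API so it stays self-contained. The name concatenateTwoSketch and the use of -1 for undef lanes are assumptions for illustration, and I am assuming the final shuffle uses a sequential mask over both operands.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Undef lanes are modelled as -1 in this sketch (an assumption, not LLVM API).
const int UndefLane = -1;

// Hypothetical model of the simplified function: widen V2 first, then
// concatenate once at the end.
std::vector<int> concatenateTwoSketch(const std::vector<int> &V1,
                                      std::vector<int> V2) {
  const std::size_t NumElts1 = V1.size();
  const std::size_t NumElts2 = V2.size();
  assert(NumElts1 >= NumElts2 && "Unexpected: first vector is smaller");

  // Step 1: pad V2 with undef lanes so both operands have the same size,
  // which is what a two-operand shuffle needs.
  V2.resize(NumElts1, UndefLane);

  // Step 2: a sequential mask <0, 1, ..., NumElts1+NumElts2-1> over (V1, V2)
  // selects all of V1 followed by the original lanes of V2 only.
  std::vector<int> Result;
  Result.reserve(NumElts1 + NumElts2);
  for (std::size_t I = 0; I < NumElts1 + NumElts2; ++I)
    Result.push_back(I < NumElts1 ? V1[I] : V2[I - NumElts1]);
  return Result;
}
```

The padding lanes never reach the result, because the mask stops before them.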

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:1825
@@ +1824,3 @@
+// Concatenate number of vectors in the give list starting from Idx
+static Value *ConcatenateVectors(IRBuilder<> &Builder,
+                                 SmallVector<Value *, 4> &List,
----------------
While clever, this method is quite heavy: the recursion needs a function call even to return a single element of the list, so V0 and V1 below will always be built from a call that just returns one of the elements. Since this will normally be a list of around 2/4 elements, the recursion will always be *too* heavy.

I imagine you did this because your ConcatenateTwoVectors adds undef to the tail of the smaller vector, since shufflevector needs both operands to be of the same size, but transforming this into a loop wouldn't be too hard.
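
One possible loop, sketched on plain std::vector so it stays self-contained. Both function names are hypothetical, and concatPairSketch is only a stand-in for ConcatenateTwoVectors (it appends, with no undef padding).

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Vec = std::vector<int>;

// Hypothetical stand-in for ConcatenateTwoVectors: a plain append.
Vec concatPairSketch(Vec A, const Vec &B) {
  A.insert(A.end(), B.begin(), B.end());
  return A;
}

// Iterative pairwise concatenation: each pass joins adjacent pairs and
// carries an odd trailing vector over unchanged, until one vector is left.
Vec concatenateAllSketch(std::vector<Vec> List) {
  if (List.empty())
    return Vec();
  while (List.size() > 1) {
    std::vector<Vec> Next;
    for (std::size_t I = 0; I + 1 < List.size(); I += 2)
      Next.push_back(concatPairSketch(std::move(List[I]), List[I + 1]));
    if (List.size() % 2 != 0)
      Next.push_back(std::move(List.back()));
    List = std::move(Next);
  }
  return List.front();
}
```

When all inputs start at the same width, the carried-over vector is never wider than its left neighbour, so the "first operand is at least as wide" assumption of the two-vector helper still holds.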



================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:1833
@@ +1832,3 @@
+  unsigned PartNum = PowerOf2Floor(NumVec);
+  // When the number of vectors is not power of 2, split the list and make sure
+  // the first list has power of 2 elements.
----------------
Move the comment before the declaration of PartNum.

http://reviews.llvm.org/D9368
