[PATCH] [SLPVectorizer] Reorder operands of shufflevector if it can result in a vectorized code.

Michael Zolotukhin mzolotukhin at apple.com
Thu Jan 8 22:46:05 PST 2015


Hi Karthik,

Thanks for the answer, I agree with you.

Please also see a comment from me inline.

Thanks for working on this!


REPOSITORY
  rL LLVM

================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:442-444
@@ -441,2 +441,5 @@
 
+  void reorderAltShuffleOperands(ArrayRef<Value *> VL,
+                                 SmallVectorImpl<Value *> &Left,
+                                 SmallVectorImpl<Value *> &Right);
   /// \brief Perform LICM and CSE on the newly generated gather sequences.
----------------
I don't think `reorderInputsAccordingToOpcode` currently handles it. That is, it may happen to handle it in some cases, but it doesn't do so reliably. For example, the following code doesn't get vectorized:

```
define void @foo() #0 {
  %1 = load i32* getelementptr inbounds ([1000 x i32]* @a, i32 0, i64 0), align 4
  %2 = load i32* getelementptr inbounds ([1000 x i32]* @b, i32 0, i64 0), align 4
  %3 = add nsw i32 %1, %2
  store i32 %3, i32* getelementptr inbounds ([1000 x i32]* @c, i32 0, i64 0), align 4
  %4 = load i32* getelementptr inbounds ([1000 x i32]* @a, i32 0, i64 1), align 4
  %5 = load i32* getelementptr inbounds ([1000 x i32]* @b, i32 0, i64 1), align 4  

  ; Please note that %4 and %5 are swapped in the following line:
  %6 = add nsw i32 %5, %4

  store i32 %6, i32* getelementptr inbounds ([1000 x i32]* @c, i32 0, i64 1), align 4
  %7 = load i32* getelementptr inbounds ([1000 x i32]* @a, i32 0, i64 2), align 4
  %8 = load i32* getelementptr inbounds ([1000 x i32]* @b, i32 0, i64 2), align 4
  %9 = add nsw i32 %7, %8
  store i32 %9, i32* getelementptr inbounds ([1000 x i32]* @c, i32 0, i64 2), align 4
  %10 = load i32* getelementptr inbounds ([1000 x i32]* @a, i32 0, i64 3), align 4
  %11 = load i32* getelementptr inbounds ([1000 x i32]* @b, i32 0, i64 3), align 4
  %12 = add nsw i32 %10, %11
  store i32 %12, i32* getelementptr inbounds ([1000 x i32]* @c, i32 0, i64 3), align 4
  ret void
}
```
It might make sense to handle such cases explicitly, like you do for altShuffles.
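
To illustrate what I mean by handling it explicitly, here is a minimal standalone sketch. It is not SLPVectorizer code and does not use any LLVM APIs: `Load`, `BinOp`, `isConsecutive` and `reorderCommutativeOperands` are hypothetical names made up for this example, and it assumes every operand is a load from a named array at a constant index. For each lane of a commutative operation it swaps the two operands when doing so makes the left operand continue the previous lane's left operand.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a load from a global array: array name plus index.
struct Load {
  std::string Base;
  int Index;
};

// Hypothetical stand-in for one scalar commutative operation (e.g. an add).
struct BinOp {
  Load LHS;
  Load RHS;
};

// True if B reads the element immediately after A in the same array.
static bool isConsecutive(const Load &A, const Load &B) {
  return A.Base == B.Base && B.Index == A.Index + 1;
}

// Split the operands of each lane into Left and Right. A lane's operands are
// swapped only when its LHS does not follow the previous left operand but its
// RHS does, so both operand lists become runs of consecutive loads.
static void reorderCommutativeOperands(const std::vector<BinOp> &Ops,
                                       std::vector<Load> &Left,
                                       std::vector<Load> &Right) {
  assert(&Left != &Right && "output vectors must be distinct");
  Left.clear();
  Right.clear();
  for (const BinOp &Op : Ops) {
    Load L = Op.LHS;
    Load R = Op.RHS;
    if (!Left.empty() && !isConsecutive(Left.back(), L) &&
        isConsecutive(Left.back(), R))
      std::swap(L, R);
    Left.push_back(L);
    Right.push_back(R);
  }
}
```

Applied to the four adds above (with the second lane written as `b[1] + a[1]`), this sketch would place the loads from `@a` on the left and the loads from `@b` on the right, which is the shape the vectorizer needs to form two vector loads.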

http://reviews.llvm.org/D6677
