[PATCH] D46126: [SLP] Vectorize transposable binary operand bundles

Tue May 1 11:12:41 PDT 2018

mssimpso added inline comments.

================
Comment at: test/Transforms/SLPVectorizer/AArch64/transpose.ll:93
   %tmp2.1 = add i32 %tmp1.0, %tmp1.1
   %tmp2.2 = add i32 %tmp0.2, %tmp0.3
   %tmp2.3 = add i32 %tmp1.2, %tmp1.3
----------------
Ayal wrote:
> Indeed this calls for shuffling with <0,4,2,6> and <1,5,3,7> masks; but is this pattern more natural to expect than
> ```
>   %tmp2.1 = add i32 %tmp0.2, %tmp0.3
>   %tmp2.2 = add i32 %tmp1.0, %tmp1.1
> ```
> 
> Perhaps in some complex numbers context(?)
Ah, I see your point now. I think the current patch will actually produce incorrect code if you make the above change to the test, since we don't actually enforce the "interleaved order". We'll still use the <0,4,2,6> and <1,5,3,7> masks instead of the stride-2 masks. So I think we should probably record what the mask should be when we do the re-bundling to allow for the various possibilities. It makes sense that we would have to do this in hindsight.

We can also probably get rid of the MinSize restriction at the same time.

I'd also want to test the mask against the known shuffle kinds for the cost calculation to ensure we are computing the most appropriate cost for the target. I'm actually surprised we don't already have something like `TTI->getShuffleKind(ArrayRef<int> Mask)`. Perhaps I'll work on that first.

Repository:
  rL LLVM

https://reviews.llvm.org/D46126