[PATCH] D48725: [SLP] Vectorize bit-parallel operations with SWAR.

Mon Jul 2 08:41:56 PDT 2018

courbet added a comment.

In https://reviews.llvm.org/D48725#1149214, @courbet wrote:

> Thank you all for your comments.
>
> So let me sum up the options from the various comments here:
>  A - Keep this change in the SLP vectorizer. This requires emitting GPR operations instead of vector operations, and updating the cost model.

I've updated the change with a crude implementation.

Shuffles and extracts are disabled because we can no longer rely on the
DAG to transform shuffles and extracts into the appropriate operations.
If we want to support them, we will have to reimplement them as integer
operations.

Now that I've done this, I think I understand what @efriedma was saying: another way to look at it is that (taking X86 as an example) 128 bits is not the smallest vector width, because we can do partial loads/stores.
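
To illustrate what "emitting GPR operations instead of vector operations" means at the source level, here is a minimal sketch. It is not taken from the patch: the function names `xor_scalar` and `xor_swar` are made up for the example, and the choice of XOR on two adjacent 32-bit lanes is an assumption. It only shows that a bitwise operation on two adjacent 32-bit lanes gives the same result when done as a single 64-bit integer operation.

```cpp
#include <cstdint>
#include <cstring>

// Scalar form: two independent 32-bit XORs on adjacent elements.
// (Hypothetical example, not code from the patch.)
void xor_scalar(uint32_t *dst, const uint32_t *src) {
  dst[0] ^= src[0];
  dst[1] ^= src[1];
}

// SWAR form: the same two lanes handled by one 64-bit load, one 64-bit
// XOR in a general-purpose register, and one 64-bit store.
// memcpy is used for the wide accesses to avoid aliasing/alignment issues.
void xor_swar(uint32_t *dst, const uint32_t *src) {
  uint64_t a, b;
  std::memcpy(&a, dst, sizeof a);  // wide load of dst[0..1]
  std::memcpy(&b, src, sizeof b);  // wide load of src[0..1]
  a ^= b;                          // bitwise ops never cross lane boundaries
  std::memcpy(dst, &a, sizeof a);  // wide store back to dst[0..1]
}
```

This only works directly for operations with no cross-lane carries (and/or/xor/not). That matches the point above about shuffles and extracts: they would need an explicit reimplementation as integer shifts and masks.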

> B - Do this in DAG, inside or near to LoadCombine.

Actually, I don't think the current case can be handled in the same way as MatchLoadCombine: in the case of MatchLoadCombine, the "or" instruction provides a way to link the loads together. In the case of two completely independent load/stores //without anything in the middle//, as in the `two_i32` test, there is nothing linking the instructions that could provide an entry point to try to merge them.
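
A source-level sketch of the difference, under assumptions: `load_le32` is the usual shape that a load-combine style match targets, and `copy_two_i32` is only my guess at the flavour of the `two_i32` test (the real test is LLVM IR and may differ). Both function names are made up for illustration.

```cpp
#include <cstdint>

// Load-combine shape: the narrow loads all feed a tree of shifts and
// "or"s, so the final "or" is a single root from which the whole
// pattern can be discovered.
uint32_t load_le32(const uint8_t *p) {
  return uint32_t(p[0]) | (uint32_t(p[1]) << 8) |
         (uint32_t(p[2]) << 16) | (uint32_t(p[3]) << 24);
}

// Independent shape (hypothetical stand-in for `two_i32`): each
// load/store pair forms its own chain. No instruction uses both
// values, so there is no common root to start matching from.
void copy_two_i32(uint32_t *dst, const uint32_t *src) {
  dst[0] = src[0];
  dst[1] = src[1];
}
```

In the first function a matcher can start at the last "or" and walk its operands; in the second, the only relation between the two halves is the adjacency of their addresses.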

Repository:
  rL LLVM

https://reviews.llvm.org/D48725