[PATCH] [InstCombine] Combine adjacent i8 loads.

Fri May 2 01:10:20 PDT 2014

On Fri, May 2, 2014 at 12:21 AM, Andrew Trick <atrick at apple.com> wrote:

> Yes, I agree with Raul that SLP could be improved to handle bitwise
> operations, it just isn’t going to help Michael’s case. The algorithm needs
> a starting point to begin searching for vectorizable ops. It starts with
> either stores or phis under the assumption that it wants to vectorize a
> whole chain.

OK, that makes sense.

However, now that I think about it, bswap *is* easily modeled in the
"bitvector" space -- it's just a shuffle. So it might even be possible to
recognize byteswap trees by starting with the stores, going through a
shuffle, and ending at the loads.
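
To make the "it's just a shuffle" point concrete, here is a rough standalone
C++ sketch (not LLVM code, and not what the patch does). It reads four adjacent
bytes with the usual shift/or pattern, and separately treats the same bytes as
a 4 x i8 vector that gets permuted by a reversing mask. All the names in it
(Vec4, shuffle4, pack_le32, and so on) are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<uint8_t, 4>;  // hypothetical stand-in for <4 x i8>

// Four adjacent i8 loads combined with shifts and ors: a big-endian read.
inline uint32_t load_be32_bytewise(const uint8_t *p) {
  return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16) |
         (uint32_t(p[2]) << 8) | uint32_t(p[3]);
}

// Generic byte shuffle: result[i] = v[mask[i]].
inline Vec4 shuffle4(const Vec4 &v, const std::array<int, 4> &mask) {
  Vec4 out{};
  for (std::size_t i = 0; i < 4; ++i) {
    assert(mask[i] >= 0 && mask[i] < 4);
    out[i] = v[static_cast<std::size_t>(mask[i])];
  }
  return out;
}

// Reinterpret a byte vector as a little-endian 32-bit integer (lane 0 is
// the least significant byte), written out explicitly so the result does
// not depend on the host's endianness.
inline uint32_t pack_le32(const Vec4 &v) {
  return uint32_t(v[0]) | (uint32_t(v[1]) << 8) |
         (uint32_t(v[2]) << 16) | (uint32_t(v[3]) << 24);
}

// Byte swap expressed as "wide load + shuffle": reverse the lanes, then
// treat the vector as one little-endian integer.
inline uint32_t bswap_via_shuffle(const uint8_t *p) {
  const Vec4 v = {p[0], p[1], p[2], p[3]};
  const std::array<int, 4> kReverse = {3, 2, 1, 0};
  return pack_le32(shuffle4(v, kReverse));
}
```

Both functions return the same value for any four input bytes, which is the
equivalence a bitvector-aware matcher would have to see through.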

> You’re probably thinking that a more general algorithm could recognize
> load combining as a side-effect. I’m guessing that's not worth the
> complexity in terms of the algorithm’s structure and profitability
> heuristics, but I’m not the expert so will say no more.

Yeah, I'm no expert either, so I'll stop speculating too. It at least seems
worth investigating how much complexity it would take to model bit vectors
in the SLP pass. Arnold and Michael can probably look into that and then
make the call on whether to go that route or just add an explicit pass
downstream of SLP.