[PATCH] [InstCombine] Combine adjacent i8 loads.

Fri May 2 00:21:26 PDT 2014

On May 2, 2014, at 12:05 AM, Chandler Carruth <chandlerc at gmail.com> wrote:

> 
> On Thu, May 1, 2014 at 11:55 PM, Andrew Trick <atrick at apple.com> wrote:
> On May 1, 2014, at 11:49 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
> 
>> I'm curious: why not use the def-use tree approach of the SLP vectorizer, so that we can also handle bitwise operations, as Raul mentioned? That seems like a nice advantage of an integrated approach.
> 
> 
> I don’t see how the SLP algorithm can recognize these bit ops, which are actually doing a BSWAP. Arnold can probably explain better.
> 
> No, this isn't about bswap or this test case.
> 
> It's the more general point that we could view loads, stores, and bitwise operations on M iN values as "vectorizable" into loads, stores, and bitwise operations on a single i(N*M) value. By doing the widening in SLP, we can turn, for example, the pattern of "load each of four bytes and mask out the high bit before storing it into another array" into a "vector" operation on a larger integer.
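
To make the widening concrete, here is a rough C sketch of the "mask out the high bit of four bytes" pattern and its single-integer equivalent. The function names and the test values are hypothetical, chosen for illustration only; nothing here is taken from the patch or from the SLP vectorizer itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar form: four independent i8 load / and / store chains. */
void mask_bytes(const uint8_t *src, uint8_t *dst) {
    dst[0] = src[0] & 0x7F;
    dst[1] = src[1] & 0x7F;
    dst[2] = src[2] & 0x7F;
    dst[3] = src[3] & 0x7F;
}

/* Widened form: one i32 load, one and, one i32 store.
 * memcpy is used so the access is well-defined for any alignment.
 * The mask is the same in every byte lane, so the result does not
 * depend on the host's byte order. */
void mask_bytes_wide(const uint8_t *src, uint8_t *dst) {
    uint32_t v;
    memcpy(&v, src, sizeof v);
    v &= 0x7F7F7F7Fu;
    memcpy(dst, &v, sizeof v);
}
```

Both functions write the same four bytes; the second is what treating the four i8 lanes as one i32 "vector" would amount to.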

Yes, I agree with Raul that SLP could be improved to handle bitwise operations; it just isn’t going to help Michael’s case. The algorithm needs a starting point from which to begin searching for vectorizable ops. It starts with either stores or phis, under the assumption that it wants to vectorize a whole chain. You’re probably thinking that a more general algorithm could recognize load combining as a side effect. I’m guessing that's not worth the complexity in terms of the algorithm’s structure and profitability heuristics, but I’m not the expert, so I will say no more.
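
For illustration, a C sketch of the kind of pattern under discussion, assuming it resembles a byte-wise big-endian read (I have not reproduced Michael's actual test case; the function names below are hypothetical). The four adjacent i8 loads feed shifts and ors and the result is simply returned, so there is no store or phi at the end of the chain to seed the search. The second function is the combined form: one 32-bit load plus a byte swap on a little-endian host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Four adjacent i8 loads joined with shifts and ors. */
uint32_t read_be32_bytes(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Portable byte swap, standing in for a BSWAP instruction. */
static uint32_t bswap32(uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Returns 1 if the lowest-addressed byte holds the least significant bits. */
static int is_little_endian(void) {
    const uint32_t one = 1;
    uint8_t first;
    memcpy(&first, &one, 1);
    return first == 1;
}

/* Combined form: a single i32 load, byte-swapped when the host is
 * little-endian so that it matches the big-endian byte-wise read. */
uint32_t read_be32_wide(const uint8_t *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return is_little_endian() ? bswap32(v) : v;
}
```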

-Andy