[PATCH] [InstCombine] Combine adjacent i8 loads.

Chandler Carruth chandlerc at gmail.com
Sat May 3 02:21:53 PDT 2014


On Fri, May 2, 2014 at 12:00 PM, Arnold Schwaighofer <
aschwaighofer at apple.com> wrote:

> To clarify, I agree with Andy that we can run into phase ordering problems
> if we implement this as a separate pass.
>

That should largely be handled by doing this after the SLP vectorizer can
search for profitable things? My feeling was that if this is separate, it
will always be strictly less powerful / interesting than whatever the SLP
vectorizer can do. So we let the SLP vectorizer have the first shot. (This
of course doesn't solve the phase ordering problem across inlining or other
complex iterations, but those seem less worrisome.)


>
> What I wanted to say above is that if we model this transformation in the
> SLP vectorizer then we should not have phase ordering problems. However, I
> don’t think modeling this in the SLP vectorizer (adding complexity) is
> justified just by bswap (we should do this as a dagcombine).
>

I don't think bswap justifies much of anything FWIW. We can fix bswap in a
myriad of ways.


> If, on the other hand, we expect longer chains leading to a load that could
> be vectorized, then it might make sense to think about adding complexity to
> the SLP vectorizer.
>

I see a lot of really horrible code where people manually load a 32-bit or
64-bit integer, and then extract bytes, bits, or other sub-regions of it.
This code invariably has comments about how doing these contortions is
essential to getting decent performance. My motivation is to ensure that
the optimizer can and does handle these cases so that programmers can write
the more boring code and stop worrying. I suspect there are quite a few
encoding things that would benefit from *some* combining.
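
To make the contrast concrete, here is a minimal sketch of the two styles.
It is only an illustration, not code from the patch: the function names are
hypothetical, and the little-endian byte order is an assumption made for the
example. The first function is the boring version built from four adjacent
byte loads; the second is the hand-tuned version that does one wide load and
fixes up the byte order itself.

```c
#include <stdint.h>
#include <string.h>

/* Boring version (hypothetical name): four adjacent byte loads,
 * assembled in little-endian order. This is the pattern a load-combining
 * transform would want to turn into a single 32-bit load. */
static uint32_t load_le32_bytewise(const unsigned char *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hand-tuned version (hypothetical name): one wide load via memcpy,
 * followed by a byte swap when the host is big-endian. */
static uint32_t load_le32_wide(const unsigned char *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);

    /* Detect host byte order by inspecting the first byte of a known value. */
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);

    if (first == 0) {
        /* Big-endian host: reverse the four bytes. */
        v = (v >> 24) | ((v >> 8) & 0xFF00u)
          | ((v << 8) & 0xFF0000u) | (v << 24);
    }
    return v;
}
```

Both functions return the same value for the same four input bytes. The
point of the discussion is that programmers should be able to write the
first form and still get the code quality of the second.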

That doesn't mean it is worth the complexity of teaching it to the SLP
vectorizer; it just means that it seems worth *some* complexity beyond a
point fix for bswap. Based on the challenges you and others have described,
starting off with a simple and boring pass for this which still gets
exposed to instcombine and friends seems like a good initial point in the
tradeoff space.