[llvm-commits] PATCH: A new SROA implementation

Tue Aug 21 22:46:49 PDT 2012

Ok, I think this patch is pretty close to being ready to go into the tree,
modulo any detailed code review folks are willing to do. Just send me
comments if you have them.

I've run it through the nightly test suite now as well as a bootstrap, and
everything looks better than clear, it looks great. ;]

I'm seeing some pretty significant execution performance improvements. One
test is 27% faster, several range from 2% to 6% faster. Only a couple
really slow down. I haven't dug in to the performance too much as currently
we're still running the old SROA in the CGSCC passes.

I've still got some code cleanup, minor refactorings I'd like to do, and
the SSAUpdater work, but I'm trying to not grow the patch on people.

-Chandler

On Tue, Aug 21, 2012 at 6:26 AM, Chandler Carruth <chandlerc at gmail.com> wrote:

> Thanks for the comments so far. Updated patch attached. Not everything is
> addressed, in particular some code cleanups and style fixes haven't yet
> happened. However, this removes the dynamic GEP handling, adds aggressive
> vector-typed rewriting and starts to handle lifetime intrinsics. Still no
> debug intrinsic handling.
>
>
> On Tue, Aug 21, 2012 at 1:52 AM, Eli Friedman <eli.friedman at gmail.com> wrote:
>
>> On Tue, Aug 21, 2012 at 12:57 AM, Chandler Carruth <chandlerc at gmail.com>
>> wrote:
>> > I'm inclined to completely remove this transform. It seems fundamentally
>> > unsafe given the current spec of GEPs, and provides fairly limited benefit
>> > if the frontend consistently lowers to insertelement and extractelement
>> > (which it did in most of my experiments). If we want to support this
>> > transform, I think we need to either extend GEP to have an optional
>> > constraint on inbounds indexing within array and/or vector aggregates, or
>> > we need to change the semantics of the existing 'inbounds' keyword to mean
>> > that.
>>
>> That sounds right.
>>
>> >>
>> >>
>> >> I'm sort of worried about allowing SROA so much freedom in terms of
>> >> splitting memset/memcpy; your algorithm can turn a single memcpy into
>> >> an arbitrary number of memcpys (which is a quadratic explosion in
>> >> codesize in the worst case).  Not sure if it's an issue in practice.
>> >
>> >
>> > Yes, this is the primary concern with the new algorithm. That said, the
>> > N*M explosion requires N overlapping splittable operations which
>> > overlap with M disjoint un-splittable operations. My hope is that this
>> > lattice structure is very rare. Even better, whenever we start heavily
>> > splitting, we're getting something out of it -- we're successfully
>> > isolating un-splittable partitions which should form candidates for
>> > promotion.
>> >
>> > Do you have any ideas for mitigating or bounding the growth that don't
>> > devolve to heuristics?
>>
>> Hmm... I can't think of anything.  As far as I can tell, it's
>> basically a tradeoff between codesize and the number of memory
>> accesses, and there isn't any clear place to draw the line.
>>
>> -Eli
>>
>
>
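
To make the N*M growth discussed in the quoted exchange concrete, here is a
rough sketch that counts how many pieces a set of splittable operations
(think memcpy/memset) gets cut into when every un-splittable operation
(think a plain load or store) must land in its own partition. This is not
code from the patch. ByteRange, countSplitPieces and worstCasePieces are
hypothetical names invented for the illustration, and the model (cut each
splittable range at every un-splittable boundary inside it) is an assumption
about the partitioning, not a description of the actual algorithm.

```cpp
#include <cstddef>
#include <iterator>
#include <set>
#include <vector>

// Hypothetical model: a half-open byte range [begin, end) inside one alloca.
struct ByteRange {
  std::size_t begin;
  std::size_t end;
};

// Counts the pieces produced when each splittable range is cut at every
// un-splittable boundary that falls strictly inside it. (Assumed model only.)
std::size_t countSplitPieces(const std::vector<ByteRange> &splittable,
                             const std::vector<ByteRange> &unsplittable) {
  // Every start and end of an un-splittable operation is a potential cut.
  std::set<std::size_t> cuts;
  for (const ByteRange &u : unsplittable) {
    cuts.insert(u.begin);
    cuts.insert(u.end);
  }

  std::size_t pieces = 0;
  for (const ByteRange &s : splittable) {
    if (s.begin >= s.end)
      continue; // Empty range: nothing to split.
    // Cut points strictly between s.begin and s.end.
    auto first = cuts.upper_bound(s.begin);
    auto last = cuts.lower_bound(s.end);
    pieces += 1 + static_cast<std::size_t>(std::distance(first, last));
  }
  return pieces;
}

// Builds the lattice case from the thread: n splittable operations that all
// cover the same region, overlapping m disjoint un-splittable operations.
std::size_t worstCasePieces(std::size_t n, std::size_t m) {
  const std::size_t width = 4; // Arbitrary size of each un-splittable access.
  std::vector<ByteRange> splittable(n, ByteRange{0, m * width});
  std::vector<ByteRange> unsplittable;
  for (std::size_t i = 0; i < m; ++i)
    unsplittable.push_back(ByteRange{i * width, (i + 1) * width});
  return countSplitPieces(splittable, unsplittable);
}
```

Under this model, worstCasePieces(n, m) grows as n * m, which is the
codesize concern raised above, while a single splittable operation that
overlaps only one un-splittable access in its middle is cut into three pieces.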
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sroa-rewrite.patch
Type: application/octet-stream
Size: 127132 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120821/de0fe961/attachment.obj>