[PATCH] Improvements to SSA construction

Mon Apr 6 14:04:09 PDT 2015

Sorry, I clearly meant SmallSetVector (which uses SmallVector and
SmallPtrSet).

Also, I was not supposing you would touch the blocks thing at all, nor
thinking this would remove any non-determinism. It was just that if you're
pushing onto a vector only when insertion into a set vector succeeds, it
might make sense. There are still plenty of cases where it doesn't, either
due to type mis-matches, or because you want to keep growing the set while
shrinking and re-using space in the vector.

On Mon, Apr 6, 2015 at 2:02 PM Cameron Zwarich <zwarich at apple.com> wrote:

> Yeah, that’s a bigger problem. My original complaints 2) and 3) are
> invalid because those are template parameters that could presumedly be
> swapped out for SmallVector / SmallPtrSet anyways.
>
> Cameron
>
> > On Apr 6, 2015, at 2:00 PM, Daniel Berlin <dberlin at dberlin.org> wrote:
> >
> > Unless i'm missing something, you can't use setvector anyway, because
> > it won't remove the non-determinism the sort is there to fix.
> >
> > That non-determinism comes from the fact that defblocks is a
> > smallptrset, and even if the priority queue's order is stable, the
> > only guarantee you get is that things at the same dom level end up
> > around the same place in the priority queue (but not in a specific
> > order).
> >
> >
> > If domtreenodepair was domtreenodetriple (<BasicBlock, DomLevel,
> > BBNumbers[BB]>), and sorted by second then third part of tuple,
> > dfblocks could be a std::vector and you could use setvector, and not
> > sort anything, because it would be deterministic.
> >
> > Whether any of this is worth it, no idea.
> >
> >
> >
> > On Mon, Apr 6, 2015 at 1:56 PM, Cameron Zwarich <zwarich at apple.com>
> wrote:
> >> I just thought it would make aesthetic sense to have less code in the
> inner
> >> loop, but that’s arbitrary. If people think the ‘a’ patch makes more
> sense
> >> and it’s equivalent perfwise, then I’m fine with that. Code is made for
> >> reading, not writing. :P
> >>
> >> I’m not sure SetVector makes sense here for a few reasons:
> >>
> >> 1) The set and the vector store different types; the vector also
> contains
> >> the BB number.
> >> 2) The vector in SetVector is a std::vector, not a SmallVector.
> >> 3) The set in SetVector is a SmallSet of a fixed size of 16 (smaller
> than
> >> the 32 used here), and the ‘large’ case of SmallSet is a std::set,
> which is
> >> probably slower than SmallPtrSet’s ‘large’ case.
> >>
> >> Cameron
> >>
> >> On Apr 6, 2015, at 1:40 PM, Chandler Carruth <chandlerc at google.com>
> wrote:
> >>
> >> I'd like to understand why Cameron would prefer the 'b' patch to the 'a'
> >> patch. AFAICT, the 'b' patch doesn't save any actual memory (sadly).
> >>
> >> Also, if we go with 'a', would it make sense to use a SetVector rather
> than
> >> a separate vector and set?
> >>
> >> On Mon, Apr 6, 2015 at 1:36 PM Daniel Berlin <dberlin at dberlin.org>
> wrote:
> >>>
> >>> I'm running numbers on both approaches now, to see if there is any
> >>> real difference in speed.
> >>>
> >>> (otherwise, i think the one with two visited worklists is easier to
> >>> understand, unless someone else wants to disagree :P)
> >>>
> >>>
> >>> On Mon, Apr 6, 2015 at 11:54 AM, Quentin Colombet <qcolombet at apple.com
> >
> >>> wrote:
> >>>> Hi Cameron,
> >>>>
> >>>> Sounds good.
> >>>>
> >>>> Let us wait the complete testing from Daniel before reviewing the
> patch.
> >>>>
> >>>> Thanks,
> >>>> -Quentin
> >>>>> On Apr 6, 2015, at 11:27 AM, Cameron Zwarich <zwarich at apple.com>
> wrote:
> >>>>>
> >>>>> It was pointed out to me (without any specifics) that the iterated
> >>>>> dominance frontier algorithm in PromoteMemoryToRegister.cpp has
> O(n^2) worst
> >>>>> case behavior.
> >>>>>
> >>>>> I inspected the code and think I found the cause. The code uses a
> >>>>> priority queue and a worklist, which share the same visited set, but
> the
> >>>>> visited set is only updated when inserting into the priority queue.
> The
> >>>>> original Sreedhar-Gao paper effectively has a second visited set
> (the InPhi
> >>>>> flag) which is used for the priority queue, and the set called
> Visited is
> >>>>> used for the recursive traversal that is done here with a worklist.
> >>>>>
> >>>>> I’ve attached two patches, one which just adds a second visited sit,
> >>>>> and another which leverages the fact that one of the visited sets is
> >>>>> actually the IDF. I would prefer the latter if it has equal
> performance with
> >>>>> the first.
> >>>>>
> >>>>> They both pass `make check`, but I’m not sure I’ll have time to give
> >>>>> these patches the testing they’ll deserve in the next few days.
> Daniel
> >>>>> Berlin has offered to test them more thoroughly for me.
> >>>>>
> >>>>> Note that there is still one difference with the paper. The paper
> uses
> >>>>> a custom linked data structure instead of a priority queue, which
> takes
> >>>>> advantage of the property that the level of all nodes being inserted
> is at
> >>>>> most the current level. The code in LLVM uses a priority queue based
> on a
> >>>>> binary heap. This means that the worst case is O(n log n), but I’d be
> >>>>> surprised if the difference matters in practice.
> >>>>>
> >>>>> Cameron
> >>>>>
> >>>>>
> >>>>> <idf-faster-a.patch><idf-faster-b.patch>_______________
> ________________________________
> >>>>> llvm-commits mailing list
> >>>>> llvm-commits at cs.uiuc.edu
> >>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >>>>
> >>>
> >>> _______________________________________________
> >>> llvm-commits mailing list
> >>> llvm-commits at cs.uiuc.edu
> >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >>
> >> _______________________________________________
> >> llvm-commits mailing list
> >> llvm-commits at cs.uiuc.edu
> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >>
> >>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150406/d1253b2a/attachment.html>