<div dir="ltr">Sorry, I clearly meant SmallSetVector (which uses SmallVector and SmallPtrSet).<br><br><div>Also, I was not supposing you would touch the blocks thing at all, nor thinking this would remove any non-determinism. It was just that if you're pushing onto a vector only when insertion into a set vector succeeds, it might make sense. There are still plenty of cases where it doesn't, either due to type mis-matches, or because you want to keep growing the set while shrinking and re-using space in the vector.</div></div><br><div class="gmail_quote">On Mon, Apr 6, 2015 at 2:02 PM Cameron Zwarich <<a href="mailto:zwarich@apple.com">zwarich@apple.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Yeah, that’s a bigger problem. My original complaints 2) and 3) are invalid because those are template parameters that could presumedly be swapped out for SmallVector / SmallPtrSet anyways.<br>

<br>

Cameron<br>

<br>

> On Apr 6, 2015, at 2:00 PM, Daniel Berlin <<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>> wrote:<br>

><br>

> Unless i'm missing something, you can't use setvector anyway, because<br>

> it won't remove the non-determinism the sort is there to fix.<br>

><br>

> That non-determinism comes from the fact that defblocks is a<br>

> smallptrset, and even if the priority queue's order is stable, the<br>

> only guarantee you get is that things at the same dom level end up<br>

> around the same place in the priority queue (but not in a specific<br>

> order).<br>

><br>

><br>

> If domtreenodepair was domtreenodetriple (<BasicBlock, DomLevel,<br>

> BBNumbers[BB]>), and sorted by second then third part of tuple,<br>

> dfblocks could be a std::vector and you could use setvector, and not<br>

> sort anything, because it would be deterministic.<br>

><br>

> Whether any of this is worth it, no idea.<br>

><br>

><br>

><br>

> On Mon, Apr 6, 2015 at 1:56 PM, Cameron Zwarich <<a href="mailto:zwarich@apple.com" target="_blank">zwarich@apple.com</a>> wrote:<br>

>> I just thought it would make aesthetic sense to have less code in the inner<br>

>> loop, but that’s arbitrary. If people think the ‘a’ patch makes more sense<br>

>> and it’s equivalent perfwise, then I’m fine with that. Code is made for<br>

>> reading, not writing. :P<br>

>><br>

>> I’m not sure SetVector makes sense here for a few reasons:<br>

>><br>

>> 1) The set and the vector store different types; the vector also contains<br>

>> the BB number.<br>

>> 2) The vector in SetVector is a std::vector, not a SmallVector.<br>

>> 3) The set in SetVector is a SmallSet of a fixed size of 16 (smaller than<br>

>> the 32 used here), and the ‘large’ case of SmallSet is a std::set, which is<br>

>> probably slower than SmallPtrSet’s ‘large’ case.<br>

>><br>

>> Cameron<br>

>><br>

>> On Apr 6, 2015, at 1:40 PM, Chandler Carruth <<a href="mailto:chandlerc@google.com" target="_blank">chandlerc@google.com</a>> wrote:<br>

>><br>

>> I'd like to understand why Cameron would prefer the 'b' patch to the 'a'<br>

>> patch. AFAICT, the 'b' patch doesn't save any actual memory (sadly).<br>

>><br>

>> Also, if we go with 'a', would it make sense to use a SetVector rather than<br>

>> a separate vector and set?<br>

>><br>

>> On Mon, Apr 6, 2015 at 1:36 PM Daniel Berlin <<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>> wrote:<br>

>>><br>

>>> I'm running numbers on both approaches now, to see if there is any<br>

>>> real difference in speed.<br>

>>><br>

>>> (otherwise, i think the one with two visited worklists is easier to<br>

>>> understand, unless someone else wants to disagree :P)<br>

>>><br>

>>><br>

>>> On Mon, Apr 6, 2015 at 11:54 AM, Quentin Colombet <<a href="mailto:qcolombet@apple.com" target="_blank">qcolombet@apple.com</a>><br>

>>> wrote:<br>

>>>> Hi Cameron,<br>

>>>><br>

>>>> Sounds good.<br>

>>>><br>

>>>> Let us wait the complete testing from Daniel before reviewing the patch.<br>

>>>><br>

>>>> Thanks,<br>

>>>> -Quentin<br>

>>>>> On Apr 6, 2015, at 11:27 AM, Cameron Zwarich <<a href="mailto:zwarich@apple.com" target="_blank">zwarich@apple.com</a>> wrote:<br>

>>>>><br>

>>>>> It was pointed out to me (without any specifics) that the iterated<br>

>>>>> dominance frontier algorithm in PromoteMemoryToRegister.cpp has O(n^2) worst<br>

>>>>> case behavior.<br>

>>>>><br>

>>>>> I inspected the code and think I found the cause. The code uses a<br>

>>>>> priority queue and a worklist, which share the same visited set, but the<br>

>>>>> visited set is only updated when inserting into the priority queue. The<br>

>>>>> original Sreedhar-Gao paper effectively has a second visited set (the InPhi<br>

>>>>> flag) which is used for the priority queue, and the set called Visited is<br>

>>>>> used for the recursive traversal that is done here with a worklist.<br>

>>>>><br>

>>>>> I’ve attached two patches, one which just adds a second visited sit,<br>

>>>>> and another which leverages the fact that one of the visited sets is<br>

>>>>> actually the IDF. I would prefer the latter if it has equal performance with<br>

>>>>> the first.<br>

>>>>><br>

>>>>> They both pass `make check`, but I’m not sure I’ll have time to give<br>

>>>>> these patches the testing they’ll deserve in the next few days. Daniel<br>

>>>>> Berlin has offered to test them more thoroughly for me.<br>

>>>>><br>

>>>>> Note that there is still one difference with the paper. The paper uses<br>

>>>>> a custom linked data structure instead of a priority queue, which takes<br>

>>>>> advantage of the property that the level of all nodes being inserted is at<br>

>>>>> most the current level. The code in LLVM uses a priority queue based on a<br>

>>>>> binary heap. This means that the worst case is O(n log n), but I’d be<br>

>>>>> surprised if the difference matters in practice.<br>

>>>>><br>

>>>>> Cameron<br>

>>>>><br>

>>>>><br>

>>>>> <idf-faster-a.patch><idf-<u></u>faster-b.patch>_______________<u></u>______________________________<u></u>__<br>

>>>>> llvm-commits mailing list<br>

>>>>> <a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>

>>>>> <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/<u></u>mailman/listinfo/llvm-commits</a><br>

>>>><br>

>>><br>

>>> ______________________________<u></u>_________________<br>

>>> llvm-commits mailing list<br>

>>> <a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>

>>> <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/<u></u>mailman/listinfo/llvm-commits</a><br>

>><br>

>> ______________________________<u></u>_________________<br>

>> llvm-commits mailing list<br>

>> <a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>

>> <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/<u></u>mailman/listinfo/llvm-commits</a><br>

>><br>

>><br>

<br>

</blockquote></div>