<p dir="ltr"><br>

On 19 Oct 2013 05:28, "Rafael Espíndola" <<a href="mailto:rafael.espindola@gmail.com">rafael.espindola@gmail.com</a>> wrote:<br>

><br>

> This hot path showed up in a build of PostgreSQL.<br>

><br>

> Synthetic benchmarks like these do have their value. They expose bad<br>

> asymptotic behaviour that does show up in user code, but is harder to<br>

> measure.<br>

><br>

> For example, when this benchmark first came into being, the linkage<br>

> computation was nonlinear and dominated. Fixing it helped existing<br>

> code and moved the hot spot to decl linking. It looks like the hot<br>

> path is back to linkage computation, and we are still a lot slower<br>

> than gcc on this one, so fixing decl chaining will make this a good<br>

> linkage benchmark again.<br>

><br>

> Unbounded superlinear algorithms in general create a minefield that<br>

> is not very user friendly.</p>

<p dir="ltr">I don't see how this fixes the superlinearity; it seems like it just moves it around. Doesn't it make getPreviousDecl linear? Some of the places you've changed from getPreviousDecl to getFirstDecl are also now linear.</p>
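<p dir="ltr">To make the trade-off concrete, here is a rough sketch of the two layouts as I understand them from the description. The names Node, PrevChain, NextChain and steps are hypothetical and are not clang's actual redeclaration-chain code; the counter only exists so the number of link traversals can be observed.</p>

<pre>
```cpp
// Illustrative sketch only: Node, PrevChain, NextChain and steps are
// hypothetical names, not clang's real redeclaration-chain classes.
struct Node {
  Node *link;  // the single pointer each declaration stores
  bool flag;   // marks the node whose link wraps around the circle
  Node() : link(this), flag(true) {}  // a lone decl is both first and last
};

static long steps = 0;  // counts link traversals so the cost is observable

// Layout described as current: link points to the previous declaration;
// the first declaration is flagged and its link points to the most recent.
namespace PrevChain {
Node *getPrevious(Node *d) { return d->flag ? nullptr : d->link; }  // O(1)

Node *getFirst(Node *d) {  // O(n): walk prev links until the flagged node
  while (!d->flag) { d = d->link; ++steps; }
  return d;
}

void append(Node *mostRecent, Node *fresh) {
  Node *first = getFirst(mostRecent);  // the walk that makes chaining O(n)
  fresh->link = mostRecent;
  fresh->flag = false;
  first->link = fresh;  // the first decl now wraps to the new decl
}
}  // namespace PrevChain

// Layout described for the patch: link points to the next declaration;
// the most recent declaration is flagged and its link points to the first.
namespace NextChain {
// O(1), but only when called on the most recent (flagged) declaration.
Node *getFirst(Node *mostRecent) { return mostRecent->link; }

void append(Node *mostRecent, Node *fresh) {  // O(1)
  fresh->link = mostRecent->link;  // the new last decl wraps to the first
  fresh->flag = true;
  mostRecent->link = fresh;
  mostRecent->flag = false;
}

Node *getPrevious(Node *d) {  // O(n): go around the circle to find d's predecessor
  Node *cur = d;
  while (cur->link != d) { cur = cur->link; ++steps; }
  return cur->flag ? nullptr : cur;  // predecessor is the last decl: d is first
}
}  // namespace NextChain

// Chain nodes[0..n-1] in order and return the number of link traversals.
long buildPrevChain(Node *nodes, int n) {
  steps = 0;
  for (int i = 1; n > i; ++i) PrevChain::append(nodes + i - 1, nodes + i);
  return steps;
}

long buildNextChain(Node *nodes, int n) {
  steps = 0;
  for (int i = 1; n > i; ++i) NextChain::append(nodes + i - 1, nodes + i);
  return steps;
}
```
</pre>

<p dir="ltr">With five declarations, buildPrevChain performs 6 link traversals and buildNextChain performs none, but NextChain::getPrevious on the most recent declaration then walks 4 links. The cost has moved from chaining to every remaining caller of getPreviousDecl, which is the concern above.</p>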


<p dir="ltr">The change to setObjectOfFriendDecl looks incorrect: we really wanted to look at the IDNS of the most recent decl, not of the first one.</p>

<p dir="ltr">> On 19 October 2013 02:25, Sean Silva <<a href="mailto:silvas@purdue.edu">silvas@purdue.edu</a>> wrote:<br>

> ><br>

> ><br>

> ><br>

> > On Tue, Oct 8, 2013 at 11:09 PM, Rafael Espíndola<br>

> > <<a href="mailto:rafael.espindola@gmail.com">rafael.espindola@gmail.com</a>> wrote:<br>

> >><br>

> >> I found this old incomplete patch while cleaning my git repo. I just<br>

> >> want to see if it is crazy or not before trying to finish it.<br>

> ><br>

> ><br>

> > What originally motivated this? Did you measure something that made you<br>

> > think that this had the potential to be faster?<br>

> ><br>

> >><br>

> >><br>

> >> Currently decl chaining is O(n). We use a circular singly linked list<br>

> >> that points to the previous element and has a bool to say if we are<br>

> >> the first element (and actually point to the last).<br>

> >><br>

> >> Adding a new decl is O(n) because we have to find the first element by<br>

> >> walking the prev links. One way to make this O(1) that is sure to work<br>

> >> is a doubly linked list, but that would be very wasteful in memory.<br>

> >><br>

> >> What this patch does is reverse the list so that a decl points to the<br>

> >> next decl (or to the first if it is the last). With this chaining<br>

> >> becomes O(1). The flip side is that getPreviousDecl is now O(n).<br>

> >><br>

> >> In this patch I just got check-clang to work and replaced enough uses<br>

> >> of getPreviousDecl to get a speedup in<br>

> >><br>

> >>     #define M extern int a;<br>

> >>     #define M2 M M<br>

> >>     #define M4 M2 M2<br>

> >>     #define M8 M4 M4<br>

> >>     #define M16 M8 M8<br>

> >>     #define M32 M16 M16<br>

> >>     #define M64 M32 M32<br>

> >>     #define M128 M64 M64<br>

> >>     #define M256 M128 M128<br>

> >>     #define M512 M256 M256<br>

> >>     #define M1024 M512 M512<br>

> >>     #define M2048 M1024 M1024<br>

> >>     #define M4096 M2048 M2048<br>

> >>     #define M8192 M4096 M4096<br>

> >>     #define M16384 M8192 M8192<br>

> >>     M16384<br>

> >><br>

> >> On my machine this patch takes clang -cc1 on the preprocessed version<br>

> >> of that from 0m4.748s to 0m1.525s.<br>

> ><br>

> ><br>

> > What is this microbenchmark even measuring? Is there any reason to believe<br>

> > that this is representative enough of anything to guide a decision?<br>

> ><br>

> > I feel like what's missing here are measurements of the actual behavior of<br>

> > this code path. For example, how long are we spending walking these<br>

> > redeclaration chains in real code? On average how long are the redeclaration<br>

> > chains when compiling real code? Almost always 1? Usually 2? Generally<br>

> > between 3 and 5? >100? Each of the cases I just listed puts the situation in<br>

> > a completely different light. Are any particular sites that call these APIs<br>

> > (or particular AST classes) inducing far more link traversals than other<br>

> > sites when compiling typical code? (i.e., instrument the "get next link"<br>

> > routine to tally up by call site). Maybe the usage patterns of some AST<br>

> > nodes benefit more from forward traversal, and others from backward?<br>

> ><br>

> ><br>

> > Side note (completely impractical): if you have spare bits in the bottom of<br>

> > the pointer, then you could store bits of the address of the first decl (or<br>

> > whichever one is O(n) links away) in each link, so that in the worst case<br>

> > you only have to walk a constant number of links before you collect all the<br>

> > bits of the first pointer :)<br>

> ><br>

> > -- Sean Silva<br>

> ><br>

> >><br>

> >><br>

> >> There are still a lot of uses of getPreviousDecl to go, but can anyone<br>

> >> see a testcase where this strategy would not work?<br>

> >><br>

> >> Cheers,<br>

> >> Rafael<br>

> >><br>

> >> _______________________________________________<br>

> >> cfe-commits mailing list<br>

> >> <a href="mailto:cfe-commits@cs.uiuc.edu">cfe-commits@cs.uiuc.edu</a><br>

> >> <a href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits">http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits</a><br>

> >><br>

> ><br>

><br>


</p>