[early patch] Speed up decl chaining

Sean Silva silvas at purdue.edu
Fri Oct 18 23:25:38 PDT 2013


On Tue, Oct 8, 2013 at 11:09 PM, Rafael Espíndola <
rafael.espindola at gmail.com> wrote:

> I found this old incomplete patch while cleaning my git repo. I just
> want to see if it is crazy or not before trying to finish it.
>

What originally motivated this? Did you measure something that made you
think that this had the potential to be faster?


>
> Currently decl chaining is O(n). We use a circular singly linked list
> that points to the previous element and has a bool to say if we are
> the first element (and actually point to the last).
>
> Adding a new decl is O(n) because we have to find the first element by
> walking the prev links. One way to make this O(1) that is sure to work
> is a doubly linked list, but that would be very wasteful in memory.
>
> What this patch does is reverse the list so that a decl points to the
> next decl (or to the first if it is the last). With this chaining
> becomes O(1). The flip side is that getPreviousDecl is now O(n).
>
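To make sure I am reading the two layouts correctly, here is a toy sketch of
both. Everything in it (PrevLinkedDecl, NextLinkedDecl, appendPrevLinked,
appendNextLinked, previousOf) is a hypothetical stand-in that I made up for
illustration; it is not the Redeclarable code in Clang and not the code in
the patch.

```cpp
#include <cassert>
#include <cstddef>

// Toy model of the current layout (hypothetical names, not Clang's types).
// Each decl points to the previous decl; the first decl is flagged and
// points to the most recent decl instead.
struct PrevLinkedDecl {
    PrevLinkedDecl *link = this;
    bool isFirst = true;
};

// Chain newDecl after mostRecent. Returns the number of links walked to
// find the first decl, which is what makes this O(n).
std::size_t appendPrevLinked(PrevLinkedDecl *mostRecent, PrevLinkedDecl *newDecl) {
    assert(mostRecent && newDecl);
    std::size_t steps = 0;
    PrevLinkedDecl *first = mostRecent;
    while (!first->isFirst) {
        first = first->link;
        ++steps;
    }
    newDecl->link = mostRecent;
    newDecl->isFirst = false;
    first->link = newDecl;  // the first decl now points to the new latest decl
    return steps;
}

// Toy model of the reversed layout. Each decl points to the next decl; the
// last decl is flagged and points back to the first.
struct NextLinkedDecl {
    NextLinkedDecl *link = this;
    bool isLast = true;
};

// Chain newDecl after mostRecent in O(1): no walk is needed because the
// old last decl already holds the pointer to the first decl.
void appendNextLinked(NextLinkedDecl *mostRecent, NextLinkedDecl *newDecl) {
    assert(mostRecent && newDecl && mostRecent->isLast);
    newDecl->link = mostRecent->link;  // inherit the pointer to the first decl
    newDecl->isLast = true;
    mostRecent->link = newDecl;
    mostRecent->isLast = false;
}

// The flip side: finding the previous decl walks the whole circle.
// Returns nullptr when d is the first decl.
NextLinkedDecl *previousOf(NextLinkedDecl *d) {
    NextLinkedDecl *cur = d;
    while (cur->link != d)
        cur = cur->link;
    return cur->isLast ? nullptr : cur;
}
```

In this toy, appending the k-th decl to a prev-linked chain walks k-2 links,
while the next-linked append never walks, and previousOf pays the walk
instead.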
> In this patch I just got check-clang to work and replaced enough uses
> of getPreviousDecl to get a speedup in
>
>     #define M extern int a;
>     #define M2 M M
>     #define M4 M2 M2
>     #define M8 M4 M4
>     #define M16 M8 M8
>     #define M32 M16 M16
>     #define M64 M32 M32
>     #define M128 M64 M64
>     #define M256 M128 M128
>     #define M512 M256 M256
>     #define M1024 M512 M512
>     #define M2048 M1024 M1024
>     #define M4096 M2048 M2048
>     #define M8192 M4096 M4096
>     #define M16384 M8192 M8192
>     M16384
>
> On my machine this patch takes clang -cc1 on the preprocessed version
> of that from 0m4.748s to 0m1.525s.
>

What is this microbenchmark even measuring? Is there any reason to believe
that this is representative enough of anything to guide a decision?

I feel like what's missing here are measurements of the actual behavior of
this code path. For example, how long are we spending walking these
redeclaration chains in real code? On average how long are the
redeclaration chains when compiling real code? Almost always 1? Usually 2?
Generally between 3 and 5? >100? Each of the cases I just listed puts the
situation in a completely different light. Are any particular sites that
call these APIs (or particular AST classes) inducing far more link
traversals than other sites when compiling typical code? (i.e., instrument
the "get next link" routine to tally up by call site). Maybe the usage
patterns of some AST nodes benefit more from forward traversal, and others
from backward?
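
A rough sketch of the kind of instrumentation I have in mind, assuming a
single "get next link" routine that every traversal goes through. All of the
names here (LinkTally, linkTally, getNextLinkCounted, GET_NEXT_LINK, Node)
are hypothetical and do not exist in Clang; the real hook would go wherever
the redeclaration link is actually followed.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical counters: link traversals per call site, plus a histogram
// of redeclaration chain lengths.
struct LinkTally {
    std::map<std::string, std::size_t> byCallSite;
    std::map<std::size_t, std::size_t> chainLengthHistogram;

    void record(const char *file, int line) {
        ++byCallSite[std::string(file) + ":" + std::to_string(line)];
    }
    void recordChainLength(std::size_t length) {
        ++chainLengthHistogram[length];
    }
};

// Single process-wide tally; dump it at exit to see which sites dominate.
inline LinkTally &linkTally() {
    static LinkTally tally;
    return tally;
}

// Stand-in for a decl with one link; not a Clang type.
struct Node {
    Node *next = nullptr;
};

// The counted version of the "get next link" routine. The caller's
// location is passed in so the tally is keyed by call site, not by this
// function's own location.
inline Node *getNextLinkCounted(Node *n, const char *file, int line) {
    linkTally().record(file, line);
    return n->next;
}

// Call sites use the macro so __FILE__ and __LINE__ expand at the caller.
#define GET_NEXT_LINK(n) getNextLinkCounted((n), __FILE__, __LINE__)
```

Running something like this over a few real translation units would answer
both questions at once: which sites do the walking, and how long the chains
they walk actually are.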


Side note (completely impractical): if you have spare bits in the bottom of
the pointer, then you could store bits of the address of the first decl (or
whichever one is O(n) links away) in each link, so that in the worst case
you only have to walk a constant number of links before you collect all the
bits of the first pointer :)
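
For the curious, a toy sketch of that idea, under loud assumptions: decls are
8-byte aligned (so 3 low bits of each link are spare), and the walker starts
at a link that carries chunk 0, which a real scheme would need some way to
know. ToyDecl, packLink, linkTarget, chunkOf, buildChain and recoverFirst are
all names invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr unsigned kSpareBits = 3;  // assumes 8-byte alignment
constexpr std::uintptr_t kSpareMask = (std::uintptr_t(1) << kSpareBits) - 1;
constexpr unsigned kPtrBits = sizeof(std::uintptr_t) * 8;
// Number of links needed to carry every bit of one address.
constexpr unsigned kChunks = (kPtrBits + kSpareBits - 1) / kSpareBits;

// Toy decl: one link word whose low bits carry a slice of another address.
struct alignas(8) ToyDecl {
    std::uintptr_t taggedLink = 0;
};

// Combine an aligned target pointer with a small chunk in the low bits.
inline std::uintptr_t packLink(const ToyDecl *target, std::uintptr_t chunk) {
    std::uintptr_t raw = reinterpret_cast<std::uintptr_t>(target);
    assert((raw & kSpareMask) == 0 && "target must be 8-byte aligned");
    return raw | (chunk & kSpareMask);
}

// Strip the chunk to get the real link target back.
inline ToyDecl *linkTarget(std::uintptr_t link) {
    return reinterpret_cast<ToyDecl *>(link & ~kSpareMask);
}

// Slice i of an address: bits [i * kSpareBits, (i + 1) * kSpareBits).
inline std::uintptr_t chunkOf(std::uintptr_t addr, unsigned i) {
    return (addr >> (i * kSpareBits)) & kSpareMask;
}

// decls[i] links to decls[i + 1] and carries chunk (i % kChunks) of the
// address of `first`.
inline void buildChain(ToyDecl *decls, std::size_t n, const ToyDecl *first) {
    std::uintptr_t firstAddr = reinterpret_cast<std::uintptr_t>(first);
    for (std::size_t i = 0; i < n; ++i) {
        const ToyDecl *next = (i + 1 < n) ? &decls[i + 1] : nullptr;
        decls[i].taggedLink = packLink(next, chunkOf(firstAddr, i % kChunks));
    }
}

// Walk exactly kChunks links from a chunk-0 link and reassemble the address.
inline ToyDecl *recoverFirst(const ToyDecl *start) {
    std::uintptr_t addr = 0;
    const ToyDecl *cur = start;
    for (unsigned i = 0; i < kChunks; ++i) {
        assert(cur && "chain shorter than kChunks");
        addr |= (cur->taggedLink & kSpareMask) << (i * kSpareBits);
        cur = linkTarget(cur->taggedLink);
    }
    return reinterpret_cast<ToyDecl *>(addr);
}
```

With 64-bit pointers and 3 spare bits that is 22 links per lookup no matter
how long the chain is, which is constant but still a good illustration of
why I called it impractical.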

-- Sean Silva


>
> There are still a lot of uses of getPreviousDecl to go, but can anyone
> see a test case where this strategy would not work?
>
> Cheers,
> Rafael
>