[LLVMbugs] [Bug 21264] New: non-modularized headers in modular build leads to quadratic compile time due to long redecl chains

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Mon Oct 13 14:55:42 PDT 2014


http://llvm.org/bugs/show_bug.cgi?id=21264

            Bug ID: 21264
           Summary: non-modularized headers in modular build leads to
                    quadratic compile time due to long redecl chains
           Product: clang
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Modules
          Assignee: unassignedclangbugs at nondot.org
          Reporter: richard-llvm at metafoo.co.uk
                CC: chandlerc at gmail.com, dgregor at apple.com,
                    llvmbugs at cs.uiuc.edu
    Classification: Unclassified

If we import O(N) modules, each of which does this:

  #include "non-modular-header.h"
  // ... more stuff ...

then every declaration in "non-modular-header.h" has O(N) declarations in its
redecl chain, and getCanonicalDecl() calls on a most-recent declaration take
O(N) time. This appears to be the biggest performance problem in builds with
lots of modules right now.

This is also a problem for non-modular builds, but is less severe since it's
rare to get a long redeclaration chain.

Proposed fix: remove the current PrevPtr from redeclarable Decls, and replace
it with:
-- in a non-canonical redeclarable Decl, a pointer to the canonical decl
-- in a canonical redeclarable Decl with no redeclarations, null
-- in a canonical redeclarable Decl with only one redeclaration, a pointer to
that redeclaration
-- in a canonical redeclarable Decl with two or more redecls (a fairly rare
case), a pointer to an array of pointers to those Decls.

We have 3 spare bits in the bottom of this pointer; we can use them to store
the index of the declaration in the redecl table (this also covers storing
whether the declaration is canonical). To cover longer chains, we can store
(say) log_2(index) rather than index; that lets us perform getPreviousDecl() in
log_2(N) time rather than N time, for N up to ~256. In the rare case that we go
beyond that, we can store a Decl->index hash map along with the array, to make
the getPreviousDecl() performance constant-time.

One final concern is storing the generation number and external AST source
pointer for the canonical declaration in the case where modules is enabled.
These can also be added to the array of pointers, but we'll need to use that
representation even when there is only one redeclaration in the modules case.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-bugs/attachments/20141013/ac854f3a/attachment.html>


More information about the llvm-bugs mailing list