[LLVMbugs] [Bug 11952] New: tblgen produced code for DiagnosticIDs.o is monstrous and slow

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Wed Feb 8 16:49:12 PST 2012


http://llvm.org/bugs/show_bug.cgi?id=11952

             Bug #: 11952
           Summary: tblgen produced code for DiagnosticIDs.o is monstrous
                    and slow
           Product: clang
           Version: unspecified
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Frontend
        AssignedTo: unassignedclangbugs at nondot.org
        ReportedBy: clattner at apple.com
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified


In a release build without asserts, DiagnosticIDs.o is over 800K, which is
insane.  It is mostly strings and data, at least, but it could still be
greatly improved.

An initial analysis shows that the vast majority of this (~520K) is the
StaticDiagInfo table.  Each entry of this table (on x86-64) is 56 bytes, most
of which is consumed by (mostly identical!) string pointers and 4 bytes of
padding.  An example is:

    .short    0                       ## 0x0
    .byte    100                     ## 0x64
    .byte    0                       ## 0x0
    .byte    20                      ## 0x14
    .byte    0                       ## 0x0
    .short    25                      ## 0x19
    .short    0                       ## 0x0
    .short    0                       ## 0x0
    .space    4
    .quad    L_.str
    .quad    L_.str1
    .quad    L_.str2
    .quad    L_.str1
    .quad    L_.str1

These strings are:

L_.str:                                 ## @.str
    .asciz     "err_cannot_open_file"

L_.str1:                                ## @.str1
    .space    1

L_.str2:                                ## @.str2
    .asciz     "cannot open file '%0': %1"
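
To make the 56-byte figure concrete, here is a rough C++ sketch of a struct
that would produce the layout in the dump above on a typical LP64 target.  The
struct and field names are guesses for illustration only, not the actual clang
declaration; the one grounded observation is that the values 20 and 25 match
the lengths of the two non-empty strings.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical reconstruction of one table entry. Field names are assumptions
// inferred from the assembly dump, not taken from the clang sources.
struct DiagEntrySketch {
  uint16_t ID;       // .short 0
  uint8_t  Flags0;   // .byte 100
  uint8_t  Flags1;   // .byte 0
  uint8_t  NameLen;  // .byte 20, equals strlen("err_cannot_open_file")
  uint8_t  Len1;     // .byte 0
  uint16_t DescLen;  // .short 25, equals strlen("cannot open file '%0': %1")
  uint16_t Len3;     // .short 0
  uint16_t Len4;     // .short 0
  // The compiler inserts 4 bytes of padding here (.space 4) so that the
  // pointers below are 8-byte aligned.
  const char *Name;  // .quad L_.str
  const char *Str1;  // .quad L_.str1 (empty string)
  const char *Desc;  // .quad L_.str2
  const char *Str3;  // .quad L_.str1 (empty string)
  const char *Str4;  // .quad L_.str1 (empty string)
};

// Bytes of actual scalar data before the padding: 2 + 4*1 + 3*2 = 12.
constexpr std::size_t kScalarBytes =
    sizeof(uint16_t) + 4 * sizeof(uint8_t) + 3 * sizeof(uint16_t);

// Bytes spent on string pointers, each of which needs a relocation.
constexpr std::size_t kPointerBytes = 5 * sizeof(const char *);
```

With 8-byte pointers this gives 12 bytes of scalars, 4 bytes of padding, and 40
bytes of pointers, for 56 bytes per entry.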


Beyond not encoding "str1" three times, these five 8-byte pointers (all of
which must be relocated by the dynamic linker at startup time!) could be
replaced by indexes into a large string table, the same way we do for
::printInstruction in (e.g.) X86GenAsmWriter.inc:

  const char *AsmStrs = 
    "DBG_VALUE\000BUNDLE\000aaa\000aad\t\000aam\t\000aas\000fabs\000#ACQUIRE"
    "_MOV PSEUDO!\000adcw\t\000adcl\t\000adcq\t\000adcb\t\000addw\t\000addl\t"
    "\000addq\t\000addb\t\000addpd\t\000addps\t\000addsd\t\000addss\t\000add"
    "subpd\t\000addsubps\t\000fadds\t\000faddl\t\000fiadds\t\000fiaddl\t\000"

If the strings are all nul-terminated, we also don't need to encode their
lengths as shorts in the struct, saving even more space.
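
A minimal sketch of what the string-table approach could look like for the
example entry above.  All names here (DiagStrTable, CompactDiagEntry,
diagString, diagLength) are hypothetical, and the use of 32-bit offsets is an
assumption; this is not what tblgen emits today.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// One blob holding every string, each nul-terminated. Adjacent literals are
// used so that an explicit "\0" never merges with a following digit.
static const char DiagStrTable[] =
    "\0"                          // offset 0: the shared empty string
    "err_cannot_open_file\0"      // offset 1
    "cannot open file '%0': %1";  // offset 22 (trailing nul is implicit)

// Hypothetical compact entry: offsets into DiagStrTable replace the pointers,
// and the length fields are gone because strlen can recover them.
struct CompactDiagEntry {
  uint16_t ID;
  uint8_t  Flags0;
  uint8_t  Flags1;
  uint32_t NameOff;
  uint32_t Str1Off;
  uint32_t DescOff;
  uint32_t Str3Off;
  uint32_t Str4Off;
};

// The example entry; every empty string shares offset 0.
static const CompactDiagEntry ExampleEntry = {0, 100, 0, 1, 0, 22, 0, 0};

// Turn an offset back into a C string.
inline const char *diagString(uint32_t Off) { return DiagStrTable + Off; }

// Recover a length on demand instead of storing it in the table.
inline std::size_t diagLength(uint32_t Off) {
  return std::strlen(diagString(Off));
}
```

Under these assumptions an entry shrinks from 56 bytes to 24, and since the
table no longer contains any pointers, it needs no relocations at startup.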
