[LLVMbugs] [Bug 411] NEW: [SymbolTable] Reconsider one symbol table for each type

Tue Jul 27 20:59:17 PDT 2004

http://llvm.cs.uiuc.edu/bugs/show_bug.cgi?id=411

           Summary: [SymbolTable] Reconsider one symbol table for each type
           Product: libraries
           Version: 1.0
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Core LLVM classes
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: sabre at nondot.org

LLVM currently keeps one symbol table for each type plane in the program.  There
are historical reasons for this, but they aren't particularly relevant any more.  
I think the symbol table should just be a simple wrapper around two maps from
std::string to Value* and Type* (a one level map, not a two level one like we have).
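
A minimal sketch of what such a wrapper might look like.  The class name
FlatSymbolTable and its method names are made up for illustration, and Type and
Value are empty stand-ins for the real LLVM classes:

```cpp
#include <cassert>
#include <map>
#include <string>

// Empty stand-ins for the real LLVM classes; only pointer identity matters here.
struct Type {};
struct Value {};

// Hypothetical one-level symbol table: one map for values, one for types,
// both keyed directly by name.
class FlatSymbolTable {
  std::map<std::string, Value*> Values;
  std::map<std::string, Type*> Types;

public:
  // Returns false (and leaves the table unchanged) if Name is already taken.
  bool insertValue(const std::string &Name, Value *V) {
    assert(V && "cannot insert a null value");
    return Values.emplace(Name, V).second;
  }
  bool insertType(const std::string &Name, Type *T) {
    assert(T && "cannot insert a null type");
    return Types.emplace(Name, T).second;
  }

  // Each lookup is a single map search; returns null if Name is unknown.
  Value *lookupValue(const std::string &Name) const {
    auto I = Values.find(Name);
    return I == Values.end() ? nullptr : I->second;
  }
  Type *lookupType(const std::string &Name) const {
    auto I = Types.find(Name);
    return I == Types.end() ? nullptr : I->second;
  }
};
```

With this layout a lookup by name never needs to know the type of the thing it
is looking for.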

Here are some of the problems that the current symbol table design causes:

1. This is really slow in the common case.  To get to an element in the symbol
   table, we need to go through two levels of maps (one from Type* to a
   ValueMap, then one from name to Value* within that ValueMap).  When
   gccld'ing 252.eon, the top two killers are map operations which I believe
   are due directly to the symbol table.

2. Important operations are really slow.  In particular Module::getNamedFunction
   takes time linear in the size of the module because we don't know what type the
   named function will have.  This leads to the nastiness we see in the
   Module::getMainFunction() implementation.

3. Linking is a nightmare, and because of this nightmare, we have to have things
   like the function resolution pass, which is a certified disturbing hack.

4. I'm beginning to think that the function scoped symbol table and the module
   scoped symbol table should be different classes entirely (once simplified).
   In particular, the function-scope symbol table cannot have types in it.

5. Another capability that we want is to be able to completely disable the
   function-level symbol table in certain cases, such as release mode.  Names 
   *inside* of a function have no meaning, they are just there to make debugging
   easier.  Why should release mode have to pay for all of those map lookups?
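
To make problems 1 and 2 concrete, here is a simplified model of the two-level
layout.  The typedef and function names are illustrative only and are not the
actual SymbolTable interface; Type and Value are empty stand-ins:

```cpp
#include <map>
#include <string>

// Empty stand-ins for the real LLVM classes.
struct Type {};
struct Value {};

// Two-level layout: first pick the type plane, then look up the name in it.
typedef std::map<std::string, Value*> ValueMap;
typedef std::map<const Type*, ValueMap> PlaneMap;

// Lookup when the type is known: two map searches per query.
Value *lookup(const PlaneMap &Planes, const Type *Ty,
              const std::string &Name) {
  PlaneMap::const_iterator PI = Planes.find(Ty);
  if (PI == Planes.end())
    return nullptr;
  ValueMap::const_iterator VI = PI->second.find(Name);
  return VI == PI->second.end() ? nullptr : VI->second;
}

// Lookup when the type is NOT known (the getNamedFunction situation):
// every plane has to be visited until one of them contains the name.
Value *lookupAnyType(const PlaneMap &Planes, const std::string &Name) {
  for (PlaneMap::const_iterator PI = Planes.begin(), E = Planes.end();
       PI != E; ++PI) {
    ValueMap::const_iterator VI = PI->second.find(Name);
    if (VI != PI->second.end())
      return VI->second;
  }
  return nullptr;
}
```

The second function is the shape of the cost being complained about: its
running time grows with the number of type planes, whereas a flat map keyed by
name would need a single search.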

Something that is important to point out is that LLVM is not serving the
front-end at all by having type-specific symbol table planes at the global
scope.  In particular, even a language that wants to allow overloading based on
type, for example, will have to implement name mangling itself: the LLVM types
for a particular method are not necessarily going to be distinct in the same
cases the source-level types would be (e.g. due to structural equivalence).
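
A toy model of the structural equivalence point.  Everything here is
hypothetical: the source-level type names, the lower() helper, and the
TypePlaneTable class are invented for illustration and type strings stand in
for real LLVM types:

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical lowering of source-level types to structural type strings.
// Meters and Seconds are distinct in the source language but have the same
// structure, so they lower to the same type.  Throws std::out_of_range for a
// source type that is not in the table.
std::string lower(const std::string &SourceType) {
  static const std::map<std::string, std::string> Lowering = {
    {"Meters", "{ double }"},
    {"Seconds", "{ double }"},
    {"Count", "{ int }"},
  };
  return Lowering.at(SourceType);
}

// Toy type-plane table keyed by (lowered parameter type, function name).
class TypePlaneTable {
  std::set<std::pair<std::string, std::string> > Entries;

public:
  // Returns false if an entry with the same lowered type and name exists.
  bool insert(const std::string &SourceParamType, const std::string &Name) {
    return Entries.insert(std::make_pair(lower(SourceParamType), Name)).second;
  }
};
```

In this model, inserting f(Meters) and then f(Seconds) collides even though the
front-end considers them different overloads, while f(Count) does not.  The
front-end therefore has to mangle the names itself whether or not the symbol
table is type-sensitive.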

So anyway, here is the proposal:

1. Get rid of the type-sensitivity in the symbol table.
2. Split the SymbolTable class into ModuleSymbolTable and FunctionSymbolTable.
3. Drop type-tracking from the FunctionSymbolTable.
4. In NDEBUG mode, FunctionSymbolTable would be a noop, simply ignoring the
   names inserted into it (setName on an Instruction, Argument, or BasicBlock
   would be a noop).
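
One way point 4 could be sketched, assuming the switch is done with the
preprocessor.  The insert/lookup method names are hypothetical and Value is an
empty stand-in for the real class:

```cpp
#include <map>
#include <string>

// Empty stand-in for the real LLVM class.
struct Value {};

// Hypothetical function-scope symbol table: no types, and no storage at all
// when NDEBUG is defined.
class FunctionSymbolTable {
#ifndef NDEBUG
  std::map<std::string, Value*> Names;
#endif

public:
  // In NDEBUG builds the name is silently dropped.
  void insert(const std::string &Name, Value *V) {
#ifndef NDEBUG
    Names[Name] = V;
#else
    (void)Name;
    (void)V;
#endif
  }

  // In NDEBUG builds nothing is ever found.
  Value *lookup(const std::string &Name) const {
#ifndef NDEBUG
    std::map<std::string, Value*>::const_iterator I = Names.find(Name);
    return I == Names.end() ? nullptr : I->second;
#else
    (void)Name;
    return nullptr;
#endif
  }
};
```

Callers would have to tolerate lookup returning null for a name they just
inserted, which is only acceptable because names inside a function carry no
meaning.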

Thoughts?

-Chris
