[LLVMbugs] [Bug 388] NEW: Put the Core LLVM classes on a diet

bugzilla-daemon at cs.uiuc.edu bugzilla-daemon at cs.uiuc.edu
Sun Jun 27 13:01:12 PDT 2004


http://llvm.cs.uiuc.edu/bugs/show_bug.cgi?id=388

           Summary: Put the Core LLVM classes on a diet
           Product: libraries
           Version: 1.0
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Core LLVM classes
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: sabre at nondot.org


The Core LLVM IR can, and should, be shrunk a bit.  Here are some ideas:

1. The stuff in Bug 122 should be done.  All of it will help.
2. We currently store the Name string for Value*'s twice: once in the Value
   class, and once in the LLVM symbol table.  Strings can be large, so they
   should be shared.

   I think that the best way to do this is to make the Value class own the
   string (which is needed for when the Value is not inserted into a
   program, such as when it's just created), and the SymbolTable class should
   point to the copy in the Value object.
3. Most of the functionality (COW, editing support, etc.) provided by
   std::string is completely unused by LLVM names.  It would be an
   interesting experiment to change the Value class to use a lighter-weight
   string class (one that only stores the length and data, in a Pascal-style
   way) and only has get/set methods.
4. The Type class needs to be shrunk a bit.  Most of this is in Bug 122 (which
   will take ~24 bytes out of it), but after that, we should get rid of the
   "UniqueId" word in the Type class.   It is only used by a few clients, and
   they really should be using a private numbering anyway.
5. The Operands list of the User class is much heavier weight than we really
   need.  For those who are unfamiliar, we have a std::vector<Use> that
   represents the operands.  The bad thing about this is that vector has 12
   bytes of overhead above and beyond the data that it stores, as well as any
   space that is allocated but not used (because the vector might grow).  The
   *ONLY* user that this makes any sense at all for is the PHINode class.  All
   others have fixed arity when they are constructed.

   There are three bad things about this.  1) This wastes a ton of space: 12
   bytes for most of the instructions and constants.  2) For objects with
   operands, each instruction is really two memory allocations: one for the
   object and one for the operands list.  3) As a consequence of #2, accessing
   the operands (e.g. I->getOperand(2)) requires two memory dereferences: one
   to get to the operands list object, and one to load the operand.  It would
   be "nice" if this were only one load.

   The obvious fix for this that I'm hinting at is to just do one memory
   allocation of size "sizeof(Instruction)+NumOperands*sizeof(Use)".  The
   only technical detail with this is that PHI nodes can vary their #operands
   over their lifetime.  The solution to this could be to 1) make getOperand
   virtual (bad idea), 2) replace the vector with a pointer to the operands
   (which would be at the end of the instruction in all cases but the PHI node
   case), or 3) take the PHI operands out of the Operands list (this is bad
   because it will screw all of the LLVM clients up and expose a horrible
   interface to the clients).

   I'm currently leaning towards solution #2.  The other detail is that all of
   those 'new LoadInst' calls will have to change to 'LoadInst::create' or
   something, to allow us to call malloc or operator new with the correct size.
   The BinaryOperator class sets a precedent for this, but it would be some
   (mechanical) work to switch the rest of the clients over.
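
As a rough illustration of the single-allocation idea in item 5 (a pointer to
operands that live at the end of the object), here is a minimal sketch in
modern C++.  The names Inst, Use, Value, create, destroy and
operandsAreInline are hypothetical stand-ins, not the actual LLVM classes or
API.  A PHI-like node would instead point Operands at a separately allocated,
growable array, which is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-ins for illustration; not the real LLVM classes.
struct Value { int Id; };
struct Use { Value *Val; };

class Inst {
  unsigned NumOperands;
  Use *Operands;  // points just past this object in the fixed-arity case

  // Private: clients must go through create() so the size is known up front.
  Inst(unsigned N, Use *Ops) : NumOperands(N), Operands(Ops) {}

public:
  // Replaces 'new Inst(...)': one allocation of
  // sizeof(Inst) + N * sizeof(Use) holds the object and its operands.
  static Inst *create(unsigned N) {
    static_assert(alignof(Inst) % alignof(Use) == 0,
                  "trailing Use array must be suitably aligned");
    void *Mem = std::malloc(sizeof(Inst) + N * sizeof(Use));
    if (!Mem) throw std::bad_alloc();
    Use *Ops = reinterpret_cast<Use *>(static_cast<char *>(Mem) + sizeof(Inst));
    for (unsigned i = 0; i != N; ++i)
      new (Ops + i) Use{nullptr};  // construct each operand slot in place
    return new (Mem) Inst(N, Ops);
  }

  // Matches create(): run the destructor, then release the single block.
  static void destroy(Inst *I) {
    I->~Inst();
    std::free(I);
  }

  unsigned getNumOperands() const { return NumOperands; }

  Value *getOperand(unsigned i) const {
    assert(i < NumOperands && "operand index out of range");
    return Operands[i].Val;
  }

  void setOperand(unsigned i, Value *V) {
    assert(i < NumOperands && "operand index out of range");
    Operands[i].Val = V;
  }

  // True when the operands sit directly after the object in memory.
  bool operandsAreInline() const {
    return reinterpret_cast<const char *>(Operands) ==
           reinterpret_cast<const char *>(this) + sizeof(Inst);
  }
};
```

In this sketch the per-object overhead is one pointer plus a count rather than
a whole vector, and the object and its operands come from the same memory
block.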

For reference, here are the sizeof values (in bytes) of some common LLVM classes:

Type:         56    StructType:     72
Value:        24
BasicBlock:   48    Argument:       24
Function:     92    GlobalVariable: 56
ConstantSInt: 44
Instruction:  48

Obviously the size of Instruction is much more important than the size of a
Function, for example, since a program contains far more instructions than
functions.  Note that these sizes do not include any indirectly allocated
data, such as operands, the Name, etc.
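
Since the Name is one of those indirectly allocated pieces, here is a minimal
sketch of what the lighter-weight string from item 3 might look like: a
single heap block laid out as [length][characters], with only get/set style
access.  The class NameStr and its methods are hypothetical, written in
modern C++ for illustration only; nothing like this exists in LLVM as far as
this report says.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical name holder: one heap block laid out as [length][chars].
// The object itself is a single pointer; an empty name allocates nothing.
class NameStr {
  char *Block = nullptr;  // null means "no name"

public:
  NameStr() = default;
  NameStr(const NameStr &) = delete;             // no copying in this sketch
  NameStr &operator=(const NameStr &) = delete;
  ~NameStr() { delete[] Block; }

  // Replace the stored name with a copy of Data[0..Len).
  void set(const char *Data, std::size_t Len) {
    char *NewBlock = nullptr;
    if (Len != 0) {
      NewBlock = new char[sizeof(std::size_t) + Len];
      std::memcpy(NewBlock, &Len, sizeof(std::size_t));         // length prefix
      std::memcpy(NewBlock + sizeof(std::size_t), Data, Len);   // characters
    }
    delete[] Block;  // freed only after the copy, so Data may alias our block
    Block = NewBlock;
  }

  // Number of characters in the name (0 when there is no name).
  std::size_t size() const {
    std::size_t Len = 0;
    if (Block) std::memcpy(&Len, Block, sizeof(std::size_t));
    return Len;
  }

  // Pointer to the characters; not null-terminated when a name is stored.
  const char *data() const {
    return Block ? Block + sizeof(std::size_t) : "";
  }

  // Convenience accessor returning a std::string copy of the name.
  std::string get() const { return std::string(data(), size()); }
};
```

The point of the sketch is only that the per-Value cost is one pointer, with
the length stored alongside the data instead of in the object.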

-Chris


