[LLVMbugs] [Bug 388] NEW: Put the Core LLVM classes on a diet
bugzilla-daemon at cs.uiuc.edu
Sun Jun 27 13:01:12 PDT 2004
Summary: Put the Core LLVM classes on a diet
Component: Core LLVM classes
AssignedTo: unassignedbugs at nondot.org
ReportedBy: sabre at nondot.org
The Core LLVM IR can, and should, be shrunk a bit. Here are some ideas:
1. The stuff in Bug 122 should be done. All of it will help.
2. We currently store the Name string of each Value twice: once in the Value
class, and once in the LLVM symbol table. Strings can be large, so they
should be shared.
I think that the best way to do this is to make the Value class own the
string (which is needed for when the Value is not inserted into a
program, such as when it's just created), and the SymbolTable class should
point to the copy in the Value object.
3. Most of the functionality (COW, editing support, etc.) provided by
std::string is completely unused by LLVM names. It would be an
interesting experiment to change the Value class to use a lighter-weight
string class (one that stores only the length and data, Pascal-style)
and has only get/set methods.
4. The Type class needs to be shrunk a bit. Most of this is in Bug 122 (which
will take ~24 bytes out of it), but after that, we should get rid of the
"UniqueId" word in the Type class. It is only used by a few clients, and
they really should be using a private numbering anyway.
5. The Operands list of the User class is much heavier weight than we really
need. For those who are unfamiliar, we have an std::vector<Use> that
represents the operands. The bad thing about this is that vector has 12
bytes of overhead above and beyond the data that it stores, as well as any
space that is allocated but not used (because the vector might grow). The
*ONLY* user that this makes any sense at all for is the PHINode class. All
others have fixed arity when they are constructed.
There are three bad things about this. 1) This wastes a ton of space: 12
bytes for most of the instructions and constants. 2) For objects with
operands, each instruction is really two memory allocations: one for the
object and one for its operands list. 3) As a consequence of #2, accessing
the operands (so I->getOperand(2)) requires two memory dereferences: one
to get to the operands list object, and one to load the operand. It would
be "nice" if this were only one load.
The obvious fix for this that I'm hinting at is to just do one memory
allocation of size "sizeof(Instruction)+NumOperands*sizeof(Use)". The
only technical detail with this is that PHI nodes can vary their #operands
over their lifetime. The solution to this could be to make getOperand
virtual (bad idea), replace the vector with a pointer to the operands (which
would be at the end of the instruction in all cases but the PHI node case),
or take the PHI operands out of the Operands list (this is bad because it
will screw all of the LLVM clients up and expose a horrible itf to the
I'm currently leaning towards solution #2. The other detail is that all of
those 'new LoadInst' calls will have to change to 'LoadInst::create' or
something, to allow us to call malloc or operator new with the correct size.
The BinaryOperator class sets a precedent for this, but it would be some
(mechanical) work to switch the rest of the clients over.
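To make solution #2 concrete, here is a minimal sketch of the single-allocation
layout, assuming a factory function in place of a plain 'new'. The Value, Use,
and Inst types below are simplified, hypothetical stand-ins rather than the
real LLVM classes, and the names create/destroy are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-ins for illustration only; not the real LLVM classes.
struct Value { int Id; };
struct Use { Value *Val = nullptr; };

class Inst {
  Use *Operands;        // points just past this object in the fixed-arity case
  unsigned NumOperands;

  Inst(Use *Ops, unsigned N) : Operands(Ops), NumOperands(N) {}

public:
  Inst(const Inst &) = delete;
  Inst &operator=(const Inst &) = delete;

  // One allocation of size sizeof(Inst) + N * sizeof(Use).
  static Inst *create(unsigned N) {
    // The trailing Use array starts at offset sizeof(Inst), so that offset
    // must be suitably aligned for Use.
    static_assert(sizeof(Inst) % alignof(Use) == 0,
                  "trailing operands would be misaligned");
    void *Mem = ::operator new(sizeof(Inst) + N * sizeof(Use));
    Use *Ops = reinterpret_cast<Use *>(static_cast<char *>(Mem) + sizeof(Inst));
    for (unsigned i = 0; i != N; ++i)
      new (Ops + i) Use();          // construct each operand slot in place
    return new (Mem) Inst(Ops, N);
  }

  // Use is trivially destructible here, so only the Inst itself is destroyed
  // before the single block is released.
  static void destroy(Inst *I) {
    I->~Inst();
    ::operator delete(I);
  }

  unsigned getNumOperands() const { return NumOperands; }

  Value *getOperand(unsigned i) const {
    assert(i < NumOperands && "operand index out of range");
    return Operands[i].Val;
  }

  void setOperand(unsigned i, Value *V) {
    assert(i < NumOperands && "operand index out of range");
    Operands[i].Val = V;
  }

  // True when the operands live in the same block as the object.
  bool operandsAreInline() const {
    return reinterpret_cast<const char *>(Operands) ==
           reinterpret_cast<const char *>(this) + sizeof(Inst);
  }
};
```

In this sketch a PHI-like class would be the one exception: it would point
Operands at a separately allocated, growable array instead of the trailing
storage, while getOperand stays non-virtual for everyone.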
For reference, here are the sizes (sizeof, in bytes) of some common LLVM
classes:
  Type:        56    StructType:      72
  BasicBlock:  48    Argument:        24
  Function:    92    GlobalVariable:  56
Obviously the size of Instruction is much more important than the size of a
Function, for example. Note that this does not include any indirectly allocated
data, such as operands, the Name, etc.
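Since the Name is one of those indirect allocations, here is a rough sketch of
the lighter-weight string from idea #3: a single heap block holding the length
followed by the characters, with only get/set access. NameString and its
members are hypothetical names chosen for illustration, not an existing LLVM
class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>
#include <string>

// Hypothetical sketch; not an actual LLVM class.
class NameString {
  // One heap block laid out as [size_t length][characters...].
  // Null when the name is empty, so unnamed values cost one pointer.
  char *Rep = nullptr;

public:
  NameString() = default;
  NameString(const NameString &) = delete;
  NameString &operator=(const NameString &) = delete;
  ~NameString() { std::free(Rep); }

  // Replace the stored name with a copy of Data[0..Len).
  void set(const char *Data, std::size_t Len) {
    assert((Data || Len == 0) && "null data with nonzero length");
    char *NewRep = nullptr;
    if (Len != 0) {
      NewRep = static_cast<char *>(std::malloc(sizeof(std::size_t) + Len));
      if (!NewRep)
        throw std::bad_alloc();
      std::memcpy(NewRep, &Len, sizeof Len);        // length prefix
      std::memcpy(NewRep + sizeof Len, Data, Len);  // character data
    }
    std::free(Rep);  // freed after the copy, so Data may alias the old name
    Rep = NewRep;
  }

  void set(const std::string &S) { set(S.data(), S.size()); }

  std::size_t size() const {
    std::size_t Len = 0;
    if (Rep)
      std::memcpy(&Len, Rep, sizeof Len);
    return Len;
  }

  // Not null-terminated when non-empty; pair it with size().
  const char *data() const { return Rep ? Rep + sizeof(std::size_t) : ""; }

  std::string get() const { return std::string(data(), size()); }
};
```

With this layout the object embedded in each Value is a single pointer, and
there is no capacity slack because the block is sized exactly on every set.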