[LLVMdev] PATCH: Use size reduction -- wave2
gohman at apple.com
Wed Apr 16 11:25:22 PDT 2008
On Apr 16, 2008, at 2:50 AM, heisenbug wrote:
> And now here is my educated speculation:
> There are 2 things that became slower
> 1) Use::getUser()
> 2) Use::get/set due to tagging.
> The former is seldom called:
> $ find lib -name "*.cpp" | xargs grep "getUser(" | wc -l
The majority of those aren't actually Use::getUser, but on the
other hand this grep misses all the users of
value_use_iterator::operator*, which is much more popular.
Unfortunately, overloaded operators are not the easiest to
grep for ;-).
> The second is counterbalanced by faster access to the Use object
> in most cases:
> With exception of PHINode and SwitchInst, the getOperand() function
> (if called on a specialized "this" pointer) does a "this"-relative
> access instead of getting OperandList pointer first and going thru
> that. This was the case before for fixed-arity instructions, but now
> it applies to variadic ones too that cannot perform operand list
> Some things got faster, like
> 1) getOperand() access on (say) CallInst now fetches operands
> relative to "this". Also, there are no "new Use[n]" allocations (and
> deletes) for these any more.
This is fine, but technically it's an independent change that could
be done without the getUser algorithm changes and tagging.
> 2) no need to maintain Use::U
> 3) data-cache related gains on smaller Use objects (not yet activated)
> So, my idea is that these changes are performance neutral.
> There are potentially more speedups to be realized:
> - indexing operands in the decreasing direction eliminates the
> getNumOperands() call in getOperand() for variadic arity instructions.
At a glance, the only getOperand() that I saw that uses getNumOperands()
is ReturnInst's, and that actually looks like a bug.
> - shaving off (unneeded) operand related administrative members from
> I hope that this is interesting, but I'd like to ask anybody who is
> comfortable with performance testing to help provide some hard
> data :-)
I agree that performance testing is not easy and requires resources,
but I'm curious what's motivating this project here. My assumption
has been that no one would likely go to the extreme of inventing a
mini-language encoded in the least significant bits of pointer
members in successive structs unless they were pretty desperate for
performance or scalability.