[PATCH] MCObjectSymbolizer

Ahmed Bougacha ahmed.bougacha at gmail.com
Mon Oct 14 17:16:57 PDT 2013


Thanks for the welcome change, LGTM!

On Tue, Oct 15, 2013 at 12:49 AM, Stephen Checkoway <s at pahtak.org> wrote:
> MCObjectSymbolizer currently iterates through each symbol every time it is asked to tryAddingSymbolicOperand. With many symbols, this takes a very long time.
>
> The attached patch iterates through the symbols the first time this is needed and puts them in a vector sorted by address. Subsequent lookups use std::upper_bound() to find the symbol in O(log n) time.
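
For readers without the attachment, the idea described above can be sketched roughly as follows. This is only an illustration, not the patch itself: SymbolEntry and SymbolIndex are hypothetical names, and treating a lookup as "the last symbol at or below the address" is an assumption about what the lookup should return.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a symbol-table entry; not an LLVM type.
struct SymbolEntry {
  uint64_t Addr;
  std::string Name;
};

// Hypothetical index: sort the symbols once, then binary-search per lookup.
class SymbolIndex {
  std::vector<SymbolEntry> Sorted;

public:
  // One-time cost: O(n log n) to sort the symbols by address.
  explicit SymbolIndex(std::vector<SymbolEntry> Syms)
      : Sorted(std::move(Syms)) {
    std::stable_sort(Sorted.begin(), Sorted.end(),
                     [](const SymbolEntry &A, const SymbolEntry &B) {
                       return A.Addr < B.Addr;
                     });
  }

  // Assumed semantics: return the symbol with the greatest address that is
  // <= Addr, or nullptr if every symbol starts above Addr. O(log n).
  const SymbolEntry *lookup(uint64_t Addr) const {
    // upper_bound yields the first entry whose address is > Addr.
    auto It = std::upper_bound(Sorted.begin(), Sorted.end(), Addr,
                               [](uint64_t Value, const SymbolEntry &S) {
                                 return Value < S.Addr;
                               });
    if (It == Sorted.begin())
      return nullptr;
    return &*std::prev(It);
  }
};

// Small usage check with made-up addresses and names.
inline void symbolIndexExample() {
  std::vector<SymbolEntry> Syms = {
      {0x2000, "bar"}, {0x1000, "foo"}, {0x3000, "baz"}};
  SymbolIndex Idx(Syms);
  assert(Idx.lookup(0x0fff) == nullptr);
  assert(Idx.lookup(0x1000)->Name == "foo");
  assert(Idx.lookup(0x2abc)->Name == "bar");
}
```

The linear scan costs O(n) per operand, so the sort pays for itself after a handful of lookups, which would be consistent with the timings reported below.
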
>
> Without the patch, calling MCObjectDisassembler::buildModule(/* withCFG */ false) on an unstripped build of Chromium took more than 2 hours (which is when I gave up on it). With the patch, it takes 24 seconds.
>
> My particular build of Chromium has 474,222 symbols (as counted with nm chrome|wc -l) and is 2.1 GB.
>
> It is difficult to test the speedup with llvm-objdump: passing true for withCFG uses more than my available 32 GB of RAM when run on a 7.3 MB file, and on smaller files (e.g., my 3 MB ninja binary) the difference in speed is in the noise. Even on that 3 MB binary, it uses about 6 GB of RAM.

By the way, the excessive memory usage has two main causes (both are
pretty low-hanging fruit, though):
- the MCModule keeps an MCInst for every disassembled instruction. I
have a WIP patch experimenting with optional MCInst uniquing at the
MCContext level, but I'm not sure whether there's a better way.
- it also keeps redundant address/size information (see
include/llvm/MC/MCAtom.h:109). I haven't gotten around to working on
that yet.
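
To make the uniquing idea in the first point concrete, here is a minimal sketch of interning identical instructions so that each distinct one is stored only once. It is not the WIP patch: Inst and InstPool are hypothetical, much simpler than MCInst and MCContext, and a std::set is just one possible container.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <tuple>
#include <vector>

// Hypothetical, simplified instruction; the real MCInst carries more state.
struct Inst {
  unsigned Opcode;
  std::vector<int64_t> Operands;

  // Order by opcode, then lexicographically by operands.
  bool operator<(const Inst &O) const {
    return std::tie(Opcode, Operands) < std::tie(O.Opcode, O.Operands);
  }
};

// Hypothetical pool that hands out one shared object per distinct Inst.
class InstPool {
  std::set<Inst> Pool;

public:
  // Returns a pointer to the pooled copy, inserting it on first use.
  // std::set never moves its elements, so the pointer stays valid for the
  // lifetime of the pool.
  const Inst *intern(const Inst &I) { return &*Pool.insert(I).first; }

  std::size_t size() const { return Pool.size(); }
};

// Small usage check with made-up opcodes and operands.
inline void instPoolExample() {
  InstPool P;
  const Inst *A = P.intern({1, {10, 20}});
  const Inst *B = P.intern({1, {10, 20}}); // identical: shares A's storage
  const Inst *C = P.intern({2, {10}});     // different: new entry
  assert(A == B);
  assert(A != C);
  assert(P.size() == 2);
}
```

Whether this saves memory in practice depends on how often identical instructions recur, which the sketch does not measure.
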

- Ahmed

> --
> Stephen Checkoway
>



