[llvm-dev] Possible Memory Savings for tools emitting large amounts of existing data through MC

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Mon Feb 29 16:36:36 PST 2016


On Mon, Feb 29, 2016 at 4:17 PM, Adrian Prantl <aprantl at apple.com> wrote:

>
> On Feb 29, 2016, at 4:10 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
> On Mon, Feb 29, 2016 at 3:51 PM, Adrian Prantl <aprantl at apple.com> wrote:
>
>>
>> On Feb 29, 2016, at 3:46 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>
>>
>>
>> On Mon, Feb 29, 2016 at 3:36 PM, Adrian Prantl <aprantl at apple.com> wrote:
>>
>>>
>>> On Feb 29, 2016, at 3:18 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>>
>>> Just in case it interests anyone else, I'm playing around with trying to
>>> broaden the MCStreamer API to allow for emission of bytes without copying
>>> the contents into a local buffer first (either because you already have a
>>> buffer, or the bytes are already present in another file, etc) in
>>> http://reviews.llvm.org/D17694 . In theory there's some overlap with
>>> lld here (no doubt it already does this sort of thing, but not in a way, I
>>> assume, we could reuse from other tools at the moment) and my motivation,
>>> llvm-dwp, looks very much like "linking with a few extra steps".
>>>
>>> But to check that these changes might be more generally applicable, I
>>> thought I'd solicit data from anyone building tools that might be memory
>>> constrained as well.
>>>
>>> First that comes to mind (Eric suggested/mentioned) is llvm-dsymutil.
>>>
>>> Adrian/Fred - do you guys ever have trouble with memory usage of
>>> llvm-dsymutil? Do you have an example you could provide that has high
>>> memory usage, so I could see if any simple changes based on my prototype MC
>>> changes would help?
>>>
>>>
>>> Since dsymutil processes object files one after another,
>>>
>>
>> As does llvm-dwp. Think of llvm-dwp more like a linker with a few extra
>> bits. But the MCStreamer API means any bytes you write to the streamer stay
>> in memory until you "Finish" - so if you're dwp/linking large enough
>> inputs, you have them all in memory when you really don't need them. For
>> example, the dwp file I was generating is 7GB, but the tool with the memory
>> improvements only has a high water mark of 2.3GB.
>>
>>
>>> memory usage wasn’t really a problem so far, but you could try running
>>> llvm-dsymutil on bin/clang for a larger example (takes about a minute to
>>> finish).
>>>
>>
>> Was thinking of something more accessible to me, on a non-Darwin
>> platform. Is there a way I can generate the dsym inputs across Clang on a
>> non-Darwin platform? (what happens if I run dsymutil on my ELF object
>> files?)
>>
>>
>> At this point probably nothing. Dsymutil acts on STABS symbol table
>> entries that are (I guess) not present in a typical ELF binary. Dsymutil
>> also only implements Mach-O relocations, and in lots of other places the
>> ELF implementation is missing. It’s probably not too much work to wire
>> all this up, but so far nobody has done it.
>>
>
> & no easy way for me to get a representative (or pathologically large,
> even) set of Mach-O files to play with, I take it? It's no worries - just
> figured I'd give it a go if it was convenient.
>
>
> I can definitely go and grab you a clang build directory from one of the
> green dragon bots for example; but all the paths are hardcoded so you’d
> have to install them in the exact same location. In theory everything doing
> file access should be handled by the LLVM low-level libraries, so this
> *could* work.
>

If you like/have time, feel free to throw them up somewhere I can download
them from.



>
> -- adrian
>
>
>
>>
>> -- adrian
>>
>>
>>> A quick glance at dsymutil's code indicates it might benefit slightly,
>>> at least in the string table emission, for example. It looks very similar
>>> to string table emission in dwp: just being able to reference the strings
>>> in the StringMap rather than copying them into MCStreamer could help. (I
>>> also found that using a DenseMap of StringRefs pointing into the memory
>>> mapped input helped as well, but that's a change you can make locally
>>> without any MCStreamer improvements.) Other parts might be trickier, and
>>> consist of parts of referenceable data (like the line table header) and
>>> parts that are not referenceable (like their contents); my prototype
>>> could be extended to handle that.
>>>
>>>
>>> -- adrian
>>>
>>
>