[llvm-dev] [RFC] Refactor llvm-dwp in to a library.

Alexander Yermolovich via llvm-dev llvm-dev at lists.llvm.org
Tue Jun 22 17:36:33 PDT 2021


Hello David,

Thank you for elaborating.
When you mention compression, do you mean debug info that arrives already compressed, or something else?
Regarding MCStreamer, what would be the alternative? In BOLT it provides a nice level of abstraction for us, since we output a new, updated binary and, in the debug fission case, write out .dwo files.

In general, the usage model for BOLT is in some ways similar to llvm-dwp, except that we don't really deal with compressed debug information. Some sections are passed through unchanged, while others are either modified (.debug_info) or completely rewritten (.debug_loc). For comparison, llvm-dwp rewrites the .debug_str_offsets and .debug_str sections. BOLT modifies or replaces much more data before writing it out, so I am not sure that pure in/out performance is as critical for us at the moment.
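
To make the string-section rewrite concrete, here is a minimal, self-contained sketch of merging string tables and remapping offsets. It is not the llvm-dwp implementation: StringPool and remapStringOffsets are hypothetical names, and plain std::string buffers stand in for the real section contents.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical merged string pool: deduplicates strings and hands out
// offsets into one output buffer laid out like a .debug_str section.
class StringPool {
public:
  // Returns the offset of Str in the merged pool, appending it if new.
  uint32_t getOffset(const std::string &Str) {
    auto It = Offsets.find(Str);
    if (It != Offsets.end())
      return It->second;
    uint32_t Offset = static_cast<uint32_t>(Data.size());
    Data.append(Str);
    Data.push_back('\0'); // strings are stored NUL-terminated
    Offsets.emplace(Str, Offset);
    return Offset;
  }

  const std::string &data() const { return Data; }

private:
  std::string Data;
  std::unordered_map<std::string, uint32_t> Offsets;
};

// Rewrites one input's string-offsets table so that every entry points
// into the merged pool instead of the input's own string section.
std::vector<uint32_t> remapStringOffsets(const std::string &InputStrs,
                                         const std::vector<uint32_t> &InputOffsets,
                                         StringPool &Pool) {
  std::vector<uint32_t> Result;
  Result.reserve(InputOffsets.size());
  for (uint32_t Off : InputOffsets) {
    assert(Off < InputStrs.size() && "string offset out of range");
    // Reads from Off up to the next NUL terminator.
    Result.push_back(Pool.getOffset(std::string(InputStrs.c_str() + Off)));
  }
  return Result;
}
```

Each input's offsets are rewritten against the shared pool, so a string that appears in several inputs is stored once in the output.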

I took an initial step of factoring the llvm-dwp code out into its own library to see what it would look like. What I ended up with is a few APIs that take in an MCStreamer, while all the code for dealing with it is in the main function of llvm-dwp.
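
The split described above can be sketched roughly as follows. This is only an illustration of the shape, not the actual patch: OutputStreamer is a hypothetical stand-in for MCStreamer, and writeSection and RecordingStreamer are made-up names.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the streamer abstraction handed to the library.
class OutputStreamer {
public:
  virtual ~OutputStreamer() = default;
  virtual void switchSection(const std::string &Name) = 0;
  virtual void emitBytes(const std::string &Bytes) = 0;
};

// Library side: works only through the interface and does not know how
// the streamer was created or where the bytes end up.
void writeSection(OutputStreamer &Out, const std::string &Name,
                  const std::string &Contents) {
  Out.switchSection(Name);
  Out.emitBytes(Contents);
}

// Tool side: the caller (llvm-dwp's main, or BOLT) owns the concrete
// streamer. This one just records what was emitted, per section.
class RecordingStreamer : public OutputStreamer {
public:
  void switchSection(const std::string &Name) override {
    Sections.emplace_back(Name, std::string());
  }
  void emitBytes(const std::string &Bytes) override {
    if (Sections.empty())
      Sections.emplace_back(std::string(), std::string());
    Sections.back().second += Bytes;
  }

  std::vector<std::pair<std::string, std::string>> Sections;
};
```

With this shape, a different client only needs to supply its own streamer object; the library functions stay unchanged.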

With all of this said, and given the BOLT usage model, I think dealing with the MCStreamer issue can be deferred until after refactoring into a library and adding the functionality to BOLT.

Alex
________________________________
From: David Blaikie <dblaikie at gmail.com>
Sent: Monday, June 21, 2021 6:41 PM
To: Alexander Yermolovich <ayermolo at fb.com>
Cc: llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>; Maksim Panchenko <maks at fb.com>
Subject: Re: [RFC] Refactor llvm-dwp in to a library.

On Mon, Jun 21, 2021 at 6:28 PM Alexander Yermolovich <ayermolo at fb.com> wrote:
Hello David

I haven't dug into llvm-dwp performance. What are some of the performance pain points that you know about?

Yeah - using LLVM's higher-level abstractions for writing object files (MCStreamer et al.) means that, as far as I recall, all the output ends up buffered in memory before being written out. Ideally it would be streamed (memcpy to/from memory-mapped files) from input file to output file, potentially through streamed compression/decompression where possible too. That is another layer of the MCStreamer abstractions that can add cost. I don't think I implemented support for compressing output in llvm-dwp, though it would be trivial to add because MCStreamer already supports it; that support does, however, buffer the whole uncompressed and compressed data. Maybe some other things, but that's certainly the top of my list.

- Dave


Thank You
Alex
________________________________
From: Alexander Yermolovich
Sent: Monday, June 21, 2021 6:11 PM
To: llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>
Cc: dblaikie at gmail.com <dblaikie at gmail.com>; Maksim Panchenko <maks at fb.com>
Subject: [RFC] Refactor llvm-dwp in to a library.

Hello

I am working on adding support to BOLT (https://github.com/facebookincubator/BOLT/tree/rebased) for writing out DWP directly. I want to reuse as much llvm-dwp functionality as possible.
The plan is to move most of the functionality that is now in llvm-dwp into llvm/lib/DWP, with a corresponding header file in llvm/include/llvm/DWP.
The header file would declare:
getContributionIndex
handleSection
parseCompileUnitHeader
writeStringsAndOffsets
getCUIdentifiers
buildDuplicateError
writeIndex

Structs that are passed around would also be defined in the header:
UnitIndexEntry
CompileUnitHeader
CompileUnitIdentifiers
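
To give a feel for how a few of these pieces might fit together, here is a simplified, self-contained sketch. The struct fields, the signatures, and the addUnit helper are illustrative guesses that use only standard types; they are not the actual llvm-dwp declarations.

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <utility>

namespace dwp_sketch {

// Simplified stand-in: identifies one compile unit read from an input.
struct CompileUnitIdentifiers {
  uint64_t Signature; // the unit's DWO ID
  std::string Name;
  std::string DWOName;
};

// Simplified stand-in: one row of the output index.
struct UnitIndexEntry {
  std::string Name;
  std::string DWOName;
};

// Builds the message reported when two units share the same DWO ID.
std::string buildDuplicateError(uint64_t Signature, const UnitIndexEntry &Prev,
                                const CompileUnitIdentifiers &ID) {
  std::ostringstream OS;
  OS << "duplicate DWO ID (0x" << std::hex << Signature << ") in '"
     << Prev.DWOName << "' and '" << ID.DWOName << "'";
  return OS.str();
}

// Hypothetical helper: adds a unit to the index keyed by DWO ID. Returns
// false and fills Err if a unit with the same ID is already present.
bool addUnit(std::map<uint64_t, UnitIndexEntry> &Index,
             const CompileUnitIdentifiers &ID, std::string &Err) {
  UnitIndexEntry Entry = {ID.Name, ID.DWOName};
  auto Result = Index.insert(std::make_pair(ID.Signature, Entry));
  if (!Result.second) {
    Err = buildDuplicateError(ID.Signature, Result.first->second, ID);
    return false;
  }
  return true;
}

} // namespace dwp_sketch
```

A client such as BOLT would call into functions like these while walking its units, and would decide for itself how to surface the error.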


Thought I would solicit opinions before I dive too deep into this.

Thank You
Alex