[llvm-dev] End-to-end -fembed-bitcode .llvmbc and .llvmcmd

Sean Bartell via llvm-dev llvm-dev at lists.llvm.org
Sat Aug 29 19:22:00 PDT 2020


On Fri, Aug 28, 2020, at 16:31, Mircea Trofin via llvm-dev wrote:
> 
> 
> On Fri, Aug 28, 2020 at 2:16 PM Fangrui Song <maskray at google.com> wrote:
>> On 2020-08-28, Mircea Trofin via llvm-dev wrote:
>> >On Fri, Aug 28, 2020 at 11:22 AM David Blaikie <dblaikie at gmail.com> wrote:
>> >
>> >> So maybe the goal/desire is to have a different semantic, rather than the
>> >> equivalent semantic being different on ELF compared to MachO.
>> >>
>> >> So if it's a different semantic - yeah, I'd guess a flag that prefixes the
>> >> module metadata with a length would make sense, then it can be linked
>> >> naturally on any platform. (if the "don't link these sections" support on
>> >> Darwin is done by the linker hardcoding the section name - then maybe this
>> >> flag would also put the data in a different section that isn't linker
>> >> stripped on Darwin, so users interested in getting everything linked
>> >> together can do so on any platform)
>> >>
>> >> But if this data is linked, then it'd be hard to know which command line
>> >> goes with which module, yes? So maybe it'd make sense then to have the
>> >> command line as a header before the module, in the same section. So they're
>> >> kept together.
>> >>
>> >This last point was my follow-up :)
>> 
>> A module has a source_filename field.
>> 
>> clang -fembed-bitcode=all -c d/a.c
>> llvm-objcopy --dump-section=.llvmbc=a.bc a.o /dev/null
>> llvm-dis < a.bc => source_filename = "d/a.c"
>> 
>> The missing piece is a mechanism to extract a module from concatenated
>> bitcode (llvm-dis supports multi-module bitcode but not concatenated
>> bitcode https://reviews.llvm.org/D70153). I'll be happy to look into it:)
>> 
>> ---
>> 
>> .llvmcmd may need the source file to be more useful.
> Right - I think, for the non-Darwin concatenated case, all three of us (David, you, and I) are thinking along the lines of keeping together: the module name, the bitcode, and the command line - effectively not using .llvmcmd, and being able to correctly extract, by design, the rest of the information.

Here's the format I would suggest:

1. Put command-line flags in the module metadata instead of .llvmcmd.
2. Put each module in the bitcode wrapper supported by SkipBitcodeWrapperHeader, which includes a length field. I think LLVM only generates the wrapper for Darwin, but it can read the wrapper correctly on all platforms.
3. Change the .llvmbc section alignment so that no extra zeros are added between modules.
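
To make points 2 and 3 concrete, here is a rough sketch of how a tool might split a linked .llvmbc section back into modules if every module carried the wrapper. It assumes the wrapper layout described in the LLVM bitcode format documentation (five little-endian 32-bit fields: magic 0x0B17C0DE, version, offset, size, cputype), that the offset is counted from the start of the wrapper, and that modules sit back to back with no padding. The names wrap and split_wrapped are made up for illustration and are not LLVM APIs.

```python
import struct
from typing import List

# Wrapper layout as described in the LLVM bitcode format docs:
# five little-endian uint32 fields: magic, version, offset, size, cputype.
WRAPPER_MAGIC = 0x0B17C0DE
HEADER = struct.Struct("<5I")


def wrap(bitcode: bytes, cputype: int = 0) -> bytes:
    """Hypothetical helper: prepend a wrapper header to one module."""
    return HEADER.pack(WRAPPER_MAGIC, 0, HEADER.size, len(bitcode), cputype) + bitcode


def split_wrapped(section: bytes) -> List[bytes]:
    """Split a section made of back-to-back wrapped modules.

    Assumes no padding between modules (point 3 above) and that the
    offset field is relative to the start of each wrapper.
    """
    modules = []
    pos = 0
    while pos < len(section):
        if len(section) - pos < HEADER.size:
            raise ValueError("truncated wrapper header at offset %d" % pos)
        magic, _version, offset, size, _cputype = HEADER.unpack_from(section, pos)
        if magic != WRAPPER_MAGIC:
            raise ValueError("no wrapper magic at offset %d" % pos)
        start = pos + offset
        end = start + size
        if offset < HEADER.size or end > len(section):
            raise ValueError("wrapper at offset %d points outside the section" % pos)
        modules.append(section[start:end])
        pos = end  # the next wrapper starts right after this module
    return modules
```

The input would be something like the a.bc file produced by the llvm-objcopy --dump-section command quoted above, run on the linked binary instead of a.o. If the section alignment still inserted zeros between modules, the magic check would fail at the first padded boundary, which is why point 3 matters.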

My use case: I'm using -fembed-bitcode on Linux as an alternative to the wllvm/whole-program-llvm tool. For my purposes, it'd be nice to also keep track of linker flags and other linker input files, but I can get most of what I need from the modules alone.

Sean

