[llvm-dev] [RFC] Embedded bitcode and related upstream (Part II)

Jonas Devlieghere via llvm-dev llvm-dev at lists.llvm.org
Mon Jul 25 03:24:01 PDT 2016


Hi,

I hope I'm not breaking any mailing list etiquette by replying to this
mail, but if I am then please accept my apologies.

On Fri, Jun 3, 2016 at 8:36 PM, Steven Wu via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Hi everyone
>
> I am still in the process of upstreaming some improvements to the embed
> bitcode option. If you want more background, you can read the previous RFC
> (http://lists.llvm.org/pipermail/llvm-dev/2016-February/094851.html). This
> is part II of the discussion.
>
> Current Status:
> A basic version of the -fembed-bitcode option is upstreamed and functioning.
> You can use the -fembed-bitcode={off, all, bitcode, marker} option to control
> what gets embedded in the final object file output:
> off: default, nothing gets embedded.
> all: optimized bitcode and command line options get embedded in the object
> file.
> bitcode: only optimized bitcode is embedded
> marker: only put a marker in the object file
>
> What needs to be improved:
> 1. Whitelist for command line options that can be used with bitcode:
> Current trunk implementation embeds all the cc1 command line options (that
> includes header include paths, warning flags and other front-end options) in
> the command line section. That is a lot of redundant information. To
> re-create the object file from the embedded optimized bitcode, most of these
> options are useless. On the other hand, they can leak information about the
> source code. One solution would be to keep a list of all the options that
> can affect code generation but are not encoded in the bitcode. I have
> internally prototyped this by disallowing these options explicitly and
> allowing only the remainder of the options to be embedded
> (http://reviews.llvm.org/D17394). A better solution might be encoding that
> information in "Options.td" as a specific group.
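
If I understand the list-based idea correctly, the filtering step could look
roughly like the sketch below. This is only an illustration and not what
D17394 does: the function name is made up, and which options belong on the
list is purely my assumption. The real list would need careful review.

```python
# Hypothetical allow-list of cc1 options that affect code generation.
# The actual contents are an assumption made for illustration only.
ALLOWED_FLAGS = {"-O0", "-O1", "-O2", "-O3", "-Os", "-Oz"}
ALLOWED_WITH_VALUE = {"-triple", "-target-cpu", "-mrelocation-model"}


def filter_cc1_args(args):
    """Keep only allow-listed options; drop everything else."""
    kept = []
    it = iter(args)
    for arg in it:
        if arg in ALLOWED_FLAGS:
            kept.append(arg)
        elif arg in ALLOWED_WITH_VALUE:
            # These options take their value as the next argument.
            value = next(it, None)
            if value is not None:
                kept.extend([arg, value])
        # Anything else (include paths, warning flags, and their
        # values) is silently dropped and never embedded.
    return kept
```

With this, something like "-I /some/private/path" or "-Wall" would not end up
in the command line section, while "-triple" and "-O2" would survive.
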
>
> 2. Assembly input handling:
> This is a workaround to allow source code written in assembly to work with
> the "-fembed-bitcode" option. When compiling assembly source code with
> "-fembed-bitcode", clang-as creates an empty section "__LLVM, __asm" in the
> object file. That is just a way to distinguish object files compiled from
> assembly source from those compiled from higher-level source code without
> the "-fembed-bitcode" option. The linker can use this section to diagnose
> whether "-fembed-bitcode" is used consistently on all the object files
> participating in the link.
>
> 3. Bitcode symbol hiding:
> There were some concerns about leaking source code information when using
> the bitcode feature. One approach to avoid the leak is to add a pass which
> renames all the globals and metadata strings. The pass also keeps a reverse
> map in case the original name needs to be recovered. The final bitcode should
> contain no more symbols or debug info than a stripped binary. To make sure
> modified bitcode can still be linked correctly, the renaming needs to be
> consistent across all bitcode participating in the link, and everything
> that is external to the linkage unit needs to be preserved. This means the
> pass can only be run during linking and requires some LTO API.

Regarding the symbol map, are you planning to upstream a pass that
restores the symbols? I have been trying to do this myself in order to
reverse the "BCSymbolMap". However, this turned out to be less
straightforward than I'd hoped. Any info on this would be greatly
appreciated!
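
To make the question concrete, this is roughly the kind of lookup I have in
mind. It is only a sketch under assumptions on my side: that the map is a
text file with an optional "BCSymbolMap Version:" header followed by one
original name per line, and that obfuscated names look like "__hidden#N_"
where N is a zero-based index into that list. It also just rewrites text
(e.g. disassembled IR or a symbol listing), so it is not a real pass.

```python
import re

# Assumed shape of an obfuscated name: __hidden#<index>_
HIDDEN = re.compile(r"__hidden#(\d+)_")


def load_symbol_map(lines):
    """Return the original names, skipping an optional version header."""
    names = [line.rstrip("\n") for line in lines]
    if names and names[0].startswith("BCSymbolMap Version:"):
        names = names[1:]
    return names


def restore_names(text, names):
    """Replace every hidden name in text with its entry from names."""
    def lookup(match):
        index = int(match.group(1))
        # Leave the token untouched if the map has no entry for it.
        return names[index] if index < len(names) else match.group(0)
    return HIDDEN.sub(lookup, text)
```

The part I am unsure about is doing the equivalent on the bitcode itself,
including the metadata strings, rather than on text.
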

> 4. Debug info strip to line-tables pass:
> As the name suggests, this pass strips the full debug info down to
> line tables only. This is also one of the steps we took to prevent the leak
> of source code information in bitcode.
>
> Please let me know what you think about the pieces above or if you have
> any concerns about the methodology. I will put up patches for review soon.
>
> Thanks
>
> Steven
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Cheers,
Jonas

