[llvm-dev] [RFC] Embedded bitcode and related upstream (Part II)
Eric Christopher via llvm-dev
llvm-dev at lists.llvm.org
Sun Jun 12 23:44:05 PDT 2016
Great to see the commentary and updates here. I've got a few questions
about some of this work. It might be nice to see some separate RFCs for a
couple of things, but we'll figure that out after you send out patches.
> What needs to be improved:
> 1. Whitelist for command line options that can be used with bitcode:
> Current trunk implementation embeds all the cc1 command line options (that
> includes header include paths, warning flags and other front-end options)
> in the command line section. That is a lot of redundant information. To
> re-create the object file from the embedded optimized bitcode, most of
> these options are useless. On the other hand, they can leak information
> about the source code. One solution would be keeping a list of all the
> options that can affect code generation but are not encoded in the bitcode.
> I have internally prototyped disallowing these options explicitly and
> allowing only the remainder of the options to be embedded (
> http://reviews.llvm.org/D17394). A better solution might be encoding that
> information in "Options.td" as a specific group.
This is really interesting. I'm not a particularly security-minded person
so I don't have a lot of commentary there. An explicit whitelist sounds a
bit painful to keep maintained; having an explicit group in Options.td
sounds pretty nice, though you'll need to add the options to multiple
groups.
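
To make the group idea concrete, here is a minimal sketch in Python of filtering a cc1-style argument list by group membership. Everything in it is an assumption for illustration: the table, the "bitcode_embed" group name, and the group assignments are hypothetical and are not taken from Options.td or from D17394.

```python
# Hypothetical option table: option -> (groups it belongs to, takes a separate value).
# The "bitcode_embed" group and these assignments are illustrative only.
OPTION_TABLE = {
    "-O2": ({"O_Group", "bitcode_embed"}, False),
    "-mllvm": ({"bitcode_embed"}, True),
    "-I": ({"I_Group"}, True),
    "-Wall": ({"W_Group"}, False),
    "-D": ({"preprocessor"}, True),
}


def filter_embedded_args(args, group="bitcode_embed"):
    """Keep only options (and their values) that belong to `group`.

    Options missing from the table are dropped, so the default is to
    embed nothing that has not been explicitly marked.
    """
    kept = []
    i = 0
    while i < len(args):
        groups, takes_value = OPTION_TABLE.get(args[i], (set(), False))
        # Consume the option's value together with the option itself.
        width = 2 if takes_value and i + 1 < len(args) else 1
        if group in groups:
            kept.extend(args[i:i + width])
        i += width
    return kept
```

Because an option can sit in several groups at once, tagging it for embedding does not disturb whatever groups it already belongs to.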
> 2. Assembly input handling:
> This is a workaround to allow source code written in assembly to work with
> the "-fembed-bitcode" options. When compiling assembly source code with
> "-fembed-bitcode", clang-as creates an empty section "__LLVM, __asm" in the
> object file. That is just a way to distinguish object files compiled from
> assembly source from those compiled from higher-level source code without
> the "-fembed-bitcode" options. The linker can use this section to
> diagnose whether "-fembed-bitcode" is used consistently on all the object
> files participating in the link.
I'm surprised you want a separate, empty section rather than a header flag,
as those are easier to keep around and won't take up a precious Mach-O
section. There are probably other options or concerns that someone shipping
bitcode might have here as well, but I'm sure those are being talked about;
it doesn't have much effect on the community though.
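
For reference, the linker-side consistency check described in the quoted text could look roughly like the Python sketch below. The object-file model (a name mapped to a set of segment/section pairs) is a simplifying assumption, and the function name is hypothetical; only the "__LLVM, __asm" marker comes from the proposal, and "__LLVM,__bitcode" is assumed to be where the embedded bitcode lives.

```python
# Section holding embedded bitcode (assumed) and the empty marker section
# that the proposal has clang-as emit for assembly inputs.
BITCODE_SECTION = ("__LLVM", "__bitcode")
ASM_MARKER_SECTION = ("__LLVM", "__asm")


def objects_missing_bitcode(objects):
    """Return the names of object files built without -fembed-bitcode.

    `objects` maps an object file name to the set of (segment, section)
    pairs it contains. An object passes if it carries either embedded
    bitcode or the assembly marker.
    """
    return sorted(
        name
        for name, sections in objects.items()
        if BITCODE_SECTION not in sections and ASM_MARKER_SECTION not in sections
    )
```

A header flag would replace the `ASM_MARKER_SECTION` lookup with a flag test and leave the rest of the check unchanged.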
> 3. Bitcode symbol hiding:
> There were some concerns about leaking source code information when using
> the bitcode feature. One approach to avoid the leak is to add a pass which
> renames all the globals and metadata strings. The pass also keeps a reverse
> map in case the original name needs to be recovered. The final bitcode
> should contain no more symbols or debug info than a stripped binary. To make
> sure modified bitcode can still be linked correctly, the renaming needs to
> be consistent across all bitcode participating in the link, and everything
> that is external to the linkage unit needs to be preserved. This means the
> pass can only be run during linking and requires some LTO API.
How are you planning to ensure the safety of the reverse map? Seems that
requiring linking is a bit icky, but might work. Are you mostly worried
about function names that could be stripped out? What LTO api are you
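
To check my understanding of the renaming scheme, here is a minimal Python sketch of what I take the pass to be doing. The data model (each module as a list of symbol names), the function name, and the "__hidden_N" naming are assumptions for illustration, not the proposed implementation.

```python
def hide_symbols(modules, preserved):
    """Rename symbols consistently across all modules in one link.

    `modules` is a list of modules, each a list of symbol names.
    `preserved` holds names that are external to the linkage unit and
    must keep their original spelling. Returns the renamed modules and
    a reverse map from hidden name back to original name.
    """
    forward = {}

    def rename(symbol):
        if symbol in preserved:
            return symbol
        # The same original name always maps to the same hidden name,
        # which keeps cross-module references linkable.
        if symbol not in forward:
            forward[symbol] = f"__hidden_{len(forward)}"
        return forward[symbol]

    renamed = [[rename(s) for s in module] for module in modules]
    reverse = {new: old for old, new in forward.items()}
    return renamed, reverse
```

The sketch only works because it sees every module at once, which is why the pass needs to run at link time. It ignores collisions between generated and pre-existing names, and the reverse map it returns is exactly the artifact whose safety I'm asking about.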
> 4. Debug info strip to line-tables pass:
> As the name suggests, this pass strips the full debug info down to
> line tables only. This is also one of the steps we took to prevent the leak
> of source code information in bitcode.
I'm very curious about what's going on here. Could you elaborate? :)
Thanks a ton for the update - glad to see this being worked on!
> Please let me know what you think about the pieces above or if you have
> any concerns about the methodology. I will put up patches for review soon.