[cfe-dev] [PATCH] Wrap clang modules inside Mach-O/ELF/COFF containers

Argyrios Kyrtzidis kyrtzidis at apple.com
Thu Dec 11 15:37:40 PST 2014


> On Dec 11, 2014, at 3:08 PM, Richard Smith <richard at metafoo.co.uk> wrote:
> 
> On Thu, Dec 11, 2014 at 3:00 PM, Argyrios Kyrtzidis <kyrtzidis at apple.com> wrote:
> 
>> On Dec 11, 2014, at 2:59 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>> 
>> On Thu, Dec 11, 2014 at 2:40 PM, Argyrios Kyrtzidis <kyrtzidis at apple.com> wrote:
>> The .pcm file is currently independent of debug info, meaning the compiler invocation will be able to use the same .pcm file regardless of whether the invocation had enabled debug info or not;
>> 
>> We can't use the same .pcm file for -DNDEBUG vs -UNDEBUG builds. Do we ever get to reuse a .pcm file like this in practice?
> 
> You can choose to add, or not to add, debug info to a release build.
> 
> Sure, I don't dispute that this .pcm reuse can happen in theory. But what I'm wondering is: Does this actually happen in practice? How often? Is this case worth optimizing for?
> 
> There are other things I'd like to bundle with a .pcm file (.o and .ir code for inline functions, for instance) that would also benefit from using an ELF wrapper format, and would also vary based on clang's CodeGen options. One possible approach would be to have (at least) two files -- one CodeGen-independent AST file, and one CodeGen-dependent file containing all the other bits -- but that seems to introduce complexity that is unnecessary in almost all cases. (Also note that even flags like -O or -fsanitize=address cause us to build different .pcm files today, because they affect preprocessor macros.)

I don’t see the reason to make the module file itself the container, particularly when whatever the container may contain doesn’t affect in any way the semantic info that the module file is supposed to provide. We would just proliferate module files and/or rebuild module files unnecessarily.
It’s true that the situation is not ideal currently: -O1 through -O3 reuse the same .pcm but -Os does not. But in the future we could try to address this, rather than make the situation fundamentally worse and inescapable. I’d like modules not to turn into a “glorified PCH system” where there is practically zero reuse for them.
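
To make the reuse argument concrete, here is a minimal sketch of a module cache key that depends only on the preprocessor macros an invocation implies. It is a toy model, not clang's actual hashing: the `IMPLIED_MACROS` table and the `module_cache_key` function are hypothetical names invented for illustration, and the flag-to-macro mapping is an assumption.

```python
import hashlib

# Hypothetical mapping from driver flags to the predefined macros they imply.
# This is a toy table for illustration, not clang's real behavior.
IMPLIED_MACROS = {
    "-O1": {"__OPTIMIZE__"},
    "-O2": {"__OPTIMIZE__"},
    "-O3": {"__OPTIMIZE__"},
    "-Os": {"__OPTIMIZE__", "__OPTIMIZE_SIZE__"},
    "-g": set(),  # assumed: debug info changes no macros
}


def module_cache_key(module_name, flags):
    """Hash only what affects the module's semantic content (macros)."""
    macros = set()
    for flag in flags:
        if flag.startswith("-D"):
            macros.add(flag[2:])  # user-defined macro, e.g. -DNDEBUG
        else:
            macros |= IMPLIED_MACROS.get(flag, set())
    payload = module_name + "\0" + "\0".join(sorted(macros))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
```

Under this model, `["-O2"]` and `["-O2", "-g"]` map to the same key, so one .pcm serves both, while `-Os` or `-DNDEBUG` produce a different key.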

Back to the debug info: why not have the container like this:

Foundation.pcm.o
   \
  Foundation.pcm

where the container references the .pcm file, and you can put the debug info in it (or IR later on).

Debug info can reference Foundation.pcm.o and be extended to handle the serialized AST from the .pcm.
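
A rough sketch of that layout, assuming a JSON file as a stand-in for the real Mach-O/ELF/COFF container: the sidecar records the name and content hash of the .pcm it refers to, plus a debug-info payload, and the .pcm itself is left untouched. The names `write_sidecar` and `load_via_sidecar` and the record fields are hypothetical, chosen only for illustration.

```python
import hashlib
import json
import os


def write_sidecar(pcm_path, debug_info):
    """Write <pcm_path>.o, referencing the .pcm by name and content hash."""
    with open(pcm_path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    sidecar_path = pcm_path + ".o"
    record = {
        "pcm": os.path.basename(pcm_path),  # reference, not a copy of the AST
        "pcm_sha256": digest,               # lets a reader detect a stale sidecar
        "debug_info": debug_info,           # placeholder for the real payload
    }
    with open(sidecar_path, "w") as f:
        json.dump(record, f)
    return sidecar_path


def load_via_sidecar(sidecar_path):
    """Follow the reference and return (ast_bytes, debug_info)."""
    with open(sidecar_path) as f:
        record = json.load(f)
    pcm_path = os.path.join(os.path.dirname(sidecar_path), record["pcm"])
    with open(pcm_path, "rb") as f:
        ast_bytes = f.read()
    if hashlib.sha256(ast_bytes).hexdigest() != record["pcm_sha256"]:
        raise ValueError("sidecar is stale: referenced .pcm has changed")
    return ast_bytes, record["debug_info"]
```

Users that only need the AST keep opening Foundation.pcm directly; only a debug-info consumer goes through Foundation.pcm.o.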


>> with this change, if an invocation had built a module file with debug info disabled, that file would be inapplicable to the same invocation with debug info enabled, and the module would have to be rebuilt; essentially we are tying module building to debug info. The module file as the “collection of semantic info” is conceptually independent from debug info.
>> 
>> Did you consider having the debug info container be another file (e.g. alongside the .pcm) that references the .pcm file? This way, instead of having to update all users of module files, regardless of whether they care about debug info, you’d just make debug info another user of .pcm files, no more special than the others.
>> 
>> > On Dec 10, 2014, at 2:27 PM, Adrian Prantl <aprantl at apple.com> wrote:
>> >
>> > Hi everyone,
>> >
>> > As the first step in preparation for module debugging (see http://lists.cs.uiuc.edu/pipermail/cfe-dev/2014-November/040076.html), this patch wraps the *.pcm files that are used to store clang modules and precompiled headers in a platform-dependent Mach-O/ELF/COFF container, so that eventually we will be able to store debug information alongside the module in the same file.
>> >
>> > This is implemented by using the standard LLVM code generation machinery. Instead of directly writing to the output file, the serialized AST blob is attached to an empty llvm::Module as a ModuleFlag. The module is passed to the backend, which emits the AST blob into a special “__clang_pch” section in TargetLoweringObjectFile*.
>> > On the ASTReader side, any object file is transparently unwrapped and the BitstreamReader is pointed directly to the AST section.
>> >
>> > Other than the .pcm files having an extra header inside, this patch is not meant to have any user-visible effects.
>> >
>> > Known bugs: I still need to figure out how to make c-index-test link against and register the available targets (check-all passes, but the modules created by c-index-test currently are plain old .pcm files).
>> > Open questions: I made up the name of the new __clang_pch section and the various flags on the different platforms on the spot. I’m open to better suggestions.
>> >
>> > Let me know what you think!
>> >
>> > -- adrian
>> > <clang.diff><llvm.diff>
>> 
>> 
> 
> 
