[cfe-dev] [PATCH] Wrap clang modules inside Mach-O/ELF/COFF containers

Sean Silva chisophugis at gmail.com
Thu Feb 19 13:09:36 PST 2015


On Wed, Feb 18, 2015 at 7:08 PM, Richard Smith <richard at metafoo.co.uk>
wrote:

> Sorry for the delay. Generally, I'm happy with the direction here; if you
> want to go ahead with this as-is and deal with the below as incremental
> changes on top of the current patch, I'm fine with that.
>
>
> +  std::unique_ptr<SmallVectorImpl<char>> SerializedASTBuffer;
>
> This seems like a strange data type to be dealing in. There's no point in
> putting a SmallVector into a unique_ptr, since that implies it will always
> be heap-allocated and thus never "small".
>

IIRC SmallVector<PodT> is slightly nicer with large buffers since it can
use realloc vm tricks to resize for POD, whereas the semantics of
std::vector don't permit that (see the post-commit on r175906). Not sure if
that was actually the motivation for SmallVector here, but perhaps.

-- Sean Silva


>
> I also feel somewhat uncomfortable about giving the
> ModuleContainerGenerator an owning pointer to this SmallVectorImpl while
> another class also has a reference to it and will be updating it. You're
> implicitly relying on PCHGenerator::HandleTranslationUnit to finalize the
> AST file contents before ModuleContainerGenerator::HandleTranslationUnit
> writes out the data.
>
> I think I'd prefer if the PCHGenerator owned its vector of data, and was
> given some kind of callback that it used to provide the AST contents when
> it was finished; that could then be hooked up to the
> ModuleContainerGenerator, which could at least assert if it's given the
> data too late (after its HandleTranslationUnit is called).
>
>
> +  CGOpts.OptimizationLevel = 0;
>
> A comment explaining what this is for would be useful.
>
> On Mon, Feb 16, 2015 at 4:35 PM, Adrian Prantl <aprantl at apple.com> wrote:
>
>>
>> On Jan 30, 2015, at 6:04 PM, Adrian Prantl <aprantl at apple.com> wrote:
>>
>>
>> On Jan 27, 2015, at 5:12 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>>
>> On Tue, Jan 27, 2015 at 9:29 AM, Adrian Prantl <aprantl at apple.com> wrote:
>>
>>>
>>> > On Jan 26, 2015, at 5:34 PM, Richard Smith <richard at metafoo.co.uk>
>>> wrote:
>>> >
>>> > On Mon, Jan 26, 2015 at 4:05 PM, Adrian Prantl <aprantl at apple.com>
>>> wrote:
>>> >> Here is the final version of the patch that should address all the
>>> points
>>> >> raised so far in this thread. Any final
>>> comments/objections/suggestions
>>> >> before this can go into trunk?
>>>
>>
>> Hi all,
>>
>> Here’s the upgraded version of the patch, which addresses all of your
>> suggestions. The second patch enables debug info output, so you can see the
>> first one in context and know where this is heading.
>>
>> >
>>> > Using a fixed string literal as a key for the PCH data seems a bit
>>> > weird to me. I'm also not sure a module flag is the best way to model
>>> > this: can we instead emit this as a normal global, and keep the
>>> > knowledge of its special section name within Clang?
>>> That sounds like a good idea — I wasn’t aware that globals could be
>>> annotated with a section name.
>>> I’ll give it a shot.
>>>
>>
>> This works really great for ELF and Mach-O; using a global symbol with a
>> custom section allowed me to get rid of the LLVM patch entirely! The big
>> drawback is that I can no longer influence the order in which the sections
>> are emitted, which turns out to be a problem for COFF.
>> Although COFF sections are marked with the correct alignment
>> characteristics, they are not actually emitted aligned on disk. This is
>> hopefully jus a bug in our COFF backend. Since the DWARF sections are of
>> arbitrary size and come before the serialized AST, this messes up the
>> alignment for OnDiskHashTable. Until I figure out a solution for this
>> problem, I’ve disabled module debug info ouput for COFF.
>>
>> >
>>> > I would imagine that you'll want to generate debug information via
>>> > Clang's normal ASTConsumer -> CodeGen pipeline, which suggests that
>>> > building a custom LLVM code generator and running it from within
>>> > PCHGenerator::HandleTranslationUnit is the wrong choice. This should
>>> > probably be handled by a FrontendAction that passes the bitcode
>>> > generated by the PCHGenerator to CodeGen. That'd also avoid you
>>> > needing to plumb a CompilerInstance into Serialization, which seems
>>> > like a layering violation (Frontend depends on Serialization, not vice
>>> > versa).
>>>
>>> Thanks for the pointer. In this case both the code generator and the
>>> output buffer for the serialized AST is owned by a FrontendAction which
>>> then wraps it in a global and passes it on to CodeGen.
>>> In order to wire up the debug information, since I can only have one
>>> ASTConsumer per FrontendAction, I will probably need to have a second
>>> FrontendAction for the debug info generator?
>>>
>>
>> If you need it, it shouldn't be hard to write an ASTConsumer
>> implementation that forwards calls to multiple other consumers. You'll
>> probably need to ensure that PCHGenerator is first in the sequence to
>> ensure that it can emit the PCH data before we finalize the IR.
>>
>>
>> I found ASTMultiplexConsumer, which does exactly what I needed (after I
>> added all the missing forwarding methods).
>>
>>
>> I’ll check out how clang works and how to communicate between
>>> FrontendActions.
>>> I might be back with more questions :-)
>>>
>>> >
>>> > For the section name (repeating a comment from before): I don't like
>>> > "cfe"; we should indicate which compiler is involved here. I don't
>>> > like "pch"; this is used for all kinds of AST files, not just PCH
>>> > files. Can you pick a better name for the section?
>>>
>>> That particular name was a suggestion by David Majnemer  who pointed out
>>> that COFF section names are restricted to 8 characters. I chose PCH because
>>> the magic number in .pcm files is CPCH.
>>> The full list of constraints is:
>>> - 8 characters (COFF)
>>> - starts with __ (MachO)
>>>
>>
>> How about clangast on COFF and ELF, and __clangast on MachO? Or do we
>> need to keep the name the same in all cases?
>>
>>
>> Since the name is only used in two locations, I think having a
>> platform-specific one isn’t too bad. I’ve implemented your suggestion.
>>
>> thanks for all the feedback so far!
>> -- adrian
>> <0001-CFE-Wrap-clang-module-files-in-a-Mach-O-ELF-COFF-con.patch>
>> <0002-Clang-emits-Debug-Info-for-Modules.patch>
>>
>>
>>
>>> - should indicate clang (COFF. On ELF and MachO the section is inside a
>>> __CLANG segment)
>>> - should indicate that this is an ast
>>>
>>> If we want to use the same section name on all platforms, this leaves
>>> any combination of
>>> __{cfe|clg|cln}{ast|pch|pcm|mod}
>>> or maybe
>>> __ast
>>> __cpch
>>> __llvmod
>>>
>>> other suggestions?
>>> -- adrian
>>
>>
>>
>> Pinging Richard.
>> Thanks to David Majnemers alignment fix for the COFF object writer in
>> r228879, this now works with all supported object containers.
>>
>> -- adrian
>>
>>
>>
>>
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20150219/5a7e867a/attachment.html>


More information about the cfe-dev mailing list