[PATCH} Remove '.bc' input support from the frontend

Tue Jun 24 23:11:08 PDT 2014

On Tue, Jun 24, 2014 at 11:01 PM, Alp Toker <alp at nuanti.com> wrote:
>
> On 25/06/2014 08:42, David Blaikie wrote:
>>
>> On Tue, Jun 24, 2014 at 10:31 PM, Alp Toker <alp at nuanti.com> wrote:
>>>
>>> The attached patch removes code that apparently supported '.bc' binary
>>> LLVM
>>> bitcode as a frontend input kind.
>>>
>>> There were no tests and it's unclear if it was in a working state.
>>
>> That's vaguely surprising.
>>
>>> I'm
>>> comfortable removing this because it doesn't really align with the
>>> frontend's primary objective of diagnosing textual input and lowering
>>> source
>>> code.
>>
>> It seems about as likely that someone would use this as they would use
>> .ll input files. I'm not sure how likely either is. As a developer I
>> usually use the llc, etc, tools - but sometimes I pass bitcode to
>> clang to link (possibly after using llvm-link to link two bitcode
>> files together & I want a linker invocation that has all the standard
>> libraries, etc rather than trying to invoke the linker myself or do
>> two steps of assemble+link).
>
>
> It feels a little odd to push bitcode binaries through the frontend source
> code parsing and diagnostic paths. Perhaps the driver could handle that?
>
> I figured the use case is to produce clang-style diagnostics that can be
> serialized, formatted to support different IDEs and editors etc. and that
> would only be relevant to '.ll'.
>
> No strong inclination either way though if you know better.
>
>
>>
>>> The removal lets us parse and diagnose '.ll' IR inputs more efficiently
>>> without duplicating file contents into memory.
>>
>> Curious - how's that the case?
>
>
> ParseAssembly() parses '.ll' source code and uses a LLVM SourceMgr, so has
> the same underlying reference counting scheme as clang's SourceManager. That
> means we get to not copy the whole file into memory:
>
>>> -    // FIXME: This is stupid, IRReader shouldn't take ownership.
>>> -    llvm::MemoryBuffer *MainFileCopy =
>>> - llvm::MemoryBuffer::getMemBufferCopy(MainFile->getBuffer(),
>>> -                                           getCurrentFile());
>>> -
>>>      llvm::SMDiagnostic Err;
>>> -    TheModule.reset(ParseIR(MainFileCopy, Err, *VMContext));
>>> +    TheModule.reset(ParseAssembly(const_cast<MemoryBuffer *>(MainFile),
>>> nullptr,
>>> +                                  Err, *VMContext));
>
>
> ParseIR() takes the LazyIRModule path if the input is binary, and that path
> steals ownership which forces us to always copy.

Takes ownership of the MemoryBuffer, is that it? You could just create
a shallow MemoryBuffer (MemoryBuffer::getMemBuffer - it doesn't own
the underlying memory) and give that to ParseIR - it can own the
MemoryBuffer all it wants, that doesn't have to copy all that data.
Would that suffice? (and/or presumably we could fix the acutal FIXME
and just have that whole API not take ownership, ever?)

(CC'ing Lang again - because of the ongoing saga of MemoryBuffer, its
weird/conditional/dynamic ownership semantics, etc - something he and
I are having ongoing (if rather unactioned) conversations about)

>
>
>>
>>> (If there's desire to keep '.bc' support and someone contributes tests
>>> for
>>> it, I'm fine to explore other ways to achieve this optimisation.)
>
>
> If you want to keep the feature, I can look into spinning the memory saving
> another way.

I'm not terribly invested in it - just mentioning that I /think/ I've
relied on this feature on at least a few occasions but it wouldn't be
hard for me to workaround - just one data point. Others may chime in
with clearer views.

- David

> Alp.
>
>
>
>>>
>>> Requires the recently posted MemoryBuffer patch.
>>>
>>> Alp.
>>>
>>> --
>>> http://www.nuanti.com
>>> the browser experts
>>>
>>>
>>> _______________________________________________
>>> cfe-commits mailing list
>>> cfe-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>>
>
> --
> http://www.nuanti.com
> the browser experts
>