[PATCH} Remove '.bc' input support from the frontend

Wed Jun 25 00:16:39 PDT 2014

On 25/06/2014 09:11, David Blaikie wrote:
> On Tue, Jun 24, 2014 at 11:01 PM, Alp Toker <alp at nuanti.com> wrote:
>> On 25/06/2014 08:42, David Blaikie wrote:
>>> On Tue, Jun 24, 2014 at 10:31 PM, Alp Toker <alp at nuanti.com> wrote:
>>>> The attached patch removes code that apparently supported '.bc' binary
>>>> LLVM
>>>> bitcode as a frontend input kind.
>>>>
>>>> There were no tests and it's unclear if it was in a working state.
>>> That's vaguely surprising.
>>>
>>>> I'm
>>>> comfortable removing this because it doesn't really align with the
>>>> frontend's primary objective of diagnosing textual input and lowering
>>>> source
>>>> code.
>>> It seems about as likely that someone would use this as they would use
>>> .ll input files. I'm not sure how likely either is. As a developer I
>>> usually use the llc, etc, tools - but sometimes I pass bitcode to
>>> clang to link (possibly after using llvm-link to link two bitcode
>>> files together & I want a linker invocation that has all the standard
>>> libraries, etc rather than trying to invoke the linker myself or do
>>> two steps of assemble+link).
>>
>> It feels a little odd to push bitcode binaries through the frontend source
>> code parsing and diagnostic paths. Perhaps the driver could handle that?
>>
>> I figured the use case is to produce clang-style diagnostics that can be
>> serialized, formatted to support different IDEs and editors etc. and that
>> would only be relevant to '.ll'.
>>
>> No strong inclination either way though if you know better.
>>
>>
>>>> The removal lets us parse and diagnose '.ll' IR inputs more efficiently
>>>> without duplicating file contents into memory.
>>> Curious - how's that the case?
>>
>> ParseAssembly() parses '.ll' source code and uses a LLVM SourceMgr, so has
>> the same underlying reference counting scheme as clang's SourceManager. That
>> means we get to not copy the whole file into memory:
>>
>>>> -    // FIXME: This is stupid, IRReader shouldn't take ownership.
>>>> -    llvm::MemoryBuffer *MainFileCopy =
>>>> - llvm::MemoryBuffer::getMemBufferCopy(MainFile->getBuffer(),
>>>> -                                           getCurrentFile());
>>>> -
>>>>       llvm::SMDiagnostic Err;
>>>> -    TheModule.reset(ParseIR(MainFileCopy, Err, *VMContext));
>>>> +    TheModule.reset(ParseAssembly(const_cast<MemoryBuffer *>(MainFile),
>>>> nullptr,
>>>> +                                  Err, *VMContext));
>>
>> ParseIR() takes the LazyIRModule path if the input is binary, and that path
>> steals ownership which forces us to always copy.
> Takes ownership of the MemoryBuffer, is that it? You could just create
> a shallow MemoryBuffer (MemoryBuffer::getMemBuffer - it doesn't own
> the underlying memory) and give that to ParseIR - it can own the
> MemoryBuffer all it wants, that doesn't have to copy all that data.
> Would that suffice? (and/or presumably we could fix the acutal FIXME
> and just have that whole API not take ownership, ever?)

That's not a bad idea, but I suspect the resulting LLVM Module's 
materializer will outlive the SourceManager into which its shallow 
memory buffer points.

for getLazyBitcodeModule() we actually *want* to copy the content into 
memory because it's potentially very long-lived. The compiler machinery, 
or the file itself, may be long-gone while the Module is still being 
processed by say a JIT as in clang-interpreter.

So we want the deep copy for parseBitcodeFile() in any case, but I agree 
that your approach would work without the need for refcounting for 
ParseAssembly().

People want to keep '.bc' and we should be able to achieve that by 
calling isBitcodeFile() from clang and using its result to select either 
the binary or source parsing code paths.

Alp.

>
> (CC'ing Lang again - because of the ongoing saga of MemoryBuffer, its
> weird/conditional/dynamic ownership semantics, etc - something he and
> I are having ongoing (if rather unactioned) conversations about)
>
>>
>>>> (If there's desire to keep '.bc' support and someone contributes tests
>>>> for
>>>> it, I'm fine to explore other ways to achieve this optimisation.)
>>
>> If you want to keep the feature, I can look into spinning the memory saving
>> another way.
> I'm not terribly invested in it - just mentioning that I /think/ I've
> relied on this feature on at least a few occasions but it wouldn't be
> hard for me to workaround - just one data point. Others may chime in
> with clearer views.
>
> - David
>
>> Alp.
>>
>>
>>
>>>> Requires the recently posted MemoryBuffer patch.
>>>>
>>>> Alp.
>>>>
>>>> --
>>>> http://www.nuanti.com
>>>> the browser experts
>>>>
>>>>
>>>> _______________________________________________
>>>> cfe-commits mailing list
>>>> cfe-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>>>
>> --
>> http://www.nuanti.com
>> the browser experts
>>

-- 
http://www.nuanti.com
the browser experts