[PATCH] Bitcode: Try to emit metadata in function blocks

Mehdi Amini via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 28 16:05:30 PDT 2016


> On Mar 28, 2016, at 3:46 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> 
>> 
>> On 2016-Mar-28, at 15:42, Mehdi Amini <mehdi.amini at apple.com> wrote:
>> 
>> 
>>> On Mar 28, 2016, at 7:44 AM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>>> 
>>> 
>>>> On 2016-Mar-27, at 17:41, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>>>> 
>>>> Whenever metadata is only referenced by a single function, emit the
>>>> metadata just in that function block.  This should improve lazy-loading
>>>> by reducing the amount of metadata in the global block.
>>>> 
>>>> For now, this should catch all DILocations, and anything else that
>>>> happens to be referenced only by a single function.
>>>> 
>>>> It's also a first step toward a couple of possible future directions
>>>> (which this commit does *not* implement):
>>>> 
>>>> 1. Some debug info metadata is only referenced from compile units and
>>>> individual functions.  If we can drop the link from the compile
>>>> unit, this optimization will get more powerful.
>>>> 
>>>> 2. Any uniqued metadata that isn't referenced globally can in theory be
>>>> emitted in every function block that references it (trading off
>>>> bitcode size and full-parse time vs. lazy-load time).
>>>> 
>>>> Note: the only change on the reader side is cautious error checking.
>>>> The metadata stored in function blocks gets purged after parsing each
>>>> function, which means unresolved forward references will get lost.
>>>> Since all the global metadata should have already been resolved by the
>>>> time we get to the function metadata blocks we just need to check for
>>>> that case.  (If for some reason we need this, the fix is to store
>>>> about-to-be-dropped unresolved nodes in MetadataList::shrinkTo until
>>>> they can be handled succesfully by a future call to
>>>> MetadataList::tryToResolveCycles.)
>>>> 
>>>> <0001-Bitcode-Try-to-emit-metadata-in-function-blocks-v3.patch>
>>> 
>>> I realized that it's straightforward to use llvm-bcanalyzer to test that
>>> this works correctly.  Attaching a new version of the patch with a
>>> testcase.
>> 
>> 
>> I wanted to test it locally, but can't ThinLTO-link llvm-tblgen, it crashes with "error: Invalid record: metadata strings corrupt offset"
> 
> Interesting.  If you've got a reproduction that would help.

$ ../install-clang/bin/clang -flto=thin -O2 -g -c lib/Support/regcomp.c -o - | ../install-clang/bin/opt -o /dev/null
<stdin>
error: Invalid record: metadata strings corrupt offset

Removing either -g or -O2 makes the problem go away.

-- 
Mehdi


> 
>> 
>> Some other comments (while I'm still trying to make sense of organizeMetadata()):
>> 
>> +  struct MDIndex {
>> 
>> one-line doxygen could be welcome
>> 
>> +  struct MDRange {
>> 
>> same
>> 
>> -  unsigned NumModuleMDs;
>> +  unsigned NumModuleMDStrings = 0;
>> +  unsigned NumBaseMDs = 0;
>> +  unsigned NumMDStrings = 0;
>> 
>> Same
>> 
>> 
>> /// Purge function metadata.
>> void purgeFunctionMetadata();
>> 
>> Doxygen that are 1-1 matching to the function name are useless, either there is something more useful to say, or it could be dropped if the name makes what it is intended for obvious.
>> 
>> 
>> 
>> +  // Check whether this is a new function.
>> +  if (Insertion.first->second.updateFunction(F))
>> +    if (auto *N = dyn_cast<MDNode>(MD))
>> +      NeedToDropFunctions.push_back(N);
>> 
>> The dyn_cast is not clear to me?
>> 
>> 
>> +    return std::make_tuple(LHS.F, !isa<MDString>(MDs[LHS.ID - 1]), LHS.ID) <
>> +          std::make_tuple(RHS.F, !isa<MDString>(MDs[RHS.ID - 1]), RHS.ID);
>> 
>> Things like the "- 1" offset that is leaking here seems like a missing abstraction. Not that it seems to be widespread, but could be hidden in a member of MDIndex.
>> 
>> 
>> +  // Return early if nothing happened.
>> +  if (!Order.back().F && !isa<MDString>(MDs[Order.front().ID - 1]))
>> +    return;
>> 
>> I'm glad there is a comment, without it I wouldn't know what's going on here!
>> 
>> 
>> -- 
>> Mehdi



More information about the llvm-commits mailing list