[PATCH] Bitcode: Try to emit metadata in function blocks
Mehdi Amini via llvm-commits
llvm-commits at lists.llvm.org
Tue Mar 29 22:23:45 PDT 2016
> On Mar 28, 2016, at 3:42 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>
>>
>> On Mar 28, 2016, at 7:44 AM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>>
>>
>>> On 2016-Mar-27, at 17:41, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>>>
>>> Whenever metadata is only referenced by a single function, emit the
>>> metadata just in that function block. This should improve lazy-loading
>>> by reducing the amount of metadata in the global block.
>>>
>>> For now, this should catch all DILocations, and anything else that
>>> happens to be referenced only by a single function.
>>>
>>> It's also a first step toward a couple of possible future directions
>>> (which this commit does *not* implement):
>>>
>>> 1. Some debug info metadata is only referenced from compile units and
>>> individual functions. If we can drop the link from the compile
>>> unit, this optimization will get more powerful.
>>>
>>> 2. Any uniqued metadata that isn't referenced globally can in theory be
>>> emitted in every function block that references it (trading off
>>> bitcode size and full-parse time vs. lazy-load time).
>>>
>>> Note: the only change on the reader side is cautious error checking.
>>> The metadata stored in function blocks gets purged after parsing each
>>> function, which means unresolved forward references will get lost.
>>> Since all the global metadata should have already been resolved by the
>>> time we get to the function metadata blocks we just need to check for
>>> that case. (If for some reason we need this, the fix is to store
>>> about-to-be-dropped unresolved nodes in MetadataList::shrinkTo until
>>> they can be handled succesfully by a future call to
>>> MetadataList::tryToResolveCycles.)
>>>
>>> <0001-Bitcode-Try-to-emit-metadata-in-function-blocks-v3.patch>
>>
>> I realized that it's straightforward to use llvm-bcanalyzer to test that
>> this works correctly. Attaching a new version of the patch with a
>> testcase.
>
>
> I wanted to test it locally, but can't ThinLTO-link llvm-tblgen, it crashes with "error: Invalid record: metadata strings corrupt offset"
>
> Some other comments (while I'm still trying to make sense of organizeMetadata()):
>
> + struct MDIndex {
>
> one-line doxygen could be welcome
>
> + struct MDRange {
>
> same
>
> - unsigned NumModuleMDs;
> + unsigned NumModuleMDStrings = 0;
> + unsigned NumBaseMDs = 0;
> + unsigned NumMDStrings = 0;
>
> Same
>
>
> /// Purge function metadata.
> void purgeFunctionMetadata();
>
> Doxygen that are 1-1 matching to the function name are useless, either there is something more useful to say, or it could be dropped if the name makes what it is intended for obvious.
>
>
>
> + // Check whether this is a new function.
> + if (Insertion.first->second.updateFunction(F))
> + if (auto *N = dyn_cast<MDNode>(MD))
> + NeedToDropFunctions.push_back(N);
>
> The dyn_cast is not clear to me?
>
>
> + return std::make_tuple(LHS.F, !isa<MDString>(MDs[LHS.ID - 1]), LHS.ID) <
> + std::make_tuple(RHS.F, !isa<MDString>(MDs[RHS.ID - 1]), RHS.ID);
>
> Things like the "- 1" offset that is leaking here seems like a missing abstraction. Not that it seems to be widespread, but could be hidden in a member of MDIndex.
>
>
> + // Return early if nothing happened.
> + if (!Order.back().F && !isa<MDString>(MDs[Order.front().ID - 1]))
> + return;
>
> I'm glad there is a comment, without it I wouldn't know what's going on here!
I don't have much to add. The reader/writer part is trivial, there is only the ValueEnumerator which might now be intuitive on the first sight. Just a good high level description of what is achieved with the two struct (MDindex and MDRang) should be enough.
So LGTM.
--
Mehdi
More information about the llvm-commits
mailing list