[PATCH] Bitcode: Try to emit metadata in function blocks
Duncan Exon Smith via llvm-commits
llvm-commits at lists.llvm.org
Tue Mar 29 08:16:13 PDT 2016
+llvm-commits [looks like I dropped it somehow]
Great. That confirms that the patch works, and perhaps even stands on its own.
-- dpnes
> On Mar 29, 2016, at 7:50 AM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>
> I'm glad you asked :)
>
> The allocated memory during function import was 357MB (total peak ~450MB, 60MB DILocations, 60MB Subprograms, 43MB local variables, 42MB MDStrings, 36MB MDTuples, 16MB DIDerivedTypes, 15MB DISubRoutineType, 9.5MB DICompositeType).
>
> After is the same... minus the DILocations that don't show up!
>
>
>
>
>> On Mar 29, 2016, at 6:57 AM, Duncan Exon Smith <dexonsmith at apple.com> wrote:
>>
>> Peak memory? Does DILocation still show up as 100MB?
>>
>> -- dpnes
>>
>>> On Mar 29, 2016, at 2:36 AM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>>>
>>> OK, so I tested the patch but didn't see much runtime improvements.
>>>
>>> On opt.cpp there are 120274 Metadata total, and 8789 will be localized (7679 DILocation total, all of them are localized).
>>>
>>> It seems that we'll have to wait for Adrian's removal of the Subprograms field on the CU to start seeing real gains.
>>>
>>> --
>>> Mehdi
>>>
>>>>> On Mar 28, 2016, at 10:33 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>>>>>
>>>>>
>>>>>> On 2016-Mar-28, at 16:05, Mehdi Amini <mehdi.amini at apple.com> wrote:
>>>>>>
>>>>>>
>>>>>>> On Mar 28, 2016, at 3:46 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 2016-Mar-28, at 15:42, Mehdi Amini <mehdi.amini at apple.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>> On Mar 28, 2016, at 7:44 AM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>> On 2016-Mar-27, at 17:41, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>>>>>>>>>
>>>>>>>>> Whenever metadata is only referenced by a single function, emit the
>>>>>>>>> metadata just in that function block. This should improve lazy-loading
>>>>>>>>> by reducing the amount of metadata in the global block.
>>>>>>>>>
>>>>>>>>> For now, this should catch all DILocations, and anything else that
>>>>>>>>> happens to be referenced only by a single function.
>>>>>>>>>
>>>>>>>>> It's also a first step toward a couple of possible future directions
>>>>>>>>> (which this commit does *not* implement):
>>>>>>>>>
>>>>>>>>> 1. Some debug info metadata is only referenced from compile units and
>>>>>>>>> individual functions. If we can drop the link from the compile
>>>>>>>>> unit, this optimization will get more powerful.
>>>>>>>>>
>>>>>>>>> 2. Any uniqued metadata that isn't referenced globally can in theory be
>>>>>>>>> emitted in every function block that references it (trading off
>>>>>>>>> bitcode size and full-parse time vs. lazy-load time).
>>>>>>>>>
>>>>>>>>> Note: the only change on the reader side is cautious error checking.
>>>>>>>>> The metadata stored in function blocks gets purged after parsing each
>>>>>>>>> function, which means unresolved forward references will get lost.
>>>>>>>>> Since all the global metadata should have already been resolved by the
>>>>>>>>> time we get to the function metadata blocks we just need to check for
>>>>>>>>> that case. (If for some reason we need this, the fix is to store
>>>>>>>>> about-to-be-dropped unresolved nodes in MetadataList::shrinkTo until
>>>>>>>>> they can be handled succesfully by a future call to
>>>>>>>>> MetadataList::tryToResolveCycles.)
>>>>>>>>>
>>>>>>>>> <0001-Bitcode-Try-to-emit-metadata-in-function-blocks-v3.patch>
>>>>>>>>
>>>>>>>> I realized that it's straightforward to use llvm-bcanalyzer to test that
>>>>>>>> this works correctly. Attaching a new version of the patch with a
>>>>>>>> testcase.
>>>>>>>
>>>>>>>
>>>>>>> I wanted to test it locally, but can't ThinLTO-link llvm-tblgen, it crashes with "error: Invalid record: metadata strings corrupt offset"
>>>>>>
>>>>>> Interesting. If you've got a reproduction that would help.
>>>>>
>>>>> $ ../install-clang/bin/clang -flto=thin -O2 -g -c lib/Support/regcomp.c -o - | ../install-clang/bin/opt -o /dev/null
>>>>> <stdin>
>>>>> error: Invalid record: metadata strings corrupt offset
>>>>>
>>>>> Removing either -g or -O2 makes the problem go away.
>>>>>
>>>>> --
>>>>> Mehdi
>>>>
>>>> Fixed in r264699 (this patch exposed a bug from r264551).
>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>> Some other comments (while I'm still trying to make sense of organizeMetadata()):
>>>>>>>
>>>>>>> + struct MDIndex {
>>>>>>>
>>>>>>> one-line doxygen could be welcome
>>>>>>>
>>>>>>> + struct MDRange {
>>>>>>>
>>>>>>> same
>>>>>>>
>>>>>>> - unsigned NumModuleMDs;
>>>>>>> + unsigned NumModuleMDStrings = 0;
>>>>>>> + unsigned NumBaseMDs = 0;
>>>>>>> + unsigned NumMDStrings = 0;
>>>>>>>
>>>>>>> Same
>>>>>>>
>>>>>>>
>>>>>>> /// Purge function metadata.
>>>>>>> void purgeFunctionMetadata();
>>>>>>>
>>>>>>> Doxygen that are 1-1 matching to the function name are useless, either there is something more useful to say, or it could be dropped if the name makes what it is intended for obvious.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> + // Check whether this is a new function.
>>>>>>> + if (Insertion.first->second.updateFunction(F))
>>>>>>> + if (auto *N = dyn_cast<MDNode>(MD))
>>>>>>> + NeedToDropFunctions.push_back(N);
>>>>>>>
>>>>>>> The dyn_cast is not clear to me?
>>>>>>>
>>>>>>>
>>>>>>> + return std::make_tuple(LHS.F, !isa<MDString>(MDs[LHS.ID - 1]), LHS.ID) <
>>>>>>> + std::make_tuple(RHS.F, !isa<MDString>(MDs[RHS.ID - 1]), RHS.ID);
>>>>>>>
>>>>>>> Things like the "- 1" offset that is leaking here seems like a missing abstraction. Not that it seems to be widespread, but could be hidden in a member of MDIndex.
>>>>>>>
>>>>>>>
>>>>>>> + // Return early if nothing happened.
>>>>>>> + if (!Order.back().F && !isa<MDString>(MDs[Order.front().ID - 1]))
>>>>>>> + return;
>>>>>>>
>>>>>>> I'm glad there is a comment, without it I wouldn't know what's going on here!
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Mehdi
>
More information about the llvm-commits
mailing list