[PATCH] Bitcode: Try to emit metadata in function blocks

Mehdi Amini via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 1 19:32:15 PDT 2016


LGTM.

-- 
Mehdi

> On Apr 1, 2016, at 6:51 AM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> 
>> 
>> On 2016-Mar-31, at 08:03, Teresa Johnson <tejohnson at google.com> wrote:
>> 
>> Hi Duncan,
>> 
>> I just scanned it briefly right now - do you have an updated version of the patch addressing Mehdi's comments? One minor comment so far:
>> 
>> +void ValueEnumerator::purgeFunctionMetadata() {
>> +  while (MDs.size() > NumBaseMDs) {
>> +    //MetadataMap.erase(MDs.back());
>> 
>> Commented line?
> 
> Nice catch; that was from some debugging I was doing on an earlier
> version.
> 
> Mehdi, you already gave me an LGTM, but when I was responding to
> your feedback I made some larger changes.  I wouldn't mind another
> check through from someone before commit.
> 
>  - The code to drop function tags from metadata operands is no
>    longer (co-)recursive; instead it uses a worklist.  See
>    dropFunctionFromOps.
> 
>    (Actually, I wonder why the rest of ValueEnumerator doesn't
>    use a worklist approach.  Maybe I'll come back and clean that
>    up later (or someone else could do it).)
> 
>  - dropFunctionFromOps is called immediately on detecting that a
>    function tag should be dropped, instead of delaying the call
>    to the end.
> 
>  - Minimized unnecessary changes (which should be done
>    separately if at all).
> 
> I also realized the BitcodeReader changes can/should be committed
> separately as a prep.  I'll split them out before I commit.
> 
> <0001-Bitcode-Try-to-emit-metadata-in-function-blocks-v5.patch>
>> 
>> Thanks,
>> Teresa
>> 
>> On Tue, Mar 29, 2016 at 10:23 PM, Mehdi Amini via llvm-commits <llvm-commits at lists.llvm.org> wrote:
>> 
>>> On Mar 28, 2016, at 3:42 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>>> 
>>>> 
>>>> On Mar 28, 2016, at 7:44 AM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>>>> 
>>>> 
>>>>> On 2016-Mar-27, at 17:41, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>>>>> 
>>>>> Whenever metadata is only referenced by a single function, emit the
>>>>> metadata just in that function block.  This should improve lazy-loading
>>>>> by reducing the amount of metadata in the global block.
>>>>> 
>>>>> For now, this should catch all DILocations, and anything else that
>>>>> happens to be referenced only by a single function.
>>>>> 
>>>>> It's also a first step toward a couple of possible future directions
>>>>> (which this commit does *not* implement):
>>>>> 
>>>>> 1. Some debug info metadata is only referenced from compile units and
>>>>> individual functions.  If we can drop the link from the compile
>>>>> unit, this optimization will get more powerful.
>>>>> 
>>>>> 2. Any uniqued metadata that isn't referenced globally can in theory be
>>>>> emitted in every function block that references it (trading off
>>>>> bitcode size and full-parse time vs. lazy-load time).
>>>>> 
>>>>> Note: the only change on the reader side is cautious error checking.
>>>>> The metadata stored in function blocks gets purged after parsing each
>>>>> function, which means unresolved forward references will get lost.
>>>>> Since all the global metadata should have already been resolved by the
>>>>> time we get to the function metadata blocks we just need to check for
>>>>> that case.  (If for some reason we need this, the fix is to store
>>>>> about-to-be-dropped unresolved nodes in MetadataList::shrinkTo until
>>>>> they can be handled succesfully by a future call to
>>>>> MetadataList::tryToResolveCycles.)
>>>>> 
>>>>> <0001-Bitcode-Try-to-emit-metadata-in-function-blocks-v3.patch>
>>>> 
>>>> I realized that it's straightforward to use llvm-bcanalyzer to test that
>>>> this works correctly.  Attaching a new version of the patch with a
>>>> testcase.
>>> 
>>> 
>>> I wanted to test it locally, but can't ThinLTO-link llvm-tblgen, it crashes with "error: Invalid record: metadata strings corrupt offset"
>>> 
>>> Some other comments (while I'm still trying to make sense of organizeMetadata()):
>>> 
>>> +  struct MDIndex {
>>> 
>>> one-line doxygen could be welcome
>>> 
>>> +  struct MDRange {
>>> 
>>> same
>>> 
>>> -  unsigned NumModuleMDs;
>>> +  unsigned NumModuleMDStrings = 0;
>>> +  unsigned NumBaseMDs = 0;
>>> +  unsigned NumMDStrings = 0;
>>> 
>>> Same
> 
> I've minimized these to the ones actually needed; at this point I think
> the names and are self-explanatory (my comment wouldn't add much).
> 
>>> 
>>> 
>>> /// Purge function metadata.
>>> void purgeFunctionMetadata();
>>> 
>>> Doxygen that are 1-1 matching to the function name are useless, either there is something more useful to say, or it could be dropped if the name makes what it is intended for obvious.
>>> 
>>> 
>>> 
>>> +  // Check whether this is a new function.
>>> +  if (Insertion.first->second.updateFunction(F))
>>> +    if (auto *N = dyn_cast<MDNode>(MD))
>>> +      NeedToDropFunctions.push_back(N);
>>> 
>>> The dyn_cast is not clear to me?
>> 
> 
> Should be more clear in the new patch (it was adding a comment
> here that triggered the cleanup of this algorithm).  But to
> answer the question: updateFunction was already dropping the
> function-tag from "MD"; only MDNodes needed to be queued in
> NeedToDropFunctions because only MDNodes have operands.
> 
>>> 
>>> +    return std::make_tuple(LHS.F, !isa<MDString>(MDs[LHS.ID - 1]), LHS.ID) <
>>> +          std::make_tuple(RHS.F, !isa<MDString>(MDs[RHS.ID - 1]), RHS.ID);
>>> 
>>> Things like the "- 1" offset that is leaking here seems like a missing abstraction. Not that it seems to be widespread, but could be hidden in a member of MDIndex.
> 
> Added MDIndex::get.
> 
>>> 
>>> +  // Return early if nothing happened.
>>> +  if (!Order.back().F && !isa<MDString>(MDs[Order.front().ID - 1]))
>>> +    return;
>>> 
>>> I'm glad there is a comment, without it I wouldn't know what's going on here!
>> 
>> 
>> 
>> I don't have much to add. The reader/writer part is trivial, there is only the ValueEnumerator which might now be intuitive on the first sight. Just a good high level description of what is achieved with the two struct (MDindex and MDRang) should be enough.
>> 
>> So LGTM.
>> 
>> 
>> 
>> --
>> Mehdi
>> 
>> 
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>> 
>> 
>> 
>> -- 
>> Teresa Johnson |	 Software Engineer |	 tejohnson at google.com |	 408-460-2413



More information about the llvm-commits mailing list