[Patch] Similar Function Merging

Stepan Dyatkovskiy stpworld at narod.ru
Thu Jan 30 05:03:42 PST 2014


Hi Tobias and Pranav,

Your patch contains a lot of changes and new features, so it took some 
time to read it.
I looked only at how you determine what to merge. Part of your code that 
does merging itself is kept for some next good day. Hope this day will 
come soon.

1. You replaced type of id_map and seen_values with ValueToValueMapTy - 
could it be presented as separate cosmetic patch?

2. +  if (const AllocaInst *AI = dyn_cast<AllocaInst>(I1))
    +    return AI->getArraySize() == 
cast<AllocaInst>(I2)->getArraySize() &&
    +           AI->getAllocatedType() == 
cast<AllocaInst>(I2)->getAllocatedType();
    The same. Perhaps one more separate patch?

3. +    if (C1->getType()->isAggregateType() || 
C2->getType()->isAggregateType())
    +      return false;
    It is subject of PR17326. It was one of possible ways I proposed 
there. For more details see:
    Here http://llvm.org/bugs/show_bug.cgi?id=17326

4. +    // Skip debug information
    Looks like we can skip it in existing approach as well. One more 
patch, IMHO.

5. + goto differing_instructions;
    ...
    + goto not_mergeable;
    Could you avoid these constructions? Introduce some flag?
    Something like this:

    for (F1I = BB1->begin(), F1E = BB1->end(),
         F2I = BB2->begin(), F2E = BB2->end();
         F1I != F1E && F2I != F2E; ++F1I, ++F2I) {
      bool DifferingInstructions = false;
      ...
        // As an example
        if (isEquivalentOperation(F1I, F2I)) {
          // Proceed to further checks
          assert(F1I->getNumOperands() == F2I->getNumOperands());
          ...
        } else
          DifferingInstructions = true;
      ...
      if (DifferingInstructions) {
        ...
      }

     I think, flags instead of gotos will make your patch more 
comfortable for further reviews.

6. comparePHIs - the same, it could be applied to current version, and 
presented as separated patch.

7. + // FIXME: Const casts!!!
    + if (!seen_values.insert(std::make_pair(V2, const_cast<Value 
*>(V1))).second)
    Could you avoid extra const_cast expressions? If no - could you 
explain why in comments? After 2-5 mins I realized that you need to pass 
seen_values into RemapInstruction method, but comments could really save 
reviewer's time.

8. MergeRegistry class.

8.1. + typedef std::map<Function *, FnComparatorSet> SimilarFnMap;
      HashMap may be? It works well with pointers.

8.2. + void MergeRegistry::insertCandidate(FunctionComparator *Comp) {
      +  FunctionsToMerge.push_back(Comp);
      +  SimilarFunctions[Comp->getF1()].insert(Comp);
      + }
      I traced the usage of SimilarFunctions after this call. Only three 
fields are using in FunctionComparator: F1, F2, SimilarityMetric. Yet, 
there are few indirect uses: DifferingInstructions and 
SelfRefInstructions data for each functions pair, that is used in 
outlineAndMergeFunctions method. Though it looks a bit implicit. And yet 
its all twined with these statements:
      + FunctionComparator *Comp = new FunctionComparator(...)
      ...
      + delete Comp;
      Some of instances died straight in the same place where they were 
created.
      A bit below in this post we come back to this part.

8.3. For existing version of merging you're still using 
FunctionsToCompare (old FnSet), that is filled up with Deferred 
contents. So, may be keep these fields in old place?

8.4. After-all.. Its good to encapsulate things. But it introduces 
additional abstraction level. To be put into single class, things should 
glued with.. kind of unbreakable strong (and proven) dependence from 
each other. Below is the picture of your changes I see by now (fix me if 
I'm wrong):

8.4.1. You have replaced FnSet with FunctionsToCompare (may bee keep the 
name same or similar and just replace its type?)
8.4.2. You have introduced next fields and data (all of them mostly used 
in doDiffMerge and its subcalls):
      * FunctionsToMerge - kind of worklist for similar merging party.
      * SimilarFunctions - the same as FunctionsToMerge, except case it 
is actually is std::set and not std::list. Could two of these fields be 
joined somehow? Since their lifetime matches exactly. They are created 
in the same place, and destroyed as well. By the way std::list is really 
slow, replacing it with SmallVector could make things happier.
      * DifferingInstructions data. Associated with F1 and F2 pair, and 
currently lives in FunctionComparator.
      * SelfRefInstructions data. All the same, associated with F1, F2 pair.
      That's it with new fields. Below modifications I've found.
8.4.3. You have included seen_values and id_map into MergeRegistry. 
Above I proposed to keep them at old place, just for tracking real 
changes. Or present these changes as separated patch.
8.4.4. You have modified runOnModule structure. If you really need it... 
but then, may be try to move it into another patch?
8.4.5. And.. Actually your patch work starts straight after old version. 
Mostly, you need to modify old functionality a bit just to collect 
SimilarFunctions, DifferingInstructions and SelfRefInstructions info. 
What else I forget here?
Hope nothing. So that was brief description of most important changes in 
functions comparison part.

9. According to you patch, you pretended to run -mergefunc by default. 
The pass should be really very stable to be enabled by default, while 
currently your patch fails sometimes (try sqlite3.ll from test-suite 
object dir, you need to run "make all" first). Note, that existing 
version started to demonstrate stable work.

10. My proposals:
10.1. Prepare all features that are not actually "similar merging" as 
separated patches.
10.2. Try to avoid extra abstraction layers. From one side it means 
encapsulation, from another side it could mean mixing quite independent 
things.
10.2.1. I suggest to keep existing fields on the same abstraction layer, 
otherwise present such moves in separated patch. Here I mean:
      * FnSet or (in your case FunctionsToCompare).
      * Deferred list.
      * seen_values
      * id_map
10.2.2. Perhaps move DifferingInstructions and SelfRefInstructions from 
MergeRegistry into MergeFunctions class somehow? Then:
      * You don't need to use "new" and "delete", that are expensive 
operations.
      * Perhaps you could get rid of MergeRegistry on first stage. Yes, 
you could introduce it later as separate patch.

10.2.3. Keep the same runOnModule structure if possible.

11. Summary.
     It looks like your part starts only *after* old pass version 
finished its work. You need some results from old part. And you need to 
update old part, but only a little bit, as I could see.
     Why not to add "-merge-similar-func" stuff at the end of 
runOnModule (you did it already, placing doDiffMerge in the end, but its 
not quite obvious in your patch by now).

     On first stage its important to see "how it works", and how it 
changes current functionality as well, and also how you need to modify 
your functionality if current functionality will be changed.

P.S.:
Merging process itself is still to be reviewed. By the way, in merging 
itself, are all of changes caused by your new approach only? Couldn't 
some of them be proposed as separated features (in separated patches)?

-Stepan

Stepan Dyatkovskiy wrote:
> Hi Pranav,
> Just CCed myself.
>
> Thanks!
> -Stepan
>
> 16.01.2014, 21:38, "Pranav Bhandarkar" <pranavb at codeaurora.org>:
>> Hi,
>>
>> The attached patch enhances the MergeFunctions pass to merge multiple
>> similar (not necessarily identical) functions to achieve code-size
>> benefit. The code implements what we presented at the LLVM developers
>> meeting in Nov last year (http://llvm.org/devmtg/2013-11/#talk3 &
>> http://llvm.org/devmtg/2013-11/slides/Koch-FunctionMerging.pdf)
>>
>> Apologies for posting a large patch and for the delay in posting this.
>>
>> regards,
>> Pranav
>>
>> --
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
>>
>> ,
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits




More information about the llvm-commits mailing list