[LLVMdev] MergeFunctions: reduce complexity to O(log(N))

Tue Jan 28 11:47:18 PST 2014

Hi Stepan,

Sorry for the delay. It's great that you are working on MergeFunctions 
as well and I agree, we should definitely try to combine our efforts to 
improve MergeFunctions.

Just to give you some context, the pass (with the similar function 
merging patch) is already being used in a production setting. From my 
point of view, it would be better if we focus on improving its 
capability of merging functions at this stage rather than on reduction 
of compilation time.

I'll give a brief overview of how our similar function merging algorithm 
works (you can also watch the presentation from the US LLVM conference 
to hear me explain it using an animation).

1. Hash all functions into buckets

In each bucket, separately:

  2. Compare functions pair-wise and determine a
     similarity metric for each pair (%age of equivalent instructions)

  3. Merge identical functions (with similarity = 100%), update call
     sites for those functions.

  4. If the updates of call sites have touched other functions,
     go back to step 2 and re-compare *only* those functions to all
     others in their buckets.

  Finally,

  5. Form groups of similar functions for merging:
     a) Pick most similar pair (A,B)
     b) Find all functions C that are also similar to B but are not
        more similar to some other function D.
     c) Merge A with B and all the C's.
     Repeat until there are no more functions to merge.

As you can see, we need to compute a similarity measure for each pair of 
functions in a bucket. If I understand correctly, your patch reduces 
compile-time by avoiding comparisons of functions that cannot be 
identical. That's exactly what you want to do if you only want to merge 
identical functions. However, because we also need to determine whether 
functions are merely *similar*, I can't see how your idea could be 
applied in that case.

Looking at your statistics, I see that you are considering a very large 
number of functions. I don't know which benchmarks you are using, but I 
doubt that many of these are worth merging. For instance, you rarely 
gain a code size benefit from merging functions with just a few 
instructions. Our patch takes care of this using thresholds. It's worth 
looking at actual code size reductions, rather than just numbers of 
functions merged/ compilation time spent comparing functions. It turns 
out that the number of functions in each bucket is really quite small 
once you apply some heuristics (and could still be further reduced!).

Given your experience with MergeFunctions, it would be really great if 
you could review our patch and also try it out on your benchmarks.

Tobias

On 24/01/2014 19:11, Stepan Dyatkovskiy wrote:
> Hi Tobias.
>
> So, what do you think?
>
> If it means to much extra-job for your team, may be I can help you
> somehow? I really would like to.
>
> -Stepan
>
> Stepan Dyatkovskiy wrote:
>> Hi Tobias,
>>
>>> I can't really see a way to combine our approach with your patch. What
>>> are your thoughts?
>>
>> I think it is possible. Even more - we need to combine our efforts, in
>> order to bring this pass into real live.
>> I'have read your presentation file, and unfortunately read your patch
>> only a little.
>> How exactly you scan functions on 2nd stage? Could you explain the
>> algorithm in few words, how you compare workflow? Is it possible to
>> scan binary tree instead of hash table? ...OK.
>>
>> That's how I see the modification. Now its important to explain idea,
>> so consider the simplest case: you need to catch two functions that
>> are differs with single instruction somewhere.
>>
>> 1. Imagine, that IR *module* contents is represented as binary tree:
>> Each line (trace) from root to terminal node is a function.
>> Each node - is function's primitive (instruction opcode, for example).
>> Down - is direction to terminal nodes, up - is direction to the root.
>>
>> 2. Now you are standing on some node. And you have two directions
>> down-left and down-right.
>>
>> 3. If you are able to find two equal sub-traces down (at left and at
>> right), then the only difference lies in this node. Then we catch that
>> case.
>>
>> 4. Even more, if two subtrees at left and at right are equal, than you
>> catch at once all the cases that are represented by these subtrees.
>>
>> I still didn't look at you patch carefully. Sorry.. But I hope that
>> helps, and I'll try look at it in nearest time and perhaps its not the
>> best solution I gave in this post.
>>
>> -Stepan
>>
>> 22.01.2014, 20:53, "Tobias von Koch" <tobias.von.koch at gmail.com>:
>>> Hi Stepan,
>>>
>>> As you've seen we have recently implemented a significant enhancement to
>>> the MergeFunctions pass that also allows merging of functions that are
>>> only similar but not identical
>>> (http://llvm-reviews.chandlerc.com/D2591).
>>>
>>> Our patch also changes the way that functions are compared quite
>>> significantly. This is necessary because we need to compare all
>>> functions in a bucket in order to get a similarity measure for each
>>> pair, so we can then decide which 'groups' of functions to merge.
>>>
>>> I can't really see a way to combine our approach with your patch. What
>>> are your thoughts?
>>>
>>> Another way to improve the performance of MergeFunctions might be to
>>> make the hash function better. I've already added the size of the first
>>> BB to it, and perhaps there are other factors we could try... if we
>>> don't have to compare functions in the first place (because they're in
>>> different buckets) then that's obviously a much bigger win.
>>>
>>> Thanks,
>>> Tobias
>>>
>>> On 17/01/2014 20:25, Stepan Dyatkovskiy wrote:
>>>
>>>>   Hi all,
>>>>
>>>>   I propose simple improvement for MergeFunctions pass, that reduced
>>>> its
>>>>   complexity from O(N^2) to O(log(N)), where N is number of
>>>> functions in
>>>>   module.
>>>>
>>>>   The idea, is to replace the result of comparison from "bool" to
>>>>   "-1,0,1". In another words: define order relation on functions set.
>>>>   To be sure, that functions could be comparable that way, we have to
>>>>   prove that order relation is possible on all comparison stage.
>>>>
>>>>   The last one is possible, if for each comparison stage we
>>>> implement has
>>>>   next properties:
>>>>   * reflexivity (a <= a, a == a, a >= a),
>>>>   * antisymmetry (if a <= b and b <= a then a  == b),
>>>>   * transitivity (a <= b and b <= c, then a <= c)
>>>>   * asymmetry (if a < b, then a > b or a == b).
>>>>
>>>>   Once we have defined order relation we can store all the functions in
>>>>   binary tree and perform lookup in O(log(N)) time.
>>>>
>>>>   This post has two attachments:
>>>>   1. The patch, that has implementation of this idea.
>>>>   2. The MergeFunctions pass detailed description, with explanation how
>>>>   order relation could be possible.
>>>>
>>>>   Hope it helps to make things better!
>>>>
>>>>   -Stepan.
>>>>
>>>>   _______________________________________________
>>>>   LLVM Developers mailing list
>>>>   LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>   http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>