[LLVMdev] MergeFunctions: reduce complexity to O(log(N))
Stepan Dyatkovskiy
stpworld at narod.ru
Tue Jan 28 22:30:29 PST 2014
Hi Tobias,
> As you can see, we need to compute a similarity measure for each pair of
> functions in a bucket. If I understand correctly, your patch reduces
> compile-time by avoiding comparisons of functions that cannot be
> identical. That's exactly what you want to do if you only want to merge
> identical functions. However, because we also need to determine whether
> functions are merely *similar*, I can't see how your idea could be
> applied in that case.
Did you read my previous post to you? What do you think about the advantages a tree-based approach could bring to your patch, as explained there?
> Looking at your statistics, I see that you are considering a very large
> number of functions. I don't know which benchmarks you are using, but I
> doubt that many of these are worth merging. For instance, you rarely
> gain a code size benefit from merging functions with just a few
> instructions. Our patch takes care of this using thresholds. It's worth
> looking at actual code size reductions, rather than just numbers of
> functions merged/ compilation time spent comparing functions. It turns
> out that the number of functions in each bucket is really quite small
> once you apply some heuristics (and could still be further reduced!).
Based on my previous statistics, the average size of a merged function is about 10-20 instructions. In any case, I will present new statistics with sizes included, and I will also publish statistics for your approach.
--
Truly yours,
Stepan Dyatkovskiy
28.01.2014, 23:47, "Tobias von Koch" <tobias.von.koch at gmail.com>:
> Hi Stepan,
>
> Sorry for the delay. It's great that you are working on MergeFunctions
> as well and I agree, we should definitely try to combine our efforts to
> improve MergeFunctions.
>
> Just to give you some context, the pass (with the similar function
> merging patch) is already being used in a production setting. From my
> point of view, it would be better if we focus on improving its
> capability of merging functions at this stage rather than on reduction
> of compilation time.
>
> I'll give a brief overview of how our similar function merging algorithm
> works (you can also watch the presentation from the US LLVM conference
> to hear me explain it using an animation).
>
> 1. Hash all functions into buckets
>
> In each bucket, separately:
>
> 2. Compare functions pair-wise and determine a
> similarity metric for each pair (%age of equivalent instructions)
>
> 3. Merge identical functions (with similarity = 100%), update call
> sites for those functions.
>
> 4. If the updates of call sites have touched other functions,
> go back to step 2 and re-compare *only* those functions to all
> others in their buckets.
>
> Finally,
>
> 5. Form groups of similar functions for merging:
> a) Pick most similar pair (A,B)
> b) Find all functions C that are also similar to B but are not
> more similar to some other function D.
> c) Merge A with B and all the C's.
> Repeat until there are no more functions to merge.
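A rough C++ sketch of how I read steps 1 and 2 above; hashFunction and similarity are hypothetical placeholders for illustration only, not the actual code from the D2591 patch:

    #include <cstdint>
    #include <map>
    #include <vector>

    struct Function;                                          // stand-in for llvm::Function
    uint64_t hashFunction(const Function &F);                 // step 1: coarse hash
    double similarity(const Function &A, const Function &B);  // %age of equivalent instructions

    void compareBuckets(const std::vector<Function *> &Fns) {
      // Step 1: hash all functions into buckets.
      std::map<uint64_t, std::vector<Function *>> Buckets;
      for (Function *F : Fns)
        Buckets[hashFunction(*F)].push_back(F);

      // Step 2: within each bucket, compare functions pair-wise. This is the
      // quadratic part the rest of the thread is concerned with.
      for (auto &Entry : Buckets) {
        const std::vector<Function *> &Cands = Entry.second;
        for (size_t I = 0; I < Cands.size(); ++I)
          for (size_t J = I + 1; J < Cands.size(); ++J) {
            double S = similarity(*Cands[I], *Cands[J]);
            (void)S; // steps 3-5 (merging, call-site updates, grouping) consume this
          }
      }
    }
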
>
> As you can see, we need to compute a similarity measure for each pair of
> functions in a bucket. If I understand correctly, your patch reduces
> compile-time by avoiding comparisons of functions that cannot be
> identical. That's exactly what you want to do if you only want to merge
> identical functions. However, because we also need to determine whether
> functions are merely *similar*, I can't see how your idea could be
> applied in that case.
>
> Looking at your statistics, I see that you are considering a very large
> number of functions. I don't know which benchmarks you are using, but I
> doubt that many of these are worth merging. For instance, you rarely
> gain a code size benefit from merging functions with just a few
> instructions. Our patch takes care of this using thresholds. It's worth
> looking at actual code size reductions, rather than just numbers of
> functions merged/ compilation time spent comparing functions. It turns
> out that the number of functions in each bucket is really quite small
> once you apply some heuristics (and could still be further reduced!).
>
> Given your experience with MergeFunctions, it would be really great if
> you could review our patch and also try it out on your benchmarks.
>
> Tobias
>
> On 24/01/2014 19:11, Stepan Dyatkovskiy wrote:
>
>> Hi Tobias.
>>
>> So, what do you think?
>>
>> If it means too much extra work for your team, maybe I can help you
>> somehow? I really would like to.
>>
>> -Stepan
>>
>> Stepan Dyatkovskiy wrote:
>>> Hi Tobias,
>>>> I can't really see a way to combine our approach with your patch. What
>>>> are your thoughts?
>>> I think it is possible. Even more, we need to combine our efforts in
>>> order to bring this pass into real life.
>>> I have read your presentation file, but unfortunately have only skimmed
>>> your patch so far.
>>> How exactly do you scan functions in the 2nd stage? Could you explain the
>>> comparison workflow in a few words? Would it be possible to scan a binary
>>> tree instead of a hash table? ...OK.
>>>
>>> Here is how I see the modification. For now it is important to explain the
>>> idea, so consider the simplest case: you need to catch two functions that
>>> differ by a single instruction somewhere.
>>>
>>> 1. Imagine that the IR *module* contents are represented as a binary tree:
>>> Each path (trace) from the root to a terminal node is a function.
>>> Each node is a function primitive (an instruction opcode, for example).
>>> Down is the direction towards the terminal nodes, up is the direction
>>> towards the root.
>>>
>>> 2. Now you are standing on some node, and you have two directions:
>>> down-left and down-right.
>>>
>>> 3. If you can find two equal sub-traces below (one at the left and one at
>>> the right), then the only difference lies in this node, and we catch that
>>> case.
>>>
>>> 4. Even more: if the two subtrees at the left and at the right are equal,
>>> then you catch at once all the cases represented by these subtrees.
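To illustrate the trace-tree idea above, here is a toy C++ sketch (not LLVM code): functions reduced to opcode sequences are stored in a prefix tree, so two traces that diverge at one node but share identical subtrees below it are exactly the "single differing instruction" case from point 3. All names here are made up for illustration.

    #include <map>
    #include <memory>
    #include <vector>

    // Each node is one function "primitive" (e.g. an instruction opcode).
    struct TraceNode {
      std::map<unsigned, std::unique_ptr<TraceNode>> Children;
      bool IsFunctionEnd = false; // a complete root-to-node trace is a function
    };

    // Insert one function, reduced to its sequence of opcodes, as a trace.
    void insertTrace(TraceNode &Root, const std::vector<unsigned> &Opcodes) {
      TraceNode *N = &Root;
      for (unsigned Op : Opcodes) {
        std::unique_ptr<TraceNode> &Child = N->Children[Op];
        if (!Child)
          Child = std::make_unique<TraceNode>();
        N = Child.get();
      }
      N->IsFunctionEnd = true;
    }
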
>>>
>>> I still haven't looked at your patch carefully, sorry. But I hope this
>>> helps; I'll try to look at it soon, and perhaps the solution I gave in
>>> this post is not the best one.
>>>
>>> -Stepan
>>>
>>> 22.01.2014, 20:53, "Tobias von Koch" <tobias.von.koch at gmail.com>:
>>>> Hi Stepan,
>>>>
>>>> As you've seen we have recently implemented a significant enhancement to
>>>> the MergeFunctions pass that also allows merging of functions that are
>>>> only similar but not identical
>>>> (http://llvm-reviews.chandlerc.com/D2591).
>>>>
>>>> Our patch also changes the way that functions are compared quite
>>>> significantly. This is necessary because we need to compare all
>>>> functions in a bucket in order to get a similarity measure for each
>>>> pair, so we can then decide which 'groups' of functions to merge.
>>>>
>>>> I can't really see a way to combine our approach with your patch. What
>>>> are your thoughts?
>>>>
>>>> Another way to improve the performance of MergeFunctions might be to
>>>> make the hash function better. I've already added the size of the first
>>>> BB to it, and perhaps there are other factors we could try... if we
>>>> don't have to compare functions in the first place (because they're in
>>>> different buckets) then that's obviously a much bigger win.
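A minimal sketch of the kind of coarse hash being described; the particular factors used here (argument count, number of blocks, size of the first basic block) are assumptions for illustration, not necessarily what the pass uses:

    #include "llvm/ADT/Hashing.h"
    #include "llvm/IR/Function.h"

    static llvm::hash_code coarseFunctionHash(const llvm::Function &F) {
      // Cheap, conservative signals only: functions that could be identical
      // must land in the same bucket, so nothing order- or name-sensitive.
      return llvm::hash_combine(F.arg_size(),
                                F.size(),                           // number of basic blocks
                                F.empty() ? 0u : F.front().size()); // size of the first BB
    }
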
>>>>
>>>> Thanks,
>>>> Tobias
>>>>
>>>> On 17/01/2014 20:25, Stepan Dyatkovskiy wrote:
>>>>> Hi all,
>>>>>
>>>>> I propose a simple improvement for the MergeFunctions pass that reduces
>>>>> its complexity from O(N^2) to O(log(N)), where N is the number of
>>>>> functions in the module.
>>>>>
>>>>> The idea is to change the result of the comparison from "bool" to
>>>>> "-1,0,1". In other words: define an order relation on the set of
>>>>> functions. To be sure that functions can be compared this way, we have
>>>>> to show that an order relation is possible at every comparison stage.
>>>>>
>>>>> The latter is possible if the relation we implement for each comparison
>>>>> stage has the following properties:
>>>>> * reflexivity (a <= a, a == a, a >= a),
>>>>> * antisymmetry (if a <= b and b <= a, then a == b),
>>>>> * transitivity (if a <= b and b <= c, then a <= c),
>>>>> * asymmetry (if a < b, then neither a > b nor a == b).
>>>>>
>>>>> Once we have defined the order relation, we can store all the functions
>>>>> in a binary tree and perform each lookup in O(log(N)) time.
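A minimal sketch of how such an ordering lets a std::map (a red-black tree) replace the pair-wise scan; compareFunctions stands in for the three-way comparison in the attached patch, whose real name and signature may differ:

    #include <map>

    struct Function;                                            // stand-in for llvm::Function
    int compareFunctions(const Function *L, const Function *R); // returns -1, 0 or 1

    struct FunctionOrder {
      bool operator()(const Function *L, const Function *R) const {
        return compareFunctions(L, R) < 0;
      }
    };

    // Functions that compare equal (result 0) collide on the same key, so
    // detecting that an identical function already exists is one O(log(N))
    // tree lookup instead of a linear scan over the whole bucket.
    using FunctionTree = std::map<const Function *, const Function *, FunctionOrder>;

    bool insertOrFindIdentical(FunctionTree &Tree, const Function *F) {
      auto Result = Tree.insert({F, F});
      return !Result.second; // true => an identical function was already present
    }
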
>>>>>
>>>>> This post has two attachments:
>>>>> 1. The patch that implements this idea.
>>>>> 2. A detailed description of the MergeFunctions pass, with an explanation
>>>>> of how the order relation is possible.
>>>>>
>>>>> Hope it helps to make things better!
>>>>>
>>>>> -Stepan.
>>>>>