[LLVMdev] MergeFunctions: reduce complexity to O(log(N))
Stepan Dyatkovskiy
stpworld at narod.ru
Tue Jan 28 22:30:29 PST 2014
Hi Tobias,
> As you can see, we need to compute a similarity measure for each pair of
> functions in a bucket. If I understand correctly, your patch reduces
> compile-time by avoiding comparisons of functions that cannot be
> identical. That's exactly what you want to do if you only want to merge
> identical functions. However, because we also need to determine whether
> functions are merely *similar*, I can't see how your idea could be
> applied in that case.
Did you read my previous post to you? What do you think about the advantages a tree-based approach could bring to your patch, as explained there?
> Looking at your statistics, I see that you are considering a very large
> number of functions. I don't know which benchmarks you are using, but I
> doubt that many of these are worth merging. For instance, you rarely
> gain a code size benefit from merging functions with just a few
> instructions. Our patch takes care of this using thresholds. It's worth
> looking at actual code size reductions, rather than just numbers of
> functions merged/ compilation time spent comparing functions. It turns
> out that the number of functions in each bucket is really quite small
> once you apply some heuristics (and could still be further reduced!).
Based on my previous statistics, the average size of a merged function is about 10-20 instructions. In any case, I will present new statistics with sizes included, and I will also publish statistics for your approach.
--
Truly yours,
Stepan Dyatkovskiy
28.01.2014, 23:47, "Tobias von Koch" <tobias.von.koch at gmail.com>:
> Hi Stepan,
>
> Sorry for the delay. It's great that you are working on MergeFunctions
> as well and I agree, we should definitely try to combine our efforts to
> improve MergeFunctions.
>
> Just to give you some context, the pass (with the similar function
> merging patch) is already being used in a production setting. From my
> point of view, it would be better if we focus on improving its
> capability of merging functions at this stage rather than on reduction
> of compilation time.
>
> I'll give a brief overview of how our similar function merging algorithm
> works (you can also watch the presentation from the US LLVM conference
> to hear me explain it using an animation).
>
> 1. Hash all functions into buckets
>
> In each bucket, separately:
>
> 2. Compare functions pair-wise and determine a
> similarity metric for each pair (%age of equivalent instructions)
>
> 3. Merge identical functions (with similarity = 100%), update call
> sites for those functions.
>
> 4. If the updates of call sites have touched other functions,
> go back to step 2 and re-compare *only* those functions to all
> others in their buckets.
>
> Finally,
>
> 5. Form groups of similar functions for merging:
> a) Pick most similar pair (A,B)
> b) Find all functions C that are also similar to B but are not
> more similar to some other function D.
> c) Merge A with B and all the C's.
> Repeat until there are no more functions to merge.
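A rough C++ sketch of how I read steps 1 and 2 above; hashFunction and similarity are hypothetical placeholders for illustration only, not the actual code from the D2591 patch:

    #include <cstdint>
    #include <map>
    #include <vector>

    struct Function;                                          // stand-in for llvm::Function
    uint64_t hashFunction(const Function &F);                 // step 1: coarse hash
    double similarity(const Function &A, const Function &B);  // %age of equivalent instructions

    void compareBuckets(const std::vector<Function *> &Fns) {
      // Step 1: hash all functions into buckets.
      std::map<uint64_t, std::vector<Function *>> Buckets;
      for (Function *F : Fns)
        Buckets[hashFunction(*F)].push_back(F);

      // Step 2: within each bucket, compare functions pair-wise. This is the
      // quadratic part the rest of the thread is concerned with.
      for (auto &Entry : Buckets) {
        const std::vector<Function *> &Cands = Entry.second;
        for (size_t I = 0; I < Cands.size(); ++I)
          for (size_t J = I + 1; J < Cands.size(); ++J) {
            double S = similarity(*Cands[I], *Cands[J]);
            (void)S; // steps 3-5 (merging, call-site updates, grouping) consume this
          }
      }
    }
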
>
> As you can see, we need to compute a similarity measure for each pair of
> functions in a bucket. If I understand correctly, your patch reduces
> compile-time by avoiding comparisons of functions that cannot be
> identical. That's exactly what you want to do if you only want to merge
> identical functions. However, because we also need to determine whether
> functions are merely *similar*, I can't see how your idea could be
> applied in that case.
>
> Looking at your statistics, I see that you are considering a very large
> number of functions. I don't know which benchmarks you are using, but I
> doubt that many of these are worth merging. For instance, you rarely
> gain a code size benefit from merging functions with just a few
> instructions. Our patch takes care of this using thresholds. It's worth
> looking at actual code size reductions, rather than just numbers of
> functions merged/ compilation time spent comparing functions. It turns
> out that the number of functions in each bucket is really quite small
> once you apply some heuristics (and could still be further reduced!).
>
> Given your experience with MergeFunctions, it would be really great if
> you could review our patch and also try it out on your benchmarks.
>
> Tobias
>
> On 24/01/2014 19:11, Stepan Dyatkovskiy wrote:
>
>> Hi Tobias.
>>
>> So, what do you think?
>>
>> If it means too much extra work for your team, maybe I can help you
>> somehow? I really would like to.
>>
>> -Stepan
>>
>> Stepan Dyatkovskiy wrote:
>>> Hi Tobias,
>>>> I can't really see a way to combine our approach with your patch. What
>>>> are your thoughts?
>>> I think it is possible. Even more, we need to combine our efforts in
>>> order to bring this pass into real life.
>>> I have read your presentation file, but unfortunately have only skimmed
>>> your patch so far.
>>> How exactly do you scan functions in the 2nd stage? Could you explain the
>>> comparison workflow in a few words? Would it be possible to scan a binary
>>> tree instead of a hash table? ...OK.
>>>
>>> Here is how I see the modification. For now it is important to explain the
>>> idea, so consider the simplest case: you need to catch two functions that
>>> differ by a single instruction somewhere.
>>>
>>> 1. Imagine that the IR *module* contents are represented as a binary tree:
>>> Each path (trace) from the root to a terminal node is a function.
>>> Each node is a function primitive (an instruction opcode, for example).
>>> Down is the direction towards the terminal nodes, up is the direction
>>> towards the root.
>>>
>>> 2. Now you are standing on some node, and you have two directions:
>>> down-left and down-right.
>>>
>>> 3. If you can find two equal sub-traces below (one at the left and one at
>>> the right), then the only difference lies in this node, and we catch that
>>> case.
>>>
>>> 4. Even more: if the two subtrees at the left and at the right are equal,
>>> then you catch at once all the cases represented by these subtrees.
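To illustrate the trace-tree idea above, here is a toy C++ sketch (not LLVM code): functions reduced to opcode sequences are stored in a prefix tree, so two traces that diverge at one node but share identical subtrees below it are exactly the "single differing instruction" case from point 3. All names here are made up for illustration.

    #include <map>
    #include <memory>
    #include <vector>

    // Each node is one function "primitive" (e.g. an instruction opcode).
    struct TraceNode {
      std::map<unsigned, std::unique_ptr<TraceNode>> Children;
      bool IsFunctionEnd = false; // a complete root-to-node trace is a function
    };

    // Insert one function, reduced to its sequence of opcodes, as a trace.
    void insertTrace(TraceNode &Root, const std::vector<unsigned> &Opcodes) {
      TraceNode *N = &Root;
      for (unsigned Op : Opcodes) {
        std::unique_ptr<TraceNode> &Child = N->Children[Op];
        if (!Child)
          Child = std::make_unique<TraceNode>();
        N = Child.get();
      }
      N->IsFunctionEnd = true;
    }
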
>>>
>>> I still haven't looked at your patch carefully, sorry. But I hope this
>>> helps; I'll try to look at it soon, and perhaps the solution I gave in
>>> this post is not the best one.
>>>
>>> -Stepan
>>>
>>> 22.01.2014, 20:53, "Tobias von Koch" <tobias.von.koch at gmail.com>:
>>>> Hi Stepan,
>>>>
>>>> As you've seen we have recently implemented a significant enhancement to
>>>> the MergeFunctions pass that also allows merging of functions that are
>>>> only similar but not identical
>>>> (http://llvm-reviews.chandlerc.com/D2591).
>>>>
>>>> Our patch also changes the way that functions are compared quite
>>>> significantly. This is necessary because we need to compare all
>>>> functions in a bucket in order to get a similarity measure for each
>>>> pair, so we can then decide which 'groups' of functions to merge.
>>>>
>>>> I can't really see a way to combine our approach with your patch. What
>>>> are your thoughts?
>>>>
>>>> Another way to improve the performance of MergeFunctions might be to
>>>> make the hash function better. I've already added the size of the first
>>>> BB to it, and perhaps there are other factors we could try... if we
>>>> don't have to compare functions in the first place (because they're in
>>>> different buckets) then that's obviously a much bigger win.
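A minimal sketch of the kind of coarse hash being described; the particular factors used here (argument count, number of blocks, size of the first basic block) are assumptions for illustration, not necessarily what the pass uses:

    #include "llvm/ADT/Hashing.h"
    #include "llvm/IR/Function.h"

    static llvm::hash_code coarseFunctionHash(const llvm::Function &F) {
      // Cheap, conservative signals only: functions that could be identical
      // must land in the same bucket, so nothing order- or name-sensitive.
      return llvm::hash_combine(F.arg_size(),
                                F.size(),                           // number of basic blocks
                                F.empty() ? 0u : F.front().size()); // size of the first BB
    }
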
>>>>
>>>> Thanks,
>>>> Tobias
>>>>
>>>> On 17/01/2014 20:25, Stepan Dyatkovskiy wrote:
>>>>> Hi all,
>>>>>
>>>>> I propose a simple improvement for the MergeFunctions pass that reduces
>>>>> its complexity from O(N^2) to O(log(N)), where N is the number of
>>>>> functions in the module.
>>>>>
>>>>> The idea is to change the result of the comparison from "bool" to
>>>>> "-1,0,1". In other words: define an order relation on the set of
>>>>> functions. To be sure that functions can be compared this way, we have
>>>>> to show that an order relation is possible at every comparison stage.
>>>>>
>>>>> The latter is possible if the relation we implement for each comparison
>>>>> stage has the following properties:
>>>>> * reflexivity (a <= a, a == a, a >= a),
>>>>> * antisymmetry (if a <= b and b <= a, then a == b),
>>>>> * transitivity (if a <= b and b <= c, then a <= c),
>>>>> * asymmetry (if a < b, then neither a > b nor a == b).
>>>>>
>>>>> Once we have defined the order relation, we can store all the functions
>>>>> in a binary tree and perform each lookup in O(log(N)) time.
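A minimal sketch of how such an ordering lets a std::map (a red-black tree) replace the pair-wise scan; compareFunctions stands in for the three-way comparison in the attached patch, whose real name and signature may differ:

    #include <map>

    struct Function;                                            // stand-in for llvm::Function
    int compareFunctions(const Function *L, const Function *R); // returns -1, 0 or 1

    struct FunctionOrder {
      bool operator()(const Function *L, const Function *R) const {
        return compareFunctions(L, R) < 0;
      }
    };

    // Functions that compare equal (result 0) collide on the same key, so
    // detecting that an identical function already exists is one O(log(N))
    // tree lookup instead of a linear scan over the whole bucket.
    using FunctionTree = std::map<const Function *, const Function *, FunctionOrder>;

    bool insertOrFindIdentical(FunctionTree &Tree, const Function *F) {
      auto Result = Tree.insert({F, F});
      return !Result.second; // true => an identical function was already present
    }
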
>>>>>
>>>>> This post has two attachments:
>>>>> 1. The patch that implements this idea.
>>>>> 2. A detailed description of the MergeFunctions pass, with an explanation
>>>>> of how the order relation is possible.
>>>>>
>>>>> Hope it helps to make things better!
>>>>>
>>>>> -Stepan.
>>>>>