[LLVMdev] MergeFunctions: reduce complexity to O(log(N))
Stepan Dyatkovskiy
stpworld at narod.ru
Mon Feb 17 06:34:34 PST 2014
ping
Stepan Dyatkovskiy wrote:
> ping
> Stepan Dyatkovskiy wrote:
>> Hi all,
>>
>> Previous patch has been split onto series of small changes.
>> On each stage (after each patch) MergeFunctions pass is compilable and
>> stable.
>>
>> Please find patches in attachment for review.
>>
>> To apply all patches at once, use "apply-patches.sh" script as follows:
>> 0. Place "apply-patches.sh" in same directory with patches.
>> 1. cd <llvm-sources-dir>
>> 2. "bash <path-to-patches-dir>/apply-patches.sh"
>>
>> -Stepan
>>
>> Stepan Dyatkovskiy wrote:
>>> Hi Nick,
>>> About partition refinement.
>>>
>>> It looks like patch proposed here allows to apply your idea with
>>> painless way (as series of small changes actually).
>>>
>>> If I got it right, you would like to introduce the next
>>> partition-refinement.
>>>
>>> Initial partition P consists of only one functions class X0 (whole
>>> module, single bucket). Suppose S is subset of X0, so S represents
>>> functions with some trait that is absent in rest part of X0.
>>>
>>> Then refinement looks like below:
>>>
>>> P`=refine(P, S):
>>> X`[0] = intersection(X[0], S)
>>> X`[1] = difference(X[0], S) // X0 without S0
>>>
>>> so P`={X`[0], X`[1]}
>>>
>>> And actually X`[0] and X`[1] are classes of functions that differs with
>>> trait defined by S.
>>>
>>> Then (suppose, we need to introduce new trait, so it would be S`):
>>>
>>> P``=refine(P`, S`), so each X`` could be defined like below:
>>> for each X`[i] in P`:
>>> X``[i*2] = intersection(X`[i], S`)
>>> X``[i*2+1] = difference(X`[i], S`)
>>>
>>> and so on, until all the traits are over.
>>>
>>> Finally we got the classes (set of Xs) that consists of equivalent
>>> functions. So, we can merge functions inside the class bounds.
>>>
>>> Am I right?
>>>
>>> Patch, that was proposed here could be represented as such partition
>>> refinement then:
>>>
>>> On each refinement stage, you need to introduce S somehow, the algorithm
>>> that answers the question: "what sort of difference we want to check in
>>> this bucket?"
>>>
>>> In existing implementation we check traits already, but pair-wise
>>> (attributes, calling convention, number of arguments, number of basic
>>> blocks, etc.). How to to use the same traits in partition refinement?
>>>
>>> Consider next example. Suppose we want to split some bucket X:
>>>
>>> 0. Take some trait from *pair-wise* comparison. For example: number of
>>> arguments.
>>> 1. Take some function from bucket X, let it be F1. Let F1 has N1 number
>>> of arguments.
>>> 2. Then S are functions with number of arguments *less* than N1.
>>> 3. Do refinement by S. As result we got X`[0] and X`[1].
>>> 4. If both of X`[0] and X`[1] are not empty, then repeat #1-#3 for them.
>>> 5. If some of X`[0] of X`[1] is empty set, then all functions in X are
>>> the same in context of trait S. We need another trait, go to #0. And
>>> repeat #0-#5 for all X` buckets we have.
>>>
>>> Finally we got the groups of equivalent functions. And actually these
>>> groups would be organized in binary tree.
>>> Most of stuff is implemented in std::map. I showed it as #0-#5 just to
>>> put it into context of your idea. The only problem was to introduce way
>>> of how to define order-relation (or how to define S). But this problem
>>> has been solved.
>>> And on output we got groups of equal functions.
>>>
>>> Fix me if I'm wrong. But it looks like every advantage given by
>>> partition refinement is present here.
>>>
>>> As I said in beginning, it is way of how to commit things you proposed
>>> with minimum of changes. We will use old comparison routines, only thing
>>> we need to change is return type (from bool to -1,0,1).
>>> In this way, partition refinement could be committed as series of small
>>> changes: one comparison routine per commit.
>>>
>>> Every change in MergeFunctions from now could be well checked by this
>>> builder:
>>> Now we have special builder for MergeFunctions. It runs test-suite with
>>> forced MergeFunctions pass:
>>> http://lab.llvm.org:8011/builders/clang-mergefunc-x86_64-freeBSD9.2
>>> Currently, MergeFunctions pass is stable, builder is green and passes
>>> all test-suite programs.
>>>
>>> -Stepan.
>>>
>>> Stepan Dyatkovskiy wrote:
>>>> Hi all,
>>>> Please find the updated patch in attachment:
>>>> * Added some comments.
>>>> * Fixed some typos.
>>>>
>>>> -Stepan
>>>>
>>>> Nick Lewycky wrote:
>>>>> On 30 January 2014 01:27, Stepan Dyatkovskiy <stpworld at narod.ru
>>>>> <mailto:stpworld at narod.ru>> wrote:
>>>>>
>>>>> Hello Sean and Tobias,
>>>>>
>>>>> Sean,
>>>>> Thank you. Could you describe Nick's ideas in few words or give me
>>>>> links to your discussion, so I could adapt my ideas to it.
>>>>>
>>>>>
>>>>> Sorry I haven't had time to write it up. The essential idea is that we
>>>>> use partition-refinement to determine the differences between a
>>>>> group of
>>>>> functions instead of using pair-wise comparison.
>>>>>
>>>>> Imagine you want to compare long sequences of numbers. You can grab a
>>>>> pair of sequences and run down them until they differ or you reach the
>>>>> end, and move on to another pair of sequences. With partition
>>>>> refinement
>>>>> you start with the first number in all the sequences and bucket them.
>>>>> Say, three 1's, two 2's, and an 8. You've just formed 3 buckets. For
>>>>> each bucket (with more than 1 member), do it again with the next
>>>>> number
>>>>> in the sequence. This gets you an equivalence comparison at less than
>>>>> the complexity of pairwise comparisons. (The buckets are partitions,
>>>>> and
>>>>> you're refining them as you read more data. Partition refinement is
>>>>> the
>>>>> dual of union-find, which we use quite heavily in the compiler.)
>>>>>
>>>>> I haven't figured out how this stacks up against Stepan's sorting with
>>>>> <=> comparison. I can probably do that, I just haven't spent the time
>>>>> thinking about it.
>>>>>
>>>>> I also haven't thought through how to integrate it with "near miss"
>>>>> comparison. My current leading idea is that you do the same thing, but
>>>>> make a record of what makes two buckets similar. (Note that the wrong
>>>>> solution is to store a per-function-pair list of
>>>>> similarities/differences.)
>>>>>
>>>>> Nick
>>>>>
>>>>> Tobias,
>>>>> Your patch fails on several modules in my benchmark (73 of ~1800
>>>>> tests). I have sent one as attachment.
>>>>>
>>>>> See statistics files for more details, all the .ll files you could
>>>>> simply find in test-suite object directory (after "make
>>>>> TEST=nightly
>>>>> report").
>>>>> I have attached script that launches benchmark. Example of usage
>>>>> (you need Linux OS):
>>>>> bash test-scipt.sh report.txt opt-no-patch opt-with-patch
>>>>>
>>>>> Some numbers.
>>>>>
>>>>> Total number of functions: 52715
>>>>>
>>>>> Below, by default, I have excluded modules where your patch fails.
>>>>>
>>>>> Functions merged
>>>>> Original version: 1113
>>>>> Order relation patch: 1112
>>>>> Similar merging patch: 541
>>>>>
>>>>> Functions merged (including failed tests)
>>>>> Original version: 8398
>>>>> Order relation patch: 8395
>>>>>
>>>>> Summary files size
>>>>> Initial: 163595634 bytes.
>>>>> After merging with order relation patch: 163147731 bytes.
>>>>> After similar merging patch: 162491905 bytes.
>>>>>
>>>>> Summary files size (including failed tests)
>>>>> Initial: 250242266 bytes.
>>>>> Original version: 247168198 bytes.
>>>>> Order relation: 247175650 bytes.
>>>>>
>>>>> Time. I measured with "time" utility, and used "real (wall clock)
>>>>> time used by the process".
>>>>>
>>>>> Summary time spent
>>>>> With existing version: 28.05 secs.
>>>>> With order relation patch: 28.19 secs.
>>>>> Similar merging patch: 28.61 secs.
>>>>>
>>>>> Summary time spent (including failed tests)
>>>>> With existing version: 41.74 secs.
>>>>> With order relation patch: 36.35 secs.
>>>>>
>>>>> -Stepan
>>>>>
>>>>> Sean Silva wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Jan 28, 2014 at 2:47 PM, Tobias von Koch
>>>>> <tobias.von.koch at gmail.com <mailto:tobias.von.koch at gmail.com>
>>>>> <mailto:tobias.von.koch at gmail.__com
>>>>> <mailto:tobias.von.koch at gmail.com>>> wrote:
>>>>>
>>>>> Hi Stepan,
>>>>>
>>>>> Sorry for the delay. It's great that you are working on
>>>>> MergeFunctions as well and I agree, we should definitely
>>>>> try to
>>>>> combine our efforts to improve MergeFunctions.
>>>>>
>>>>> Just to give you some context, the pass (with the similar
>>>>> function
>>>>> merging patch) is already being used in a production
>>>>> setting. From
>>>>> my point of view, it would be better if we focus on
>>>>> improving its
>>>>> capability of merging functions at this stage rather
>>>>> than on
>>>>> reduction of compilation time.
>>>>>
>>>>> I'll give a brief overview of how our similar function
>>>>> merging
>>>>> algorithm works (you can also watch the presentation from
>>>>> the US
>>>>> LLVM conference to hear me explain it using an
>>>>> animation).
>>>>>
>>>>> 1. Hash all functions into buckets
>>>>>
>>>>> In each bucket, separately:
>>>>>
>>>>> 2. Compare functions pair-wise and determine a
>>>>> similarity metric for each pair (%age of equivalent
>>>>> instructions)
>>>>>
>>>>> 3. Merge identical functions (with similarity = 100%),
>>>>> update call
>>>>> sites for those functions.
>>>>>
>>>>> 4. If the updates of call sites have touched other
>>>>> functions,
>>>>> go back to step 2 and re-compare *only* those
>>>>> functions to all
>>>>> others in their buckets.
>>>>>
>>>>> Finally,
>>>>>
>>>>> 5. Form groups of similar functions for merging:
>>>>> a) Pick most similar pair (A,B)
>>>>> b) Find all functions C that are also similar to B
>>>>> but
>>>>> are not
>>>>> more similar to some other function D.
>>>>> c) Merge A with B and all the C's.
>>>>> Repeat until there are no more functions to merge.
>>>>>
>>>>> As you can see, we need to compute a similarity measure
>>>>> for
>>>>> each
>>>>> pair of functions in a bucket. If I understand correctly,
>>>>> your patch
>>>>> reduces compile-time by avoiding comparisons of functions
>>>>> that
>>>>> cannot be identical. That's exactly what you want to
>>>>> do if
>>>>> you only
>>>>> want to merge identical functions. However, because we
>>>>> also
>>>>> need to
>>>>> determine whether functions are merely *similar*, I can't
>>>>> see how
>>>>> your idea could be applied in that case.
>>>>>
>>>>>
>>>>> Yeah, the existing pass only tries to merge identical
>>>>> functions
>>>>> (identical in machine code). I'm wondering if the "similar"
>>>>> function
>>>>> merging would be better as a separate pass (I'm genuinely
>>>>> wondering; I
>>>>> have no clue). I would wait for Nick's feedback.
>>>>>
>>>>> -- Sean Silva
>>>>>
>>>>>
>>>>> Looking at your statistics, I see that you are
>>>>> considering
>>>>> a very
>>>>> large number of functions. I don't know which benchmarks
>>>>> you are
>>>>> using, but I doubt that many of these are worth merging.
>>>>> For
>>>>> instance, you rarely gain a code size benefit from
>>>>> merging
>>>>> functions
>>>>> with just a few instructions. Our patch takes care of
>>>>> this
>>>>> using
>>>>> thresholds. It's worth looking at actual code size
>>>>> reductions,
>>>>> rather than just numbers of functions merged/ compilation
>>>>> time spent
>>>>> comparing functions. It turns out that the number of
>>>>> functions in
>>>>> each bucket is really quite small once you apply some
>>>>> heuristics
>>>>> (and could still be further reduced!).
>>>>>
>>>>> Given your experience with MergeFunctions, it would be
>>>>> really great
>>>>> if you could review our patch and also try it out on your
>>>>> benchmarks.
>>>>>
>>>>> Tobias
>>>>>
>>>>>
>>>>> On 24/01/2014 19:11, Stepan Dyatkovskiy wrote:
>>>>>
>>>>> Hi Tobias.
>>>>>
>>>>> So, what do you think?
>>>>>
>>>>> If it means to much extra-job for your team, may be I
>>>>> can help you
>>>>> somehow? I really would like to.
>>>>>
>>>>> -Stepan
>>>>>
>>>>> Stepan Dyatkovskiy wrote:
>>>>>
>>>>> Hi Tobias,
>>>>>
>>>>> I can't really see a way to combine our
>>>>> approach with
>>>>> your patch. What
>>>>> are your thoughts?
>>>>>
>>>>>
>>>>> I think it is possible. Even more - we need to
>>>>> combine our
>>>>> efforts, in
>>>>> order to bring this pass into real live.
>>>>> I'have read your presentation file, and
>>>>> unfortunately read
>>>>> your patch
>>>>> only a little.
>>>>> How exactly you scan functions on 2nd stage?
>>>>> Could you
>>>>> explain the
>>>>> algorithm in few words, how you compare workflow?
>>>>> Is it
>>>>> possible to
>>>>> scan binary tree instead of hash table? ...OK.
>>>>>
>>>>> That's how I see the modification. Now its
>>>>> important to
>>>>> explain idea,
>>>>> so consider the simplest case: you need to catch
>>>>> two
>>>>> functions that
>>>>> are differs with single instruction somewhere.
>>>>>
>>>>> 1. Imagine, that IR *module* contents is
>>>>> represented as
>>>>> binary tree:
>>>>> Each line (trace) from root to terminal node is a
>>>>> function.
>>>>> Each node - is function's primitive (instruction
>>>>> opcode, for
>>>>> example).
>>>>> Down - is direction to terminal nodes, up - is
>>>>> direction to
>>>>> the root.
>>>>>
>>>>> 2. Now you are standing on some node. And you
>>>>> have two
>>>>> directions
>>>>> down-left and down-right.
>>>>>
>>>>> 3. If you are able to find two equal sub-traces
>>>>> down (at
>>>>> left and at
>>>>> right), then the only difference lies in this
>>>>> node.
>>>>> Then we
>>>>> catch that
>>>>> case.
>>>>>
>>>>> 4. Even more, if two subtrees at left and at
>>>>> right are
>>>>> equal, than you
>>>>> catch at once all the cases that are
>>>>> represented by
>>>>> these
>>>>> subtrees.
>>>>>
>>>>> I still didn't look at you patch carefully.
>>>>> Sorry..
>>>>> But I
>>>>> hope that
>>>>> helps, and I'll try look at it in nearest time
>>>>> and
>>>>> perhaps
>>>>> its not the
>>>>> best solution I gave in this post.
>>>>>
>>>>> -Stepan
>>>>>
>>>>> 22.01.2014, 20:53, "Tobias von Koch"
>>>>> <tobias.von.koch at gmail.com
>>>>> <mailto:tobias.von.koch at gmail.com>
>>>>> <mailto:tobias.von.koch at gmail.__com
>>>>> <mailto:tobias.von.koch at gmail.com>>>:
>>>>>
>>>>>
>>>>> Hi Stepan,
>>>>>
>>>>> As you've seen we have recently implemented a
>>>>> significant enhancement to
>>>>> the MergeFunctions pass that also allows
>>>>> merging of
>>>>> functions that are
>>>>> only similar but not identical
>>>>> (http://llvm-reviews.__chandle__rc.com/D2591
>>>>> <http://chandlerc.com/D2591>
>>>>> <http://llvm-reviews.__chandlerc.com/D2591
>>>>> <http://llvm-reviews.chandlerc.com/D2591>>).
>>>>>
>>>>>
>>>>> Our patch also changes the way that functions
>>>>> are
>>>>> compared quite
>>>>> significantly. This is necessary because we
>>>>> need to
>>>>> compare all
>>>>> functions in a bucket in order to get a
>>>>> similarity
>>>>> measure for each
>>>>> pair, so we can then decide which 'groups' of
>>>>> functions
>>>>> to merge.
>>>>>
>>>>> I can't really see a way to combine our
>>>>> approach with
>>>>> your patch. What
>>>>> are your thoughts?
>>>>>
>>>>> Another way to improve the performance of
>>>>> MergeFunctions
>>>>> might be to
>>>>> make the hash function better. I've already
>>>>> added the
>>>>> size of the first
>>>>> BB to it, and perhaps there are other factors
>>>>> we could
>>>>> try... if we
>>>>> don't have to compare functions in the first
>>>>> place
>>>>> (because they're in
>>>>> different buckets) then that's obviously a
>>>>> much
>>>>> bigger win.
>>>>>
>>>>> Thanks,
>>>>> Tobias
>>>>>
>>>>> On 17/01/2014 20:25, Stepan Dyatkovskiy
>>>>> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I propose simple improvement for
>>>>> MergeFunctions
>>>>> pass, that reduced
>>>>> its
>>>>> complexity from O(N^2) to O(log(N)),
>>>>> where N is
>>>>> number of
>>>>> functions in
>>>>> module.
>>>>>
>>>>> The idea, is to replace the result of
>>>>> comparison
>>>>> from "bool" to
>>>>> "-1,0,1". In another words: define
>>>>> order
>>>>> relation
>>>>> on functions set.
>>>>> To be sure, that functions could be
>>>>> comparable
>>>>> that way, we have to
>>>>> prove that order relation is possible
>>>>> on all
>>>>> comparison stage.
>>>>>
>>>>> The last one is possible, if for each
>>>>> comparison
>>>>> stage we
>>>>> implement has
>>>>> next properties:
>>>>> * reflexivity (a <= a, a == a, a >=
>>>>> a),
>>>>> * antisymmetry (if a <= b and b <= a
>>>>> then a == b),
>>>>> * transitivity (a <= b and b <= c,
>>>>> then
>>>>> a <= c)
>>>>> * asymmetry (if a < b, then a > b or a
>>>>> == b).
>>>>>
>>>>> Once we have defined order relation we
>>>>> can store
>>>>> all the functions in
>>>>> binary tree and perform lookup in
>>>>> O(log(N)) time.
>>>>>
>>>>> This post has two attachments:
>>>>> 1. The patch, that has
>>>>> implementation of
>>>>> this idea.
>>>>> 2. The MergeFunctions pass detailed
>>>>> description,
>>>>> with explanation how
>>>>> order relation could be possible.
>>>>>
>>>>> Hope it helps to make things better!
>>>>>
>>>>> -Stepan.
>>>>>
>>>>>
>>>>> ___________________________________________________
>>>>> LLVM Developers mailing list
>>>>> LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>
>>>>> <mailto:LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>>
>>>>> http://llvm.cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/____mailman/listinfo/llvmdev
>>>>> <http://lists.cs.uiuc.edu/__mailman/listinfo/llvmdev>
>>>>>
>>>>> <http://lists.cs.uiuc.edu/__mailman/listinfo/llvmdev
>>>>> <http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>>
>>>>>
>>>>>
>>>>>
>>>>> ___________________________________________________
>>>>> LLVM Developers mailing list
>>>>> LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>
>>>>> <mailto:LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>>
>>>>> http://llvm.cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/____mailman/listinfo/llvmdev
>>>>> <http://lists.cs.uiuc.edu/__mailman/listinfo/llvmdev>
>>>>> <http://lists.cs.uiuc.edu/__mailman/listinfo/llvmdev
>>>>> <http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>
>>>>> http://llvm.cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>>
>>
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>
More information about the llvm-dev
mailing list