[LLVMdev] MergeFunctions: reduce complexity to O(log(N))
Stepan Dyatkovskiy
stpworld at narod.ru
Wed Mar 5 23:51:28 PST 2014
ping
Stepan Dyatkovskiy wrote:
> Hi Nick,
>
> I tried to rework changes as you requested. One of patches (0004 with
> extra assertions) has been removed.
>
> > + bool isEquivalentType(Type *Ty1, Type *Ty2) const {
> > + return cmpType(Ty1, Ty2) == 0;
> > + }
> >
> > Why do we still need isEquivalentType? Can we nuke this?
> Yup. After applying all the patches isEquivalentType will be totally
> replaced with cmpType. All isEqXXXX and friends will be removed in 0011
> (old 0012). No traces left.
> The old function wasn't removed in 0001 simply to keep the patches free
> of extra noise like:
>
> - something that uses isEquivalentType
> + something that uses cmpType
>
> The point is that the "something" that uses isEquivalentType will also
> be replaced by one of the next patches in this series.
>
> >
> > +static int cmpNumbers(uint64_t L, uint64_t R) {
> > + if (L < R) return -1;
> > + if (L > R) return 1;
> > + return 0;
> > +}
> >
> > At a high level, you don't actually need a <=> operator to use a sort. A
> > strict ordering ( < operator) is sufficient. (Note that for mergefunc, a
> > strict weak ordering is not sufficient, it must be a total ordering.)
> That could be done with int FunctionComparator::compare(). We could
> replace it with bool FunctionComparator::less(). Though all the other
> cmp methods need at least two bits of information in their result:
> 1. Whether the things are equal.
> 2. Whether the left is less than the right.
>
> As for FunctionComparator::compare(), converting it into less() would
> increase the running time of the sanity check (patch #0010).
> The sanity check is just a sweet bonus: it verifies that the ordering
> is implemented properly (it checks the order-relation properties).
> Turning compare() into less() means that we would have to run the
> comparison twice: L.less(R) and R.less(L). But maybe the sanity check
> is not a thing to be published at all.
>
> >
> > Consider hoisting this inside the FunctionComparator class? That class
> > should have a bunch of implementations of comparisons between various
> > different things, which can pass down to other methods in the same
> class.
> In new patch series attached to this post, I have moved all static
> methods into FunctionComparator.
>
> > + // Replacement for type::canLosslesslyBitCastTo, that
> > + // establish order relation on this kind of properties.
> > + int checkForLosslessBitcast(const Type *L, const Type *R);
> >
> > Type:: not type:: . Please make this comment more descriptive.
> Done.
> [new comment]
> Replacement for Type::canLosslesslyBitCastTo, which establishes an
> order relation on this kind of property.
> Returns 0 if the L and R types could be converted to each other
> without reinterpretation of bits.
> Otherwise the method returns -1 or 1, defining a total ordering
> between types in the context of the lossless-bitcastability trait.
> E.g.: if L is less than R (the result is -1), then every type that
> could be losslessly bitcasted to L is less than R.
> [/new comment]
>
> >
> > + /// Replacement for C1 == getBitCast(C2, C1Ty)
> > + /// Its more controllable, and supposed to be simpler and more
> > predictionable.
> > + /// As very important advantage: it allows to introduce order
> relation on
> > + /// constants set, and thus use it as trait in refinement routines.
> >
> > "Its" --> "It's". "predictionable" --> "predictable". And how is it more
> > predictable? I think this comment would be better if it described the
> > function instead of making comparisons between it and other functions.
> > Something like, "Compare constants under a system where pointer to X and
> > pointer to Y are considered equal" or whatever is actually true here.
> Done.
> [new comment]
> Replacement for C1 == getBitCast(C2, C1Ty)
> Parses the constants' contents, assuming that the types can be
> losslessly bitcasted between each other. So it actually ignores the
> types and only compares the bits of L and R.
> Returns 0 if L and R have equivalent contents, and -1 or 1 if the
> values differ. Maintaining a total ordering requires two values that
> indicate non-equivalence (L less than R, L greater than R).
> [/new comment]
>
> >
> > +enum ConstantType {
> > I'm not sure that this buys you much. All the "isa" tests can be broken
> > down into a switch on getValueID() with the one exception of
> isNullValue().
> Done.
>
> > + assert(
> > + C1->getType()->canLosslesslyBitCastTo(C2->getType()) &&
> > + "Pass is healthless. checkForLosslessBitcast should be twin of "
> > + "canLosslesslyBitCastTo method, except the case when the last
> one "
> > + "returns false, the first one should return -1 or 1");
> ...
> > I think we can skip the asserts here. They aren't detecting a specific
> > bug, they're checking whether the new code does a certain task relative
> > to the old code. Drop the old code, your new code is the new sheriff in
> > town.
> The asserts have been removed.
>
> >
> > + DenseMap<const Value*, int> sn_map1, sn_map2;
> >
> > What is sn short for ("seen")? Why are there two of these?
> Serial number :-)
> >
> > + std::pair<DenseMap<const Value *, int>::iterator, bool>
> > + LeftSN = sn_map1.insert(std::make_pair(V1, sn_map1.size())),
> > + RightSN = sn_map2.insert(std::make_pair(V2, sn_map2.size()));
> >
> > So I think I get it, this is easy to reason about. You number each value
> > going down both functions. But I think you can eliminate one of these
> > maps because you know that if the left and right were ever different
> > then we terminate the comparison immediately. You can at least share one
> > map with both V1 and V2, but I think you can reduce the number of map
> > insertions.
> Not sure. Consider that on the left you have:
> %A = alloca i32
> %B = alloca i32
> store i32 1, i32* %A
>
> And on the right:
> %A = alloca i32
> %B = alloca i32
> store i32 1, i32* %B
>
> When we meet the 'store' instruction, we need to check in which order
> %A was allocated on the left and in which order %B was allocated on
> the right. So we have to assign a serial number to each value in the
> function. Then, when comparing local values from different functions,
> we can just check their serial numbers.
> Though maybe I missed something? Maybe there is another way to
> determine "who was first".
>
> > High-level question, are attributes ordered? Your comparison is ordered.
> > So if I have function F with string attributes "sse2", "gc=hemisphere"
> > and another function G with string attributes "gc=hemisphere", "sse2"
> > then they would be considered different. I don't think that's right.
> I didn't check it. But if they are not ordered, we could order them
> lexicographically.
>
> >
> > + int cmpOperation(const Instruction *I1, const Instruction *I2) const;
> > +
> > bool isEquivalentOperation(const Instruction *I1,
> > - const Instruction *I2) const;
> > + const Instruction *I2) const {
> > + return cmpOperation(I1, I2) == 0;
> > + }
> >
> > Hopefully this patch series ultimately eliminates calls of
> > isEquivalentOperation?
> Of course. Just like isEquivalentType.
>
> > By the way, this is an interesting one. Should "add x, y" and "add nsw
> > x, y" be equal or not?
> Yeah. Those are interesting questions. We could even treat 'mul a, 2'
> and 'shl a, 1' as equal in some cases. I think it's a matter of time
> and further optimization.
>
> Please find the first reworked patches attached.
>
> -Stepan
>
> Nick Lewycky wrote:
>> /// Compare two Types, treating all pointer types as equal.
>> - bool isEquivalentType(Type *Ty1, Type *Ty2) const;
>> + int cmpType(Type *Ty1, Type *Ty2) const;
>> +
>> + /// Compare two Types, treating all pointer types as equal.
>> + bool isEquivalentType(Type *Ty1, Type *Ty2) const {
>> + return cmpType(Ty1, Ty2) == 0;
>> + }
>>
>> Why do we still need isEquivalentType? Can we nuke this?
>>
>> +static int cmpNumbers(uint64_t L, uint64_t R) {
>> + if (L < R) return -1;
>> + if (L > R) return 1;
>> + return 0;
>> +}
>>
>> At a high level, you don't actually need a <=> operator to use a sort. A
>> strict ordering ( < operator) is sufficient. (Note that for mergefunc, a
>> strict weak ordering is not sufficient, it must be a total ordering.)
>>
>> Consider hoisting this inside the FunctionComparator class? That class
>> should have a bunch of implementations of comparisons between various
>> different things, which can pass down to other methods in the same class.
>>
>> + // Replacement for type::canLosslesslyBitCastTo, that
>> + // establish order relation on this kind of properties.
>> + int checkForLosslessBitcast(const Type *L, const Type *R);
>>
>> Type:: not type:: . Please make this comment more descriptive.
>>
>> + /// Replacement for C1 == getBitCast(C2, C1Ty)
>> + /// Its more controllable, and supposed to be simpler and more
>> predictionable.
>> + /// As very important advantage: it allows to introduce order
>> relation on
>> + /// constants set, and thus use it as trait in refinement routines.
>>
>> "Its" --> "It's". "predictionable" --> "predictable". And how is it more
>> predictable? I think this comment would be better if it described the
>> function instead of making comparisons between it and other functions.
>> Something like, "Compare constants under a system where pointer to X and
>> pointer to Y are considered equal" or whatever is actually true here.
>>
>> +enum ConstantType {
>>
>> I'm not sure that this buys you much. All the "isa" tests can be broken
>> down into a switch on getValueID() with the one exception of
>> isNullValue().
>>
>> + // Check whether C2 and C1 has equal bit patterns.
>> + if (checkForLosslessBitcast(C1->getType(), C2->getType()) != 0)
>> + return false;
>> +
>> + assert(
>> + C1->getType()->canLosslesslyBitCastTo(C2->getType()) &&
>> + "Pass is healthless. checkForLosslessBitcast should be twin of "
>> + "canLosslesslyBitCastTo method, except the case when the last
>> one "
>> + "returns false, the first one should return -1 or 1");
>> +
>> + // Compare constant contents
>> + if (cmpConstants(C1, C2) != 0)
>> + return false;
>> +
>> + assert(
>> + C1 ==
>> + ConstantExpr::getBitCast(const_cast<Constant*>(C2),
>> C1->getType()) &&
>> + "Pass is healthless. cmpConstants should detect *subset* of "
>> + "equivalent constants that could be detected by "
>> + "getBitCast method. So whereas cmpConstants returns 0 "
>> + "getBitCast also must return true. And wheres getBitCast "
>> + "returns false, cmpConstants also must return -1 or 1.");
>>
>> I think we can skip the asserts here. They aren't detecting a specific
>> bug, they're checking whether the new code does a certain task relative
>> to the old code. Drop the old code, your new code is the new sheriff in
>> town.
>>
>> + DenseMap<const Value*, int> sn_map1, sn_map2;
>>
>> What is sn short for ("seen")? Why are there two of these?
>>
>> + std::pair<DenseMap<const Value *, int>::iterator, bool>
>> + LeftSN = sn_map1.insert(std::make_pair(V1, sn_map1.size())),
>> + RightSN = sn_map2.insert(std::make_pair(V2, sn_map2.size()));
>>
>> So I think I get it, this is easy to reason about. You number each value
>> going down both functions. But I think you can eliminate one of these
>> maps because you know that if the left and right were ever different
>> then we terminate the comparison immediately. You can at least share one
>> map with both V1 and V2, but I think you can reduce the number of map
>> insertions.
>>
>> +// Mostly cloned from BitcodeWriter, but simplified a bit.
>>
>> Missing indent.
>>
>> +static int cmpAttrs(const AttributeSet L, const AttributeSet R) {
>>
>> High-level question, are attributes ordered? Your comparison is ordered.
>> So if I have function F with string attributes "sse2", "gc=hemisphere"
>> and another function G with string attributes "gc=hemisphere", "sse2"
>> then they would be considered different. I don't think that's right.
>>
>> + int cmpOperation(const Instruction *I1, const Instruction *I2) const;
>> +
>> bool isEquivalentOperation(const Instruction *I1,
>> - const Instruction *I2) const;
>> + const Instruction *I2) const {
>> + return cmpOperation(I1, I2) == 0;
>> + }
>>
>> Hopefully this patch series ultimately eliminates calls of
>> isEquivalentOperation?
>>
>> + if (int Res = cmpNumbers(I1->getRawSubclassOptionalData(),
>> + I2->getRawSubclassOptionalData()))
>>
>> By the way, this is an interesting one. Should "add x, y" and "add nsw
>> x, y" be equal or not? Depends whether the backend uses it, and whether
>> we're going to do more optimization afterwards (ie., this is the compile
>> step of an LTO build). Nothing for you to change here, just want you to
>> be aware that this is a point where we might want to have a flag and
>> make mergefunc behave differently in different compile strategies.
>>
>> (I stopped before getting to 0008.)
>>
>> Nick
>>
>>
>> On 3 February 2014 11:11, Stepan Dyatkovskiy <stpworld at narod.ru
>> <mailto:stpworld at narod.ru>> wrote:
>>
>> Hi all,
>>
>> Previous patch has been split onto series of small changes.
>> On each stage (after each patch) MergeFunctions pass is compilable
>> and stable.
>>
>> Please find patches in attachment for review.
>>
>>     To apply all patches at once, use the "apply-patches.sh" script
>>     as follows:
>>     0. Place "apply-patches.sh" in the same directory as the patches.
>>     1. cd <llvm-sources-dir>
>>     2. "bash <path-to-patches-dir>/apply-patches.sh"
>>
>> -Stepan
>>
>> Stepan Dyatkovskiy wrote:
>>
>> Hi Nick,
>> About partition refinement.
>>
>>
>>            It looks like the patch proposed here allows your idea to
>>            be applied in a painless way (actually, as a series of
>>            small changes).
>>
>>            If I got it right, you would like to introduce the
>>            following partition refinement.
>>
>>            The initial partition P consists of only one function
>>            class X0 (the whole module, a single bucket). Suppose S is
>>            a subset of X0, so S represents the functions with some
>>            trait that is absent from the rest of X0.
>>
>> Then refinement looks like below:
>>
>> P`=refine(P, S):
>> X`[0] = intersection(X[0], S)
>> X`[1] = difference(X[0], S) // X0 without S0
>>
>> so P`={X`[0], X`[1]}
>>
>>            And actually X`[0] and X`[1] are classes of functions
>>            that differ in the trait defined by S.
>>
>>            Then (suppose we need to introduce a new trait; call it
>>            S`):
>>
>> P``=refine(P`, S`), so each X`` could be defined like below:
>> for each X`[i] in P`:
>> X``[i*2] = intersection(X`[i], S`)
>> X``[i*2+1] = difference(X`[i], S`)
>>
>> and so on, until all the traits are over.
>>
>>            Finally we get the classes (the set of Xs) that consist
>>            of equivalent functions, so we can merge the functions
>>            within each class's bounds.
>>
>>            Am I right?
>>
>>            The patch proposed here could then be represented as such
>>            a partition refinement:
>>
>>            On each refinement stage, you need to introduce S somehow:
>>            the algorithm that answers the question "what sort of
>>            difference do we want to check in this bucket?"
>>
>>            In the existing implementation we already check traits,
>>            but pair-wise (attributes, calling convention, number of
>>            arguments, number of basic blocks, etc.). How do we use
>>            the same traits in partition refinement?
>>
>>            Consider the next example. Suppose we want to split some
>>            bucket X:
>>
>>            0. Take some trait from the *pair-wise* comparison, for
>>            example: number of arguments.
>>            1. Take some function from bucket X; let it be F1, with N1
>>            arguments.
>>            2. Then S is the set of functions with *fewer* than N1
>>            arguments.
>>            3. Do the refinement by S. As a result we get X`[0] and
>>            X`[1].
>>            4. If both X`[0] and X`[1] are non-empty, repeat #1-#3 for
>>            them.
>>            5. If either X`[0] or X`[1] is the empty set, then all
>>            functions in X are the same in the context of trait S. We
>>            need another trait: go to #0, and repeat #0-#5 for all the
>>            X` buckets we have.
>>
>>            Finally we get the groups of equivalent functions, and
>>            these groups end up organized in a binary tree.
>>            Most of this is implemented by std::map. I presented it as
>>            #0-#5 just to put it into the context of your idea. The
>>            only problem was how to define the order relation (i.e.,
>>            how to define S), but this problem has been solved.
>>            And on output we get groups of equal functions.
>>
>>            Correct me if I'm wrong, but it looks like every advantage
>>            given by partition refinement is present here.
>>
>>            As I said at the beginning, this is a way to commit the
>>            things you proposed with a minimum of changes. We will use
>>            the old comparison routines; the only thing we need to
>>            change is the return type (from bool to -1,0,1).
>>            This way, partition refinement could be committed as a
>>            series of small changes: one comparison routine per
>>            commit.
>>
>>            Every change in MergeFunctions from now on can be checked
>>            by this builder: we now have a special builder for
>>            MergeFunctions that runs the test-suite with the
>>            MergeFunctions pass forced on:
>>
>>            http://lab.llvm.org:8011/builders/clang-mergefunc-x86_64-freeBSD9.2
>>
>>            Currently the MergeFunctions pass is stable, the builder
>>            is green, and it passes all the test-suite programs.
>>
>> -Stepan.
>>
>> Stepan Dyatkovskiy wrote:
>>
>> Hi all,
>> Please find the updated patch in attachment:
>> * Added some comments.
>> * Fixed some typos.
>>
>> -Stepan
>>
>> Nick Lewycky wrote:
>>
>>            On 30 January 2014 01:27, Stepan Dyatkovskiy
>>            <stpworld at narod.ru> wrote:
>>
>>                Hello Sean and Tobias,
>>
>>                Sean,
>>                Thank you. Could you describe Nick's ideas in a few
>>                words or give me links to your discussion, so I could
>>                adapt my ideas to them.
>>
>>
>>            Sorry I haven't had time to write it up. The essential
>>            idea is that we use partition-refinement to determine the
>>            differences between a group of functions instead of using
>>            pair-wise comparison.
>>
>>            Imagine you want to compare long sequences of numbers.
>>            You can grab a pair of sequences and run down them until
>>            they differ or you reach the end, and move on to another
>>            pair of sequences. With partition refinement you start
>>            with the first number in all the sequences and bucket
>>            them. Say, three 1's, two 2's, and an 8. You've just
>>            formed 3 buckets. For each bucket (with more than 1
>>            member), do it again with the next number in the
>>            sequence. This gets you an equivalence comparison at less
>>            than the complexity of pairwise comparisons. (The buckets
>>            are partitions, and you're refining them as you read more
>>            data. Partition refinement is the dual of union-find,
>>            which we use quite heavily in the compiler.)
>>
>>            I haven't figured out how this stacks up against Stepan's
>>            sorting with <=> comparison. I can probably do that, I
>>            just haven't spent the time thinking about it.
>>
>>            I also haven't thought through how to integrate it with
>>            "near miss" comparison. My current leading idea is that
>>            you do the same thing, but make a record of what makes
>>            two buckets similar. (Note that the wrong solution is to
>>            store a per-function-pair list of
>>            similarities/differences.)
>>
>> Nick
>>
>>            Tobias,
>>            Your patch fails on several modules in my benchmark (73
>>            of ~1800 tests). I have sent one as an attachment.
>>
>>            See the statistics files for more details; you can simply
>>            find all the .ll files in the test-suite object directory
>>            (after "make TEST=nightly report").
>>            I have attached the script that launches the benchmark.
>>            Example of usage (you need a Linux OS):
>>            bash test-scipt.sh report.txt opt-no-patch opt-with-patch
>>
>>            Some numbers.
>>
>>            Total number of functions: 52715
>>
>>            Below, by default, I have excluded modules where your
>>            patch fails.
>>
>>            Functions merged
>>            Original version: 1113
>>            Order relation patch: 1112
>>            Similar merging patch: 541
>>
>>            Functions merged (including failed tests)
>>            Original version: 8398
>>            Order relation patch: 8395
>>
>>            Summary files size
>>            Initial: 163595634 bytes.
>>            After merging with order relation patch: 163147731 bytes.
>>            After similar merging patch: 162491905 bytes.
>>
>>            Summary files size (including failed tests)
>>            Initial: 250242266 bytes.
>>            Original version: 247168198 bytes.
>>            Order relation: 247175650 bytes.
>>
>>            Time. I measured with the "time" utility, and used the
>>            "real (wall clock) time used by the process".
>>
>>            Summary time spent
>>            With existing version: 28.05 secs.
>>            With order relation patch: 28.19 secs.
>>            Similar merging patch: 28.61 secs.
>>
>>            Summary time spent (including failed tests)
>>            With existing version: 41.74 secs.
>>            With order relation patch: 36.35 secs.
>>
>> -Stepan
>>
>>            Sean Silva wrote:
>>
>>            On Tue, Jan 28, 2014 at 2:47 PM, Tobias von Koch
>>            <tobias.von.koch at gmail.com> wrote:
>>
>>                Hi Stepan,
>>
>>                Sorry for the delay. It's great that you are working
>>                on MergeFunctions as well and I agree, we should
>>                definitely try to combine our efforts to improve
>>                MergeFunctions.
>>
>>                Just to give you some context, the pass (with the
>>                similar function merging patch) is already being used
>>                in a production setting. From my point of view, it
>>                would be better if we focus on improving its
>>                capability of merging functions at this stage rather
>>                than on reduction of compilation time.
>>
>>                I'll give a brief overview of how our similar
>>                function merging algorithm works (you can also watch
>>                the presentation from the US LLVM conference to hear
>>                me explain it using an animation).
>>
>>                1. Hash all functions into buckets
>>
>>                In each bucket, separately:
>>
>>                2. Compare functions pair-wise and determine a
>>                   similarity metric for each pair (%age of
>>                   equivalent instructions)
>>
>>                3. Merge identical functions (with similarity =
>>                   100%), update call sites for those functions.
>>
>>                4. If the updates of call sites have touched other
>>                   functions, go back to step 2 and re-compare *only*
>>                   those functions to all others in their buckets.
>>
>>                Finally,
>>
>>                5. Form groups of similar functions for merging:
>>                   a) Pick most similar pair (A,B)
>>                   b) Find all functions C that are also similar to B
>>                      but are not more similar to some other
>>                      function D.
>>                   c) Merge A with B and all the C's.
>>                   Repeat until there are no more functions to merge.
>>
>>                As you can see, we need to compute a similarity
>>                measure for each pair of functions in a bucket. If I
>>                understand correctly, your patch reduces compile-time
>>                by avoiding comparisons of functions that cannot be
>>                identical. That's exactly what you want to do if you
>>                only want to merge identical functions. However,
>>                because we also need to determine whether functions
>>                are merely *similar*, I can't see how your idea could
>>                be applied in that case.
>>
>>            Yeah, the existing pass only tries to merge identical
>>            functions (identical in machine code). I'm wondering if
>>            the "similar" function merging would be better as a
>>            separate pass (I'm genuinely wondering; I have no clue).
>>            I would wait for Nick's feedback.
>>
>>            -- Sean Silva
>>
>>                Looking at your statistics, I see that you are
>>                considering a very large number of functions. I don't
>>                know which benchmarks you are using, but I doubt that
>>                many of these are worth merging. For instance, you
>>                rarely gain a code size benefit from merging
>>                functions with just a few instructions. Our patch
>>                takes care of this using thresholds. It's worth
>>                looking at actual code size reductions, rather than
>>                just numbers of functions merged / compilation time
>>                spent comparing functions. It turns out that the
>>                number of functions in each bucket is really quite
>>                small once you apply some heuristics (and could still
>>                be further reduced!).
>>
>>                Given your experience with MergeFunctions, it would
>>                be really great if you could review our patch and
>>                also try it out on your benchmarks.
>>
>>                Tobias
>>
>>
>>                On 24/01/2014 19:11, Stepan Dyatkovskiy wrote:
>>
>>                    Hi Tobias.
>>
>>                    So, what do you think?
>>
>>                    If it means too much extra work for your team,
>>                    maybe I can help you somehow? I really would like
>>                    to.
>>
>>                    -Stepan
>>
>>                    Stepan Dyatkovskiy wrote:
>>
>>                        Hi Tobias,
>>
>>                            I can't really see a way to combine our
>>                            approach with your patch. What are your
>>                            thoughts?
>>
>>                        I think it is possible. Even more: we need to
>>                        combine our efforts in order to bring this
>>                        pass into real life.
>>                        I have read your presentation file, and
>>                        unfortunately read your patch only a little.
>>                        How exactly do you scan functions on the 2nd
>>                        stage? Could you explain the algorithm in a
>>                        few words: how do you compare the workflow?
>>                        Is it possible to scan a binary tree instead
>>                        of a hash table? ...OK.
>>
>>                        That's how I see the modification. For now it
>>                        is important to explain the idea, so consider
>>                        the simplest case: you need to catch two
>>                        functions that differ by a single instruction
>>                        somewhere.
>>
>>                        1. Imagine that the IR *module* contents are
>>                        represented as a binary tree: each line
>>                        (trace) from the root to a terminal node is a
>>                        function. Each node is a function's primitive
>>                        (instruction opcode, for example). Down is
>>                        the direction towards the terminal nodes, up
>>                        is the direction towards the root.
>>
>>                        2. Now you are standing on some node. And you
>>                        have two directions: down-left and
>>                        down-right.
>>
>>                        3. If you are able to find two equal
>>                        sub-traces down (at left and at right), then
>>                        the only difference lies in this node. Then
>>                        we catch that case.
>>
>>                        4. Even more: if the two subtrees at left and
>>                        at right are equal, then you catch at once
>>                        all the cases that are represented by these
>>                        subtrees.
>>
>>                        I still didn't look at your patch carefully.
>>                        Sorry. But I hope that helps, and I'll try to
>>                        look at it in the nearest time; perhaps it's
>>                        not the best solution I gave in this post.
>>
>>                        -Stepan
>>
>>                        22.01.2014, 20:53, "Tobias von Koch"
>>                        <tobias.von.koch at gmail.com>:
>>
>>                            Hi Stepan,
>>
>>                            As you've seen we have recently
>>                            implemented a significant enhancement to
>>                            the MergeFunctions pass that also allows
>>                            merging of functions that are only
>>                            similar but not identical
>>                            (http://llvm-reviews.chandlerc.com/D2591).
>>
>>
>>                            Our patch also changes the way that
>>                            functions are compared quite
>>                            significantly. This is necessary because
>>                            we need to compare all functions in a
>>                            bucket in order to get a similarity
>>                            measure for each pair, so we can then
>>                            decide which 'groups' of functions to
>>                            merge.
>>
>>                            I can't really see a way to combine our
>>                            approach with your patch. What are your
>>                            thoughts?
>>
>>                            Another way to improve the performance of
>>                            MergeFunctions might be to make the hash
>>                            function better. I've already added the
>>                            size of the first BB to it, and perhaps
>>                            there are other factors we could try...
>>                            if we don't have to compare functions in
>>                            the first place (because they're in
>>                            different buckets) then that's obviously
>>                            a much bigger win.
>>
>>                            Thanks,
>>                            Tobias
>>
>>                            On 17/01/2014 20:25, Stepan Dyatkovskiy
>>                            wrote:
>>
>>                                Hi all,
>>
>>                                I propose a simple improvement for
>>                                the MergeFunctions pass that reduces
>>                                its complexity from O(N^2) to
>>                                O(log(N)), where N is the number of
>>                                functions in the module.
>>
>>                                The idea is to change the result of
>>                                comparison from "bool" to "-1,0,1".
>>                                In other words: define an order
>>                                relation on the set of functions.
>>                                To be sure that functions can be
>>                                compared that way, we have to prove
>>                                that the order relation is possible
>>                                on every comparison stage.
>>
>>                                The latter is possible if each
>>                                comparison stage we implement has the
>>                                following properties:
>>                                * reflexivity (a <= a, a == a,
>>                                  a >= a),
>>                                * antisymmetry (if a <= b and b <= a
>>                                  then a == b),
>>                                * transitivity (a <= b and b <= c,
>>                                  then a <= c),
>>                                * totality (for any a and b, either
>>                                  a < b, a > b, or a == b).
>>
>>                                Once we have defined the order
>>                                relation we can store all the
>>                                functions in a binary tree and
>>                                perform lookup in O(log(N)) time.
>>
>>                                This post has two attachments:
>>                                1. The patch, which has the
>>                                implementation of this idea.
>>                                2. The MergeFunctions pass detailed
>>                                description, with an explanation of
>>                                how the order relation is possible.
>>
>>                                Hope it helps to make things better!
>>
>>                                -Stepan.
>>
>>
>>
>>                                _______________________________________________
>>                                LLVM Developers mailing list
>>                                LLVMdev at cs.uiuc.edu
>>                                http://llvm.cs.uiuc.edu
>>                                http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
More information about the llvm-dev mailing list