[LLVMdev] MergeFunctions: reduce complexity to O(log(N))
Stepan Dyatkovskiy
stpworld at narod.ru
Thu May 15 06:32:16 PDT 2014
Hi Nick,
What about rest of patches? I have fixed 0004 and 0005, others seems to
be OK for you.. Can I commit them?
Thanks!
-Stepan
Stepan Dyatkovskiy wrote:
> Hi Nick,
> Thanks you for reviews. I have corrected for of comments from FAQ to
> what seems to be regular in LLVM.
> Patches has been committed as r208173 and r208189.
>
> Thanks!
> -Stepan
>
> Nick Lewycky wrote:
>> On 29 April 2014 04:36, Stepan Dyatkovskiy <stpworld at narod.ru
>> <mailto:stpworld at narod.ru>> wrote:
>>
>> Hi Nick.
>>
>> I have fixed and reattached all the patches from 0001 till 0011.
>>
>> 0001, 0002:
>> I have added detailed order descriptions to the methods declaration.
>> I have also fixed everything as you asked.
>>
>>
>> 0001:
>>
>> + for (uint64_t i = 0; i < NumElementsL; ++i) {
>> vs.
>> + for (unsigned i = 0, e = NumElementsL; i != e; ++i) {
>>
>> This difference is necessary to avoid a warning? I don't actually care
>> which form you use here given that NumElementsL is already an integer
>> there's no recalculation we're avoiding like getting the end iterator
>> each time, I just want it to look consistent. If it's working around a
>> warning that's more important than consistent looks.
>>
>> 0001 looks good. Please commit it.
>>
>> 0002:
>>
>> One note, you use a writing style of asking a question, followed by
>> answering it. For instance, "Will we still get the same numbering? As
>> follows from FunctionComparator::compare(), we do CFG walk, ...". I
>> don't think it's worth your time to rewrite it now, but we generally
>> don't use that construction in llvm's comments, instead preferring to
>> explain what the code does an including an argument that explains why
>> the approach is correct. This amounts to the same thing, it's just a
>> different style.
>>
>> + /// their functions. If right value was meet first during
>> scanning,
>>
>> "was meet" --> "was met"
>>
>> Please commit it.
>>
>> Nick
>>
>>
>> 0003, cmpStrings
>> StringRef::compare performs lexicographical comparison, and it isn't
>> the fastest one.
>> In case, when strings has different length, it still compares their
>> prefixes lexicographically.
>> We need ordering among strings set, but we don't need to compare
>> strings contents lexicographically all the time. So we can compare
>> their sizes first, and do lexicographical comparison only when we
>> really need it: only when strings are equal in size.
>>
>> I have added short version of these explanations into cmpStrings. So
>> we could get really well documented methods based on our posts ;-)
>> Those were good questions and definitely the answers should be in
>> documentation somewhere.
>>
>> 0004: I have added ordering description.
>> 0005: Fixed.
>> 0007: Fixed.
>> 0009:
>>
>> > Okay, I'm confused. We do all this work to produce a 3-way
>> comparison,
>> > then we wrap it up in an operator< for a std::set that will only
>> look at
>> > less-than...
>> Actually, on top level we don't need 3-way comparison. But we do
>> need it to determine whether functions are equal on some stage, just
>> to proceed to the next "less significant" stage.
>> Consider two functions (where BB1 and BB11 are equal)
>>
>> "F1: {BB1, BB2}" and
>> "F2: {BB11, BB3}".
>>
>> First we have to realize that BB1 and BB11 are equal. And then
>> determine which one of BB2 and BB3 is less than another.
>> So finally we got 3-way comparison: less, equal and greater (in fact
>> just neither less nor equal).
>>
>>
>> > When we start detecting equal functions, do we merge them
>> immediately? If so, can't that cause functions already in the set to
>> have their equality comparisons change
>> Yes we merge them immediately. And no, it won't cause situation you
>> have mentioned. Once we changed function X, we lookup for functions
>> that uses X, remove them from FnTree (former FnSet) and put them
>> into Deferred list. Then we fill up worklist with Deferred contents
>> and repeat comparison routine for these functions.
>>
>>
>> Thanks for reviews!
>>
>> -Stepan
>>
>> Stepan Dyatkovskiy wrote:
>>
>> ping
>> Stepan Dyatkovskiy wrote:
>>
>> Hi Nick,
>> Thank you for reviews! Here is my answers for first two
>> patches.
>>
>> Though as additional, I think its important answer to this
>> question
>> first:
>> > Okay, I'm confused. We do all this work to produce a
>> 3-way comparison,
>> > then we wrap it up in an operator< for a std::set that
>> will only
>> look at
>> > less-than...
>> Actually, on top level we don't need 3-way comparison. But
>> we do need it
>> to determine whether functions are equal on some stage just
>> to proceed
>> to the next "less meaningful" stage.
>> Consider two functions
>> "F1: {BB1, BB2}" and
>> "F2: {BB-equal-to-BB1, BB3}".
>> First we have to realize that BB1 and BB-equal-to-BB1 are
>> equal. And
>> then determine which one of BB2 and BB3 is less.
>>
>> First patch.
>>
>> > + /// 1. Bitcastability.
>> > + /// Check whether L's type could be losslessly
>> bitcasted to R's
>> type.
>> > + /// On this stage method, in case when lossless
>> bitcast is not
>> possible
>> > + /// method returns -1 or 1, thus also defining which
>> type is
>> greater in
>> > + /// context of bitcastability.
>> I have added detailed order description for this method. You
>> also can
>> find this description in cmpConstantsComment.txt.
>>
>> > Optional: "Return its pointers comparison result" ->
>> "Return the
>> result
>> > of comparing the types"?
>> Oops! Fixed.
>>
>> > + if (TyL->getTypeID() == Type::X86_MMXTyID)
>> > + TyLWidth = 64;
>> > + else if (const VectorType *VecTyL =
>> dyn_cast<VectorType>(TyL))
>> > + TyLWidth = VecTyL->getBitWidth();
>> >
>> > Yikes! We never merge function A using a vector and
>> function B
>> using an
>> > x86_mmx type together, right?
>> Fixed!
>>
>> > "Nor of TyL and TyR" -> "Neither Tyl nor TyR".
>> >
>> > Optional: "Return its pointers comparison result" ->
>> "Return the
>> result
>> > of comparing the types"?
>> Fixed.
>>
>> > + case Value::UndefValueVal: return
>> cmpType(L->getType(),
>> R->getType());
>> >
>> > You already have L->getType() and R->getType() in TyL
>> and TyR.
>> Fixed. Sorry.
>>
>> Second patch.
>>
>> > I'm curious whether we guarantee that
>> > we get the same numbering when comparing two functions
>> if I simply
>> > reorder the BasicBlocks in one of the functions, will we
>> still assign
>> > the same numberings or not?
>> ...
>> > Why not just start at the beginning and run to
>> > the end, relying on the SSA dominance property.
>> Good questions!
>>
>> For first one I have added these notes:
>> [quote]
>> As follows from FunctionComparator::compare(), we do CFG
>> walk, we start
>> from entry, and then take each terminator. So it doesn't
>> matter how in
>> fact BBs are ordered in function. And since cmpValues are
>> called during
>> this walk, the numbering depends only on how BBs located
>> inside the CFG.
>> [/quote]
>>
>> As for second question, I have added next notes:
>>
>> [quote]
>> We also can't use dominance properties. Consider situation:
>> If we compare two instruction operands: first is usage of
>> local variable
>> AL from function FL, and second is usage of local variable
>> AR from FR.
>> We are still not able to compare operands of PHI, since
>> those could be
>> operands from further BBs we didn't scan yet.
>> [/quote]
>>
>> You can find comments also in sn_mapLRComment.txt
>> I have also described lexicographical ordering for cmpValues
>> in method
>> comments. I have also put them in separated txt file:
>> cmpValuesComment.txt.
>>
>> > + if (L == R) return 0;
>> > + // Compare C1 and the bitcast result.
>> > + if (int Res = cmpConstants(ConstL, ConstR))
>> > + return Res;
>> > [...]
>> > + return 0;
>> >
>> > Isn't that just:
>> > return cmpConstants(ConstL, ConstR);
>> OOOPS! Fixed.
>>
>> I'm working on rest of patches.
>> Thanks!
>>
>> -Stepan.
>>
>> Nick Lewycky wrote:
>>
>> Stepan Dyatkovskiy wrote:
>>
>> Hello Nick!
>>
>> Can we proceed to work on patch series?
>>
>>
>> Just in case, I have reattached them to this post.
>> Would you prefer to see them in
>> http://reviews.llvm.org ?
>>
>>
>> Oddly, I prefer attachments, but we can use
>> reviews.llvm.org <http://reviews.llvm.org> if you
>> prefer?
>>
>> 0001-constants-comparison.__patch
>>
>> + /// 1. Bitcastability.
>> + /// Check whether L's type could be losslessly
>> bitcasted to R's type.
>> + /// On this stage method, in case when lossless
>> bitcast is not
>> possible
>> + /// method returns -1 or 1, thus also defining which
>> type is
>> greater in
>> + /// context of bitcastability.
>>
>> Looking at the code, it looks like the order is:
>> 1. first class types, by typeid
>> 2. vectors (by element size? by number of elements?
>> the two
>> multiplied together?)
>>
>> + // Nor of TyL and TyR are values of first class
>> type. Return
>> + // its pointers comparison result.
>>
>> "Nor of TyL and TyR" -> "Neither Tyl nor TyR".
>>
>> Optional: "Return its pointers comparison result" ->
>> "Return the result
>> of comparing the types"?
>>
>> + if (TyL->getTypeID() == Type::X86_MMXTyID)
>> + TyLWidth = 64;
>> + else if (const VectorType *VecTyL =
>> dyn_cast<VectorType>(TyL))
>> + TyLWidth = VecTyL->getBitWidth();
>>
>> Yikes! We never merge function A using a vector and
>> function B using an
>> x86_mmx type together, right? Note what the LangRef says
>> about x86_mmx:
>> "The operations allowed on it are quite limited:
>> parameters and return
>> values, load and store, and bitcast."
>>
>> + case Value::UndefValueVal: return
>> cmpType(L->getType(),
>> R->getType());
>>
>> You already have L->getType() and R->getType() in TyL
>> and TyR. You also
>> already have cmpType(TyL, TyR) in TypesRes. Please just
>> reuse them in
>> the rest of this function.
>>
>>
>> 0002-values-comparison-__replacement-for-enumerate.__patch
>>
>> + // Assign serial numbers to values from left
>> function, and values
>> from
>> + // right function.
>> + // Explanation:
>> + // Being comparing functions we need to compare
>> values we meet at
>> left and
>> + // right sides.
>> + // Its easy to sort things out for external values.
>> It just should be
>> + // the same value at left and right.
>> + // But for local values (those were introduced inside
>> function body)
>> + // we have to ensure they were introduced at exactly
>> the same place,
>> + // and plays the same role. How to do it?
>> + // Just assign serial number to each value when we
>> meet it first
>> time.
>> + // Values that were met at same place will be with
>> same serial
>> numbers.
>>
>> This explanation is correct, but I'm curious whether we
>> guarantee that
>> we get the same numbering when comparing two functions
>> if I simply
>> reorder the BasicBlocks in one of the functions, will we
>> still assign
>> the same numberings or not?
>>
>> Also, it's not clear from the comment why we need the
>> complexity of
>> numbering them anyways. Why not just start at the
>> beginning and run to
>> the end, relying on the SSA dominance property to
>> guarantee that we
>> visit definitions before uses, and assume we'll visit
>> each value in the
>> same order (look at each instruction and operand in
>> order and just
>> complain about differences)? The answer is that PHI
>> nodes mean we will
>> visit some uses before some definitions.
>>
>> Anyways, changing this text is optional in my mind. I
>> think it could be
>> better, but it's not worth spending much time on it.
>>
>> +int FunctionComparator::cmpValues(__const Value *L,
>> const Value *R) {
>>
>> I wouldn't mind an explanation of the lexographic
>> ordering on this one
>> either. Constants before non-constants, then InlineAsm
>> (ordered by
>> address), then visit order.
>>
>> + if (L == R) return 0;
>> + // Compare C1 and the bitcast result.
>> + if (int Res = cmpConstants(ConstL, ConstR))
>> + return Res;
>> [...]
>> + return 0;
>>
>> Isn't that just:
>> return cmpConstants(ConstL, ConstR);
>> ?
>>
>> 0003-attributes-comparison.__patch
>>
>> +int FunctionComparator::__cmpStrings(StringRef L,
>> StringRef R) const {
>> + // Prevent heavy comparison, compare sizes first.
>> + if (int Res = cmpNumbers(L.size(), R.size()))
>> + return Res;
>> +
>> + return L.compare(R);
>> +}
>>
>> Optional: I'm curious whether that's actually worth it?
>> I checked
>> StringRef and it does the length comparison last. It
>> could be that you
>> know that your strings tend to be longer than StringRef
>> which tries to
>> be efficient in general.
>>
>> 0004-operations-comparison.__patch
>>
>> +int FunctionComparator::__cmpOperation(const
>> Instruction *L,
>>
>> Optional: Again, a description of the ordering would be
>> interesting.
>> Opcode number, then number of operands, then type, then
>> 'nsw/nuw/exact/volatile'.
>>
>> 0005-GEP-comparison.patch
>>
>> +int FunctionComparator::cmpGEP(__const GEPOperator
>> *GEPL,
>> + const GEPOperator
>> *GEPR) {
>>
>> + unsigned int ASL = GEPL->getPointerAddressSpace()__;
>>
>> Extra line after {
>>
>> if (DL) {
>> [...]
>> + unsigned BitWidth = DL ?
>> DL->getPointerSizeInBits(ASL) : 1;
>>
>> BitWidth doesn't need to test DL for null-ness.
>>
>> 0006-top-level-compare-method.__patch
>>
>> Perfect!
>>
>> 0007-sanity-check.patch
>>
>> + // Check symmetricity.
>>
>> "symmetricity" -> "symmetry".
>>
>> 0008-removed-unused-methods.__patch
>>
>> Perfect
>>
>> 0009-FnSet-replaced-with-__FnTree.patch
>>
>> Okay, I'm confused. We do all this work to produce a
>> 3-way comparison,
>> then we wrap it up in an operator< for a std::set that
>> will only look at
>> less-than, but never uses the equals result. We want to
>> do one
>> comparison between two functions, not two (ie., std::set
>> does
>> func1<func2 and func2<func1 to see whether they're equal
>> or less than,
>> and we don't need that since cmp(func1, func2) tells us
>> directly). Maybe
>> this comes in a future patch?
>>
>> One other question, I should probably know the answer to
>> this but it's
>> been a while. :) When we start detecting equal
>> functions, do we merge
>> them immediately? If so, can't that cause functions
>> already in the set
>> to have their equality comparisons change (ie., A calls
>> X, B calls Y, A
>> and B are in the set, then we merge X and Y for being
>> equal, now we've
>> changed A and B to both called the same merged function,
>> so they start
>> to compare the same). Sets tend not to like having their
>> comparison
>> functions change on elements that they contain.
>>
>> 0010-updated-comments.patch
>>
>> Okay.
>>
>> 0011-removed-DenseMap-helpers.__patch
>>
>> Yay deleting dead code!
>>
>> Nick
>>
>>
>> Thanks!
>> -Stepan.
>>
>> Stepan Dyatkovskiy wrote:
>>
>> ping
>> Stepan Dyatkovskiy wrote:
>>
>> Hi Nick,
>>
>> Please find the fixed patches in attachment.
>> Series starts from "constants comparison".
>>
>> Below small report of what has been fixed,
>> and answers on your
>> questions.
>>
>> cmpTypes:
>> > Please add a comment for this method.
>> Include the meaning of the
>> returned value. ("man strcmp" for
>> inspiration.)
>> Fixed and committed. So you can look in
>> trunk, may be I forgot to do
>> something (hope not :-) )
>>
>> checkForLosslessBitcast, cmpConstants:
>> > Why isn't this using cmpType?
>> Fixed.
>>
>> > Please put the else on the same line as
>> the closing brace.
>> Fixed.
>>
>> >> + else if (const VectorType *thisPTy =
>> dyn_cast<VectorType>(L))
>> > Missing initial capital
>> Sorry. Fixed. Actually these typos has been
>> cloned from
>> Type::canLosslesslyBitCastTo.
>>
>> >> + return cmpNumbers((uint64_t)L,
>> (uint64_t)R);
>> >>
>> > Please assert on unknown constant.
>> That's possible. But what if we really got
>> unknown constant? New
>> comming
>> constant types merely depends on
>> MergeFunctions implementation. So we
>> get crash if it will happen. And we loose
>> nothing comparing them
>> just as
>> pointers. So, do we really need an
>> assertion? Currently I kept it
>> as it
>> was. If your answer is "yes, we need it", it
>> would easy to add it:
>>
>> case Value::FunctionVal:
>> case Value::GlobalVariableVal:
>> - case Value::GlobalAliasVal:
>> - default: // Unknown constant
>> - return cmpNumbers((uint64_t)L,
>> (uint64_t)R);
>> + case Value::GlobalAliasVal:
>> + return cmpNumbers((uint64_t)L,
>> (uint64_t)R);
>> + default:
>> + llvm_unreachable("Unknown constant.");
>>
>> About variable names: Var1, Var2 vs
>> VarL,VarR.
>> I think its better to use L/R concept. Easer
>> to understand what to
>> return (-1, or 1) when you see L/R rather
>> than Var1/Var2.
>> Var1/Var2 has been kept for functions that
>> are to be removed till the
>> end of patch series.
>> F1 and F2 names were also kept till the "top
>> level compare method"
>> patch.
>>
>> > Alternatively, any reason not to have
>> cmpPointers(const void *L,
>> const void *R)?
>> I could do it. I just wanted to show that
>> pointers are compared
>> like a
>> numbers in some situations. Actually we do
>> it in cases when we have
>> absolutely no idea of smarter comparison.
>> And back to cmpPonters. Its rather about
>> what intuition tells. If
>> pointers are equal as numbers, could they be
>> different somehow? Could
>> they be non-castable to numbers on some
>> platforms? The last one is
>> big
>> trouble for DenseMapInfo implementation..
>> If there is any (even very small)
>> possibility of such cases - then
>> yes,
>> I vote for cmpPointers. The last word is up
>> to you anyway.
>>
>> Attributes (0005):
>> > Attributes already have operator< and
>> operator==. Please reuse
>> them.
>> Fixed. I used simple "if":
>>
>> if (LA < RA)
>> return -1;
>> if (RA < LA)
>> return 1;
>>
>> Though its possible to use:
>>
>> if (int Res = (LA < RA) ? -1 : (RA < LA) ? 1
>> : 0)
>> return Res;
>>
>> Which one is more preferable?
>>
>> cmpGEP (0007):
>> > + int cmpGEP(const GEPOperator *GEP1,
>> const GEPOperator *GEP2);
>> > + int cmpGEP(const GetElementPtrInst
>> *GEP1,
>> > + const GetElementPtrInst *GEP2) {
>> >
>> > Const members?
>> We can't make it constant, since it compares
>> values (cmpValues, aka
>> enumerate). So, if we see Value first time,
>> we have to add it into
>> sn_mapL/R.
>>
>> > I think you should put this under if (DL)
>> { /* declare Offset1,
>> Offset2, etc. */ }
>> Fixed.
>>
>> About 0008:
>> > Did you mean "cmpOperation"?
>> Yes. We could keep it in one place. I have
>> fixed this case and
>> removed
>> TODO.
>>
>> > "compare" still returns "bool". I'm going
>> to assume this was
>> meant to
>> go in 0009.
>> Yes.
>>
>> About 0011:
>> > std::set is frowned upon, see
>>
>> http://llvm.org/docs/__ProgrammersManual.html#set
>>
>> <http://llvm.org/docs/ProgrammersManual.html#set>
>> . Do you actually
>> rely
>> on the stable iterator guarantee? Or would
>> another set-like container
>> be a
>> better fit?
>> Huh.. I have looked for alternatives.
>> Unfortunately SmallSet is not
>> suitable, since we need to lookup existing
>> items, and SmallSet::find
>> method is not present. May be I have missed
>> something.
>> Actually we need binary tree implementation.
>> Do we have better
>> analogue?
>> I could implement new one. Think it should
>> be just one more patch
>> afterwards.
>>
>> >No. An identifier with underscore then
>> capital letter is reserved
>> for
>> the implementation. You can just call them
>> "F" and "DL", then the
>> ctor
>> initializer list can be "F(F), DL(DL)" and
>> that will work fine.
>> Fixed.
>>
>> 0013:
>> > Sorry, I'm not sure what this sentence
>> means? The thing your
>> example
>> is talking about is showing is that two
>> functions in an SCC may be
>> equivalent, and detecting them requires
>> "blinding" (ignoring) certain
>> details of the function when you do the
>> comparison. We blind
>> ourselves
>> to the pointee types, but not to callees of
>> functions in the same
>> SCC.
>>
>> I meant generalising of cross-reference case:
>> in old implementation, while comparing F and
>> G functions, we treat as
>> equal case when F uses G, with case when G
>> uses F (at the same
>> place).
>> It could happen that F uses G1, while G1
>> uses G2, while G2 uses F.
>> So it
>> still the same. But currently we have to
>> invent way how to detect
>> such
>> cases.
>>
>> Thanks for reviews!
>> -Stepan
>>
>> Stepan Dyatkovskiy wrote:
>>
>> Hi Nick,
>> I have committed 0001 as r203788.
>> I'm working on fixes for 0002 - 0014.
>>
>> > After reading through this patch
>> series, I feel like I'm missing
>> > something important. Where's the sort
>> function? It looks like
>> we're
>> > still comparing all functions to all
>> other functions.
>> When you insert functions into std::set
>> or its analo g s it does all
>> the
>> job for you. Since internally it builds
>> a binary tree using "less"
>> comparison, and each insert/look-up
>> operation takes O(log(N)) time.
>>
>> -Stepan.
>>
>> Nick Lewycky wrote:
>>
>> On 27 February 2014 08:23, Stepan
>> Dyatkovskiy <stpworld at narod.ru
>> <mailto:stpworld at narod.ru>
>> <mailto:stpworld at narod.ru
>> <mailto:stpworld at narod.ru>>> wrote:
>>
>> Hi Nick,
>>
>> I tried to rework changes as you
>> requested. One of patches (0004
>> with extra assertions) has been
>> removed.
>>
>>
>> > + bool isEquivalentType(Type
>> *Ty1, Type *Ty2) const {
>> > + return cmpType(Ty1, Ty2) == 0;
>> > + }
>> >
>> > Why do we still need
>> isEquivalentType? Can we nuke this?
>> Yup. After applying all the patches
>> isEquivalentType will be
>> totally
>> replaced with cmpType. All isEqXXXX
>> and friends will be removed in
>> 0011 (old 0012). No traces left.
>> Old function wasn't removed in 0001
>> just for keeping patches
>> without
>> extra noise like:
>>
>> - something that uses
>> isEquivalentType
>> + something that uses cmpType
>>
>> The point is, that "something" that
>> uses isEquivalentType, will be
>> also replaced with one of next
>> patches in this series.
>>
>>
>> >
>> > +static int cmpNumbers(uint64_t
>> L, uint64_t R) {
>> > + if (L < R) return -1;
>> > + if (L > R) return 1;
>> > + return 0;
>> > +}
>> >
>> > At a high level, you don't
>> actually need a <=> operator to use a
>> sort. A
>> > strict ordering ( < operator) is
>> sufficient. (Note that for
>> mergefunc, a
>> > strict weak ordering is not
>> sufficient, it must be a total
>> ordering.)
>> That could be done with int
>> FunctionComparator::compare(). We can
>> replace it with bool
>> FunctionComparator::less(). Though
>> for all
>> other cmp methods need at least 2
>> bits of information as result:
>> 1. Whether things are equal.
>> 2. Whether left less than right.
>>
>> As for
>> FunctionComparator::compare(),
>> conversion into less() will
>> increase time of sanity check (patch
>> #0010).
>> Sanity check is just a sweet bonus.
>> It checks that ordering
>> implemented properly (checks order
>> relation properties).
>> Turning compare() into less() mean,
>> that we'll have to run
>> comparison two times: L.less(R) and
>> R.less(L). But may be sanity
>> check is not a thing to be published
>> at all.
>>
>>
>> >
>> > Consider hoisting this inside the
>> FunctionComparator class? That
>> class
>> > should have a bunch of
>> implementations of comparisons
>> between
>> various
>> > different things, which can pass
>> down to other methods in the
>> same class.
>> In new patch series attached to this
>> post, I have moved all static
>> methods into FunctionComparator.
>>
>>
>> > + // Replacement for
>> type::canLosslesslyBitCastTo, that
>> > + // establish order relation on
>> this kind of properties.
>> > + int
>> checkForLosslessBitcast(const Type
>> *L, const Type *R);
>> >
>> > Type:: not type:: . Please make
>> this comment more descriptive.
>> Done.
>> [new comment]
>> Replacement for
>> Type::canLosslesslyBitCastTo, that
>>
>> establish order relation on this
>> kind of properties
>> Returns 0, if L and R types could be
>> converted to each other
>> without
>> reinterpretation of bits.
>> Otherwise method returns -1 or 1,
>> defining total ordering between
>> types in context of lossless
>> bitcastability trait.
>> E.g.: if L is less than R (result is
>> -1), than every type that
>> could be
>> losslessly bitcasted to L is less
>> than R.
>> [/new comment]
>>
>>
>> >
>> > + /// Replacement for C1 ==
>> getBitCast(C2, C1Ty)
>> > + /// Its more controllable, and
>> supposed to be simpler and
>> more
>> > predictionable.
>> > + /// As very important
>> advantage: it allows to introduce
>> order
>> relation on
>> > + /// constants set, and thus use
>> it as trait in refinement
>> routines.
>> >
>> > "Its" --> "It's".
>> "predictionable" --> "predictable".
>> And how is
>> it more
>> > predictable? I think this comment
>> would be better if it
>> described the
>> > function instead of making
>> comparisons between it and other
>> functions.
>> > Something like, "Compare
>> constants under a system where
>> pointer
>> to X and
>> > pointer to Y are considered
>> equal" or whatever is actually true
>> here.
>> Done.
>> [new comment]
>>
>> Replacement for C1 == getBitCast(C2,
>> C1Ty)
>> Parses constants contents, assuming
>> that types are losslessly
>> bitcasted between each other. So
>> actually it ignores types and only
>> compares bits from L and R.
>> Returns 0, if L and R has equivalent
>> content.
>> -1 or 1 if values are different.
>> Maintaining total ordering
>> requires
>> two values that indicates
>> non-equivalence (L less R, L
>> greater R).
>> [/new comment]
>>
>>
>> >
>> > +enum ConstantType {
>> > I'm not sure that this buys you
>> much. All the "isa" tests can be
>> broken
>> > down into a switch on
>> getValueID() with the one
>> exception of
>> isNullValue().
>> Done.
>>
>>
>> > + assert(
>> > +
>>
>> C1->getType()->____canLosslesslyBitCastTo(C2->____getType())
>> &&
>> > + "Pass is healthless.
>> checkForLosslessBitcast should be
>> twin of "
>> > + "canLosslesslyBitCastTo method,
>> except the case when the
>> last one "
>> > + "returns false, the first one
>> should return -1 or 1");
>> ...
>>
>> > I think we can skip the asserts
>> here. They aren't detecting a
>> specific
>> > bug, they're checking whether the
>> new code does a certain task
>> relative
>> > to the old code. Drop the old
>> code, your new code is the new
>> sheriff in
>> > town.
>> Asserts has been removed.
>>
>>
>> >
>> > + DenseMap<const Value*, int>
>> sn_map1, sn_map2;
>> >
>> > What is sn short for ("seen")?
>> Why are there two of these?
>> Serial number :-)
>>
>> >
>> > + std::pair<DenseMap<const Value
>> *, int>::iterator, bool>
>> > + LeftSN =
>> sn_map1.insert(std::make_pair(____V1,
>> sn_map1.size())),
>> > + RightSN =
>> sn_map2.insert(std::make_pair(____V2,
>> sn_map2.size()));
>> >
>> > So I think I get it, this is easy
>> to reason about. You number
>> each value
>> > going down both functions. But I
>> think you can eliminate one of
>> these
>> > maps because you know that if the
>> left and right were ever
>> different
>> > then we terminate the comparison
>> immediately. You can at least
>> share one
>> > map with both V1 and V2, but I
>> think you can reduce the number
>> of map
>> > insertions.
>> Not sure. Consider that in left you
>> have:
>> %A = alloca i32
>> %B = alloca i32
>> store i32 1, i32* %A
>>
>> And in right:
>> %A = alloca i32
>> %B = alloca i32
>> store i32 1, i32* %B
>>
>> When we meet 'store' instruction we
>> need to check in which order %A
>> was allocated at left and in which
>> order %B was allocated at right.
>> So for each value in function we
>> have to assign serial number. Then
>> comparing to local values from
>> different functions we can just
>> check
>> their serial numbers.
>> Though, may be I missed something?
>> May be there is another way to
>> determine "who was first".
>>
>>
>> > High-level question, are
>> attributes ordered? Your
>> comparison is
>> ordered.
>> > So if I have function F with
>> string attributes "sse2",
>> "gc=hemisphere"
>> > and another function G with
>> string attributes "gc=hemisphere",
>> "sse2"
>> > then they would be considered
>> different. I don't think that's
>> right.
>> I didn't check it. But if it is not
>> ordered, we could order it
>> lexicographically.
>>
>>
>> >
>> > + int cmpOperation(const
>> Instruction *I1, const Instruction
>> *I2)
>> const;
>> > +
>> > bool isEquivalentOperation(const
>> Instruction *I1,
>> > - const Instruction *I2) const;
>> > + const Instruction *I2) const {
>> > + return cmpOperation(I1, I2)
>> == 0;
>> > + }
>> >
>> > Hopefully this patch series
>> ultimately eliminates calls of
>> > isEquivalentOperation?
>> Of course. Just like
>> isEquivalentType.
>>
>>
>> > By the way, this is an
>> interesting one. Should "add x, y"
>> and
>> "add nsw
>> > x, y" be equal or not?
>> Yeah. Those are interesting
>> questions. We could even treat
>> 'mul a,
>> 2' and 'shl a,1' as equal in some
>> cases. I think it's matter of
>> time
>> and further optimization.
>>
>> Please find first reworked patches
>> in attachment.
>>
>>
>> I am extremely concerned that you
>> may end up with A < B and B < C
>> but A
>> > C. There are cases where you use
>> cmpTypes to compare two types,
>> then
>> others where you compare them via
>> cmpNumbers. The only way to
>> ensure
>> that the sort function is
>> self-consistent is to be entirely
>> consistent
>> with how we traverse the function.
>>
>> 0001:
>> + int cmpType(Type *Ty1, Type *Ty2)
>> const;
>> +
>>
>> Please add a comment for this
>> method. Include the meaning of the
>> returned value. ("man strcmp" for
>> inspiration.)
>>
>> + static int cmpNumbers(uint64_t L,
>> uint64_t R);
>>
>> Optional: is there any benefit to
>> making this static instead of a
>> const
>> method? It doesn't need access to
>> the 'this' pointer, but it seems
>> like
>> that's an incidental artifact of the
>> implementation. The other cmp*
>> members are const methods.
>>
>> + return
>> cmpNumbers(FTy1->isVarArg(),
>> FTy2->isVarArg());;
>>
>> Extra semi-colon.
>>
>> I trust you to apply the fixes
>> above, you can choose to commit the
>> patch
>> without going through another round
>> of review with me.
>>
>> 0002:
>>
>> +int
>>
>> FunctionComparator::__checkForLosslessBitcast(const
>> Type *L,
>> const
>> Type *R) {
>> ...
>> + int TypesRes =
>> cmpNumbers((uint64_t) L,
>> (uint64_t) R);
>> ...
>> + return TypesRes;
>>
>> Why isn't this using cmpType?
>> Alternatively, any reason not to have
>> cmpPointers(const void *L, const
>> void *R)?
>>
>> + }
>> + else {
>>
>> Please put the else on the same line
>> as the closing brace.
>>
>> + else if (const VectorType *thisPTy
>> = dyn_cast<VectorType>(L))
>>
>> Missing initial capital, see
>>
>> http://llvm.org/docs/__CodingStandards.html#name-__types-functions-variables-and-__enumerators-properly
>>
>>
>> <http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly>
>>
>>
>>
>>
>>
>>
>>
>>
>> . Also, I find it odd that you
>> switch from "L" and "R" to "this"
>> and
>> "that". Please make those
>> consistent, probably on left and
>> right.
>>
>> 0003:
>>
>> + int cmpConstants(const Constant
>> *L, const Constant *R);
>>
>> Any reason this method isn't const?
>>
>> +int
>>
>> FunctionComparator::__cmpConstants(const
>> Constant *L, const
>> Constant
>> *R) {
>> +
>> + // Pack null value as one more
>> Value ID
>> + unsigned LType = L->isNullValue()
>> ? 0 : L->getValueID() + 1;
>> + unsigned RType = R->isNullValue()
>> ? 0 : R->getValueID() + 1;
>> +
>> + if (int Res = cmpNumbers(LType,
>> RType))
>> + return Res;
>> +
>> + if (LType == 0)
>> + return 0;
>> +
>> + switch (LType-1) {
>>
>> Optional: how about:
>>
>> if (L->isNullValue() &&
>> R->isNullValue())
>> return cmpType(L->getType(),
>> R->getType());
>> if (L->isNullValue() &&
>> !R->isNullValue())
>> return 1;
>> if (!L->isNullValue() &&
>> R->isNullVaue())
>> return -1;
>>
>> if (int Res =
>> cmpNumbers(L->getValueID(),
>> R->getValueID()))
>> return Res;
>>
>> switch (L->getValueID()) {
>>
>> ? In particular I'm not a fan of how
>> LType/RType are equal to
>> valueID+1,
>> even with the comment.
>>
>> + case Value::FunctionVal:
>> + case Value::GlobalVariableVal:
>> + case Value::GlobalAliasVal:
>> + default: // Unknown constant
>> + return cmpNumbers((uint64_t)L,
>> (uint64_t)R);
>>
>> Please assert on unknown constant.
>>
>> How are function, global variable
>> and alias reachable here?
>>
>> 0004:
>>
>> Thanks for the comment on sn_mapL/R!
>>
>> +int
>>
>> FunctionComparator::__cmpEnumerate(const
>> Value *V1, const Value
>> *V2) {
>>
>> "Compare enumerate"? This name
>> doesn't make sense to me. "cmpValue"
>> perhaps?
>>
>> + const Constant *C1 =
>> dyn_cast<Constant>(V1);
>> + const Constant *C2 =
>> dyn_cast<Constant>(V2);
>> + if (C1 && C2) {
>> + if (V1 == V2) return 0;
>> // TODO: constant expressions with
>> GEP or references to F1 or F2.
>> - if (C1->isNullValue() &&
>> C2->isNullValue() &&
>> - isEquivalentType(C1->getType()__,
>> C2->getType()))
>> - return true;
>> - // Try bitcasting C2 to C1's type.
>> If the bitcast is legal and
>> returns C1
>> - // then they must have equal bit
>> patterns.
>> - return
>>
>> checkForLosslessBitcast(C1->__getType(),
>> C2->getType()) ==
>> 0 &&
>> - cmpConstants(C1, C2) == 0;
>> + if (C1->isNullValue() &&
>> C2->isNullValue()) {
>> + if (int Res =
>> cmpType(C1->getType(),
>> C2->getType()))
>> + return Res;
>> + }
>> +
>> + // Check whether C2 and C1 has
>> equal bit patterns.
>> + if (int Res =
>>
>> checkForLosslessBitcast(C1->__getType(),
>> C2->getType()))
>> + return Res;
>> +
>> + // Compare C1 and the bitcast
>> result.
>> + if (int Res = cmpConstants(C1, C2))
>> + return Res;
>> +
>> + return 0;
>> }
>> + if (C1)
>> + return -1;
>> + if (C2)
>> + return 1;
>>
>> I'm confused why this isn't just
>> using cmpConstants? I think the
>> isNullValue to cmpType check is
>> already in cmpConstants, why not
>> move
>> the checkForLosslessBitcast there?
>> Is cmpConstants used from
>> elsewhere
>> and needs to not have that check?
>>
>> + std::pair<DenseMap<const Value *,
>> int>::iterator, bool>
>> + LeftSN =
>> sn_mapL.insert(std::make_pair(__V1,
>> sn_mapL.size())),
>> + RightSN =
>> sn_mapR.insert(std::make_pair(__V2,
>> sn_mapR.size()));
>>
>> "auto"?
>>
>> 0005:
>>
>> Okay, this makes attributes ordered.
>>
>> + enum AttrType {
>> + Enum,
>> + Align,
>> + Other
>> + } LeftAttrType = Other,
>> RightAttrType = Other;
>> +
>> + if (LA.isAlignAttribute())
>> LeftAttrType = Align;
>> + else if (LA.isEnumAttribute())
>> LeftAttrType = Enum;
>> + if (RA.isAlignAttribute())
>> RightAttrType = Align;
>> + else if (RA.isEnumAttribute())
>> RightAttrType = Enum;
>> +
>> + if (int Res =
>> cmpNumbers(LeftAttrType,
>> RightAttrType))
>> + return Res;
>> +
>> + switch (LeftAttrType) {
>> + case Enum:
>> + if (int Res =
>> cmpNumbers(LA.getKindAsEnum(),
>> RA.getKindAsEnum()))
>> + return Res;
>> + break;
>> + case Align:
>> + if (int Res =
>> cmpNumbers(LA.getValueAsInt(),
>> RA.getValueAsInt()))
>> + return Res;
>> + break;
>> + case Other:
>> + if (int Res =
>> cmpStrings(LA.getKindAsString(__),
>> RA.getKindAsString()))
>> + return Res;
>> + if (int Res =
>> cmpStrings(LA.__getValueAsString(),
>> RA.getValueAsString()))
>> + return Res;
>> + break;
>> + }
>>
>> Attributes already have operator<
>> and operator==. Please reuse
>> them.
>>
>> 0006:
>>
>> This looks fine.
>>
>> 0007:
>>
>> + int cmpGEP(const GEPOperator
>> *GEP1, const GEPOperator *GEP2);
>> + int cmpGEP(const GetElementPtrInst
>> *GEP1,
>> + const GetElementPtrInst *GEP2) {
>>
>> Const members?
>>
>> + unsigned BitWidth = DL ?
>> DL->getPointerSizeInBits(AS1) : 1;
>> + APInt Offset1(BitWidth, 0),
>> Offset2(BitWidth, 0);
>> + if (DL &&
>> +
>>
>> (GEP1->__accumulateConstantOffset(*DL,
>> Offset1) &&
>> +
>>
>> GEP2->__accumulateConstantOffset(*DL, Offset2)))
>> + return cmpAPInt(Offset1, Offset2);
>>
>> I think you should put this under if
>> (DL) { /* declare Offset1,
>> Offset2,
>> etc. */ }
>>
>> 0008:
>>
>> + // TODO: Already checked in cmpOps
>>
>> Did you mean "cmpOperation"?
>>
>> - if (!enumerate(F1BB, F2BB) ||
>> !compare(F1BB, F2BB))
>> + if (!enumerate(F1BB, F2BB) ||
>> compare(F1BB, F2BB) != 0)
>>
>> "compare" still returns "bool". I'm
>> going to assume this was meant
>> to go
>> in 0009.
>>
>> 0009:
>>
>> Looks fine.
>>
>> 0010: not reviewing this now.
>>
>> 0011:
>>
>> Yay!
>>
>> 0012:
>>
>> + typedef std::set<FunctionPtr>
>> FnTreeType;
>>
>> std::set is frowned upon, see
>>
>> http://llvm.org/docs/__ProgrammersManual.html#set
>>
>> <http://llvm.org/docs/ProgrammersManual.html#set>
>> . Do you actually
>> rely
>> on the stable iterator guarantee? Or
>> would another set-like
>> container be
>> a better fit?
>>
>> + FunctionPtr(Function *_F, const
>> DataLayout *_DL) : F(_F),
>> DL(_DL) {}
>>
>> No. An identifier with underscore
>> then capital letter is reserved
>> for
>> the implementation. You can just
>> call them "F" and "DL", then the
>> ctor
>> initializer list can be "F(F),
>> DL(DL)" and that will work fine.
>>
>> 0013:
>>
>> +// a ² a (reflexivity)
>>
>> I'm seeing a "squared" between the
>> two 'a's. Could you represent
>> this in
>> plain ASCII?
>>
>> +// Comparison iterates through each
>> instruction in each basic
>> block.
>>
>> Two spaces before "iterates" should
>> be one.
>>
>> +// While ability to deal with
>> complex references could be really
>> perspective.
>>
>> Sorry, I'm not sure what this
>> sentence means? The thing your
>> example is
>> talking about is showing is that two
>> functions in an SCC may be
>> equivalent, and detecting them
>> requires "blinding" (ignoring)
>> certain
>> details of the function when you do
>> the comparison. We blind
>> ourselves
>> to the pointee types, but not to
>> callees of functions in the same
>> SCC.
>>
>> 0014:
>>
>> Yay again!
>>
>>
>> After reading through this patch
>> series, I feel like I'm missing
>> something important. Where's the
>> sort function? It looks like we're
>> still comparing all functions to all
>> other functions.
>>
>> Nick
>>
>>
>>
>> _________________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu
>> <mailto:LLVMdev at cs.uiuc.edu>
>> http://llvm.cs.uiuc.edu
>>
>> http://lists.cs.uiuc.edu/__mailman/listinfo/llvmdev
>>
>> <http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>
>>
>>
>>
>> _________________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>
>> http://llvm.cs.uiuc.edu
>>
>> http://lists.cs.uiuc.edu/__mailman/listinfo/llvmdev
>>
>> <http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>
>>
>>
>>
>>
>> _________________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> <mailto:llvm-commits at cs.uiuc.edu>
>>
>> http://lists.cs.uiuc.edu/__mailman/listinfo/llvm-commits
>>
>> <http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits>
>>
>>
>>
>>
>> _________________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
>> http://lists.cs.uiuc.edu/__mailman/listinfo/llvm-commits
>> <http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits>
>>
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-attributes-comparison.patch
Type: text/x-diff
Size: 2221 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140515/d44b56b3/attachment.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-operations-comparison.patch
Type: text/x-diff
Size: 11004 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140515/d44b56b3/attachment-0001.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0005-GEPs-comparison.patch
Type: text/x-diff
Size: 4070 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140515/d44b56b3/attachment-0002.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0006-top-level-comparisons.patch
Type: text/x-diff
Size: 9936 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140515/d44b56b3/attachment-0003.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0007-sanity-check.patch
Type: text/x-diff
Size: 3933 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140515/d44b56b3/attachment-0004.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0008-removed-unused-methods.patch
Type: text/x-diff
Size: 2231 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140515/d44b56b3/attachment-0005.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0009-FnSet-replaced-with-FnTree.patch
Type: text/x-diff
Size: 6834 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140515/d44b56b3/attachment-0006.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0010-updated-comments.patch
Type: text/x-diff
Size: 3322 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140515/d44b56b3/attachment-0007.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0011-removed-DenseMap-helpers.patch
Type: text/x-diff
Size: 4629 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140515/d44b56b3/attachment-0008.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: apply-patches.sh
Type: application/x-sh
Size: 355 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140515/d44b56b3/attachment.sh>
More information about the llvm-commits
mailing list