[LLVMdev] MergeFunctions: reduce complexity to O(log(N))
Stepan Dyatkovskiy
stpworld at narod.ru
Wed Jun 4 15:57:55 PDT 2014
Hi Nick,
What about rest of patches? I rescanned remained patches, updated them
(according to trunk). I have also added few more comments in 0010 about
how finally we introduce binary tree and N*log(N) time.
All the patches are reattached to this post.
Thanks!
-Stepan.
Stepan Dyatkovskiy wrote:
> Hi Nick,
> Thanks for reviews!
>
> I have committed 0003 - 0005 as 208953, 208973 and 208976.
>
> > + // TODO: Already checked in cmpOps
> Yup. I have fixed it with "Already checked in cmpOperation".
> The reason I put this TODO, is because we don't need this check,
> operation types were already checked in cmpOperation. But I didn't
> remove this code, because its not mine (it was in old code as well), and
> actually it is subject of additional cosmetic commit.
>
> Thanks!
> -Stepan
>
> Nick Lewycky wrote:
>> 0003:
>>
>> + // Mostly cloned from BitcodeWriter, but simplified a bit.
>>
>> Please remove that comment, it isn't adding anything for the reader.
>>
>> And submit.
>>
>> 0004:
>>
>> I *really* like the comment block!! This is going to save developers so
>> much time!
>>
>> No changes, please submit!
>>
>> 0005:
>>
>> Perfect. Please submit.
>>
>> 0006:
>>
>> + // TODO: Already checked in cmpOps
>>
>> cmpOps?
>>
>> Nick
>>
>> On 15 May 2014 06:32, Stepan Dyatkovskiy <stpworld at narod.ru
>> <mailto:stpworld at narod.ru>> wrote:
>>
>> Hi Nick,
>> What about rest of patches? I have fixed 0004 and 0005, others seems
>> to be OK for you.. Can I commit them?
>>
>> Thanks!
>> -Stepan
>>
>> Stepan Dyatkovskiy wrote:
>>
>> Hi Nick,
>> Thanks you for reviews. I have corrected for of comments from
>> FAQ to
>> what seems to be regular in LLVM.
>> Patches has been committed as r208173 and r208189.
>>
>> Thanks!
>> -Stepan
>>
>> Nick Lewycky wrote:
>>
>> On 29 April 2014 04:36, Stepan Dyatkovskiy
>> <stpworld at narod.ru <mailto:stpworld at narod.ru>
>> <mailto:stpworld at narod.ru <mailto:stpworld at narod.ru>>> wrote:
>>
>> Hi Nick.
>>
>> I have fixed and reattached all the patches from 0001
>> till 0011.
>>
>> 0001, 0002:
>> I have added detailed order descriptions to the methods
>> declaration.
>> I have also fixed everything as you asked.
>>
>>
>> 0001:
>>
>> + for (uint64_t i = 0; i < NumElementsL; ++i) {
>> vs.
>> + for (unsigned i = 0, e = NumElementsL; i != e; ++i) {
>>
>> This difference is necessary to avoid a warning? I don't
>> actually care
>> which form you use here given that NumElementsL is already
>> an integer
>> there's no recalculation we're avoiding like getting the end
>> iterator
>> each time, I just want it to look consistent. If it's
>> working around a
>> warning that's more important than consistent looks.
>>
>> 0001 looks good. Please commit it.
>>
>> 0002:
>>
>> One note, you use a writing style of asking a question,
>> followed by
>> answering it. For instance, "Will we still get the same
>> numbering? As
>> follows from FunctionComparator::compare(), we do CFG walk,
>> ...". I
>> don't think it's worth your time to rewrite it now, but we
>> generally
>> don't use that construction in llvm's comments, instead
>> preferring to
>> explain what the code does an including an argument that
>> explains why
>> the approach is correct. This amounts to the same thing,
>> it's just a
>> different style.
>>
>> + /// their functions. If right value was meet
>> first during
>> scanning,
>>
>> "was meet" --> "was met"
>>
>> Please commit it.
>>
>> Nick
>>
>>
>> 0003, cmpStrings
>> StringRef::compare performs lexicographical comparison,
>> and it isn't
>> the fastest one.
>> In case, when strings has different length, it still
>> compares their
>> prefixes lexicographically.
>> We need ordering among strings set, but we don't need
>> to compare
>> strings contents lexicographically all the time. So we
>> can compare
>> their sizes first, and do lexicographical comparison
>> only when we
>> really need it: only when strings are equal in size.
>>
>> I have added short version of these explanations into
>> cmpStrings. So
>> we could get really well documented methods based on
>> our posts ;-)
>> Those were good questions and definitely the answers
>> should be in
>> documentation somewhere.
>>
>> 0004: I have added ordering description.
>> 0005: Fixed.
>> 0007: Fixed.
>> 0009:
>>
>> > Okay, I'm confused. We do all this work to produce a
>> 3-way
>> comparison,
>> > then we wrap it up in an operator< for a std::set
>> that will only
>> look at
>> > less-than...
>> Actually, on top level we don't need 3-way comparison.
>> But we do
>> need it to determine whether functions are equal on
>> some stage, just
>> to proceed to the next "less significant" stage.
>> Consider two functions (where BB1 and BB11 are equal)
>>
>> "F1: {BB1, BB2}" and
>> "F2: {BB11, BB3}".
>>
>> First we have to realize that BB1 and BB11 are equal.
>> And then
>> determine which one of BB2 and BB3 is less than another.
>> So finally we got 3-way comparison: less, equal and
>> greater (in fact
>> just neither less nor equal).
>>
>>
>> > When we start detecting equal functions, do we
>> merge them
>> immediately? If so, can't that cause functions already
>> in the set to
>> have their equality comparisons change
>> Yes we merge them immediately. And no, it won't cause
>> situation you
>> have mentioned. Once we changed function X, we lookup
>> for functions
>> that uses X, remove them from FnTree (former FnSet) and
>> put them
>> into Deferred list. Then we fill up worklist with
>> Deferred contents
>> and repeat comparison routine for these functions.
>>
>>
>> Thanks for reviews!
>>
>> -Stepan
>>
>> Stepan Dyatkovskiy wrote:
>>
>> ping
>> Stepan Dyatkovskiy wrote:
>>
>> Hi Nick,
>> Thank you for reviews! Here is my answers for
>> first two
>> patches.
>>
>> Though as additional, I think its important
>> answer to this
>> question
>> first:
>> > Okay, I'm confused. We do all this work to
>> produce a
>> 3-way comparison,
>> > then we wrap it up in an operator< for a
>> std::set that
>> will only
>> look at
>> > less-than...
>> Actually, on top level we don't need 3-way
>> comparison. But
>> we do need it
>> to determine whether functions are equal on
>> some stage just
>> to proceed
>> to the next "less meaningful" stage.
>> Consider two functions
>> "F1: {BB1, BB2}" and
>> "F2: {BB-equal-to-BB1, BB3}".
>> First we have to realize that BB1 and
>> BB-equal-to-BB1 are
>> equal. And
>> then determine which one of BB2 and BB3 is less.
>>
>> First patch.
>>
>> > + /// 1. Bitcastability.
>> > + /// Check whether L's type could be
>> losslessly
>> bitcasted to R's
>> type.
>> > + /// On this stage method, in case when
>> lossless
>> bitcast is not
>> possible
>> > + /// method returns -1 or 1, thus also
>> defining which
>> type is
>> greater in
>> > + /// context of bitcastability.
>> I have added detailed order description for
>> this method. You
>> also can
>> find this description in
>> cmpConstantsComment.txt.
>>
>> > Optional: "Return its pointers comparison
>> result" ->
>> "Return the
>> result
>> > of comparing the types"?
>> Oops! Fixed.
>>
>> > + if (TyL->getTypeID() ==
>> Type::X86_MMXTyID)
>> > + TyLWidth = 64;
>> > + else if (const VectorType *VecTyL =
>> dyn_cast<VectorType>(TyL))
>> > + TyLWidth = VecTyL->getBitWidth();
>> >
>> > Yikes! We never merge function A using a
>> vector and
>> function B
>> using an
>> > x86_mmx type together, right?
>> Fixed!
>>
>> > "Nor of TyL and TyR" -> "Neither Tyl nor
>> TyR".
>> >
>> > Optional: "Return its pointers comparison
>> result" ->
>> "Return the
>> result
>> > of comparing the types"?
>> Fixed.
>>
>> > + case Value::UndefValueVal: return
>> cmpType(L->getType(),
>> R->getType());
>> >
>> > You already have L->getType() and
>> R->getType() in TyL
>> and TyR.
>> Fixed. Sorry.
>>
>> Second patch.
>>
>> > I'm curious whether we guarantee that
>> > we get the same numbering when comparing
>> two functions
>> if I simply
>> > reorder the BasicBlocks in one of the
>> functions, will we
>> still assign
>> > the same numberings or not?
>> ...
>> > Why not just start at the beginning and
>> run to
>> > the end, relying on the SSA dominance
>> property.
>> Good questions!
>>
>> For first one I have added these notes:
>> [quote]
>> As follows from FunctionComparator::compare(),
>> we do CFG
>> walk, we start
>> from entry, and then take each terminator. So
>> it doesn't
>> matter how in
>> fact BBs are ordered in function. And since
>> cmpValues are
>> called during
>> this walk, the numbering depends only on how
>> BBs located
>> inside the CFG.
>> [/quote]
>>
>> As for second question, I have added next notes:
>>
>> [quote]
>> We also can't use dominance properties.
>> Consider situation:
>> If we compare two instruction operands: first
>> is usage of
>> local variable
>> AL from function FL, and second is usage of
>> local variable
>> AR from FR.
>> We are still not able to compare operands of
>> PHI, since
>> those could be
>> operands from further BBs we didn't scan yet.
>> [/quote]
>>
>> You can find comments also in
>> sn_mapLRComment.txt
>> I have also described lexicographical ordering
>> for cmpValues
>> in method
>> comments. I have also put them in separated txt
>> file:
>> cmpValuesComment.txt.
>>
>> > + if (L == R) return 0;
>> > + // Compare C1 and the bitcast result.
>> > + if (int Res = cmpConstants(ConstL,
>> ConstR))
>> > + return Res;
>> > [...]
>> > + return 0;
>> >
>> > Isn't that just:
>> > return cmpConstants(ConstL, ConstR);
>> OOOPS! Fixed.
>>
>> I'm working on rest of patches.
>> Thanks!
>>
>> -Stepan.
>>
>> Nick Lewycky wrote:
>>
>> Stepan Dyatkovskiy wrote:
>>
>> Hello Nick!
>>
>> Can we proceed to work on patch series?
>>
>>
>> Just in case, I have reattached them to
>> this post.
>> Would you prefer to see them in
>> http://reviews.llvm.org ?
>>
>>
>> Oddly, I prefer attachments, but we can use
>> reviews.llvm.org <http://reviews.llvm.org>
>> <http://reviews.llvm.org> if you
>> prefer?
>>
>> 0001-constants-comparison.____patch
>>
>> + /// 1. Bitcastability.
>> + /// Check whether L's type could be
>> losslessly
>> bitcasted to R's type.
>> + /// On this stage method, in case when
>> lossless
>> bitcast is not
>> possible
>> + /// method returns -1 or 1, thus also
>> defining which
>> type is
>> greater in
>> + /// context of bitcastability.
>>
>> Looking at the code, it looks like the
>> order is:
>> 1. first class types, by typeid
>> 2. vectors (by element size? by number
>> of elements?
>> the two
>> multiplied together?)
>>
>> + // Nor of TyL and TyR are values of
>> first class
>> type. Return
>> + // its pointers comparison result.
>>
>> "Nor of TyL and TyR" -> "Neither Tyl nor
>> TyR".
>>
>> Optional: "Return its pointers comparison
>> result" ->
>> "Return the result
>> of comparing the types"?
>>
>> + if (TyL->getTypeID() ==
>> Type::X86_MMXTyID)
>> + TyLWidth = 64;
>> + else if (const VectorType *VecTyL =
>> dyn_cast<VectorType>(TyL))
>> + TyLWidth = VecTyL->getBitWidth();
>>
>> Yikes! We never merge function A using a
>> vector and
>> function B using an
>> x86_mmx type together, right? Note what the
>> LangRef says
>> about x86_mmx:
>> "The operations allowed on it are quite
>> limited:
>> parameters and return
>> values, load and store, and bitcast."
>>
>> + case Value::UndefValueVal: return
>> cmpType(L->getType(),
>> R->getType());
>>
>> You already have L->getType() and
>> R->getType() in TyL
>> and TyR. You also
>> already have cmpType(TyL, TyR) in TypesRes.
>> Please just
>> reuse them in
>> the rest of this function.
>>
>>
>>
>> 0002-values-comparison-____replacement-for-enumerate.____patch
>>
>> + // Assign serial numbers to values
>> from left
>> function, and values
>> from
>> + // right function.
>> + // Explanation:
>> + // Being comparing functions we need to
>> compare
>> values we meet at
>> left and
>> + // right sides.
>> + // Its easy to sort things out for
>> external values.
>> It just should be
>> + // the same value at left and right.
>> + // But for local values (those were
>> introduced inside
>> function body)
>> + // we have to ensure they were
>> introduced at exactly
>> the same place,
>> + // and plays the same role. How to do it?
>> + // Just assign serial number to each
>> value when we
>> meet it first
>> time.
>> + // Values that were met at same place
>> will be with
>> same serial
>> numbers.
>>
>> This explanation is correct, but I'm
>> curious whether we
>> guarantee that
>> we get the same numbering when comparing
>> two functions
>> if I simply
>> reorder the BasicBlocks in one of the
>> functions, will we
>> still assign
>> the same numberings or not?
>>
>> Also, it's not clear from the comment why
>> we need the
>> complexity of
>> numbering them anyways. Why not just start
>> at the
>> beginning and run to
>> the end, relying on the SSA dominance
>> property to
>> guarantee that we
>> visit definitions before uses, and assume
>> we'll visit
>> each value in the
>> same order (look at each instruction and
>> operand in
>> order and just
>> complain about differences)? The answer is
>> that PHI
>> nodes mean we will
>> visit some uses before some definitions.
>>
>> Anyways, changing this text is optional in
>> my mind. I
>> think it could be
>> better, but it's not worth spending much
>> time on it.
>>
>> +int
>> FunctionComparator::cmpValues(____const Value *L,
>> const Value *R) {
>>
>> I wouldn't mind an explanation of the
>> lexographic
>> ordering on this one
>> either. Constants before non-constants,
>> then InlineAsm
>> (ordered by
>> address), then visit order.
>>
>> + if (L == R) return 0;
>> + // Compare C1 and the bitcast result.
>> + if (int Res = cmpConstants(ConstL,
>> ConstR))
>> + return Res;
>> [...]
>> + return 0;
>>
>> Isn't that just:
>> return cmpConstants(ConstL, ConstR);
>> ?
>>
>> 0003-attributes-comparison.____patch
>>
>> +int
>> FunctionComparator::____cmpStrings(StringRef L,
>> StringRef R) const {
>> + // Prevent heavy comparison, compare
>> sizes first.
>> + if (int Res = cmpNumbers(L.size(),
>> R.size()))
>> + return Res;
>> +
>> + return L.compare(R);
>> +}
>>
>> Optional: I'm curious whether that's
>> actually worth it?
>> I checked
>> StringRef and it does the length comparison
>> last. It
>> could be that you
>> know that your strings tend to be longer
>> than StringRef
>> which tries to
>> be efficient in general.
>>
>> 0004-operations-comparison.____patch
>>
>> +int
>> FunctionComparator::____cmpOperation(const
>> Instruction *L,
>>
>> Optional: Again, a description of the
>> ordering would be
>> interesting.
>> Opcode number, then number of operands,
>> then type, then
>> 'nsw/nuw/exact/volatile'.
>>
>> 0005-GEP-comparison.patch
>>
>> +int FunctionComparator::cmpGEP(____const
>> GEPOperator
>> *GEPL,
>> + const
>> GEPOperator
>> *GEPR) {
>>
>> + unsigned int ASL =
>> GEPL->getPointerAddressSpace()____;
>>
>> Extra line after {
>>
>> if (DL) {
>> [...]
>> + unsigned BitWidth = DL ?
>> DL->getPointerSizeInBits(ASL) : 1;
>>
>> BitWidth doesn't need to test DL for
>> null-ness.
>>
>> 0006-top-level-compare-method.____patch
>>
>> Perfect!
>>
>> 0007-sanity-check.patch
>>
>> + // Check symmetricity.
>>
>> "symmetricity" -> "symmetry".
>>
>> 0008-removed-unused-methods.____patch
>>
>> Perfect
>>
>> 0009-FnSet-replaced-with-____FnTree.patch
>>
>> Okay, I'm confused. We do all this work to
>> produce a
>> 3-way comparison,
>> then we wrap it up in an operator< for a
>> std::set that
>> will only look at
>> less-than, but never uses the equals
>> result. We want to
>> do one
>> comparison between two functions, not two
>> (ie., std::set
>> does
>> func1<func2 and func2<func1 to see whether
>> they're equal
>> or less than,
>> and we don't need that since cmp(func1,
>> func2) tells us
>> directly). Maybe
>> this comes in a future patch?
>>
>> One other question, I should probably know
>> the answer to
>> this but it's
>> been a while. :) When we start detecting
>> equal
>> functions, do we merge
>> them immediately? If so, can't that cause
>> functions
>> already in the set
>> to have their equality comparisons change
>> (ie., A calls
>> X, B calls Y, A
>> and B are in the set, then we merge X and Y
>> for being
>> equal, now we've
>> changed A and B to both called the same
>> merged function,
>> so they start
>> to compare the same). Sets tend not to like
>> having their
>> comparison
>> functions change on elements that they
>> contain.
>>
>> 0010-updated-comments.patch
>>
>> Okay.
>>
>> 0011-removed-DenseMap-helpers.____patch
>>
>> Yay deleting dead code!
>>
>> Nick
>>
>>
>> Thanks!
>> -Stepan.
>>
>> Stepan Dyatkovskiy wrote:
>>
>> ping
>> Stepan Dyatkovskiy wrote:
>>
>> Hi Nick,
>>
>> Please find the fixed patches
>> in attachment.
>> Series starts from "constants
>> comparison".
>>
>> Below small report of what has
>> been fixed,
>> and answers on your
>> questions.
>>
>> cmpTypes:
>> > Please add a comment for
>> this method.
>> Include the meaning of the
>> returned value. ("man strcmp"
>> for
>> inspiration.)
>> Fixed and committed. So you can
>> look in
>> trunk, may be I forgot to do
>> something (hope not :-) )
>>
>> checkForLosslessBitcast,
>> cmpConstants:
>> > Why isn't this using cmpType?
>> Fixed.
>>
>> > Please put the else on the
>> same line as
>> the closing brace.
>> Fixed.
>>
>> >> + else if (const VectorType
>> *thisPTy =
>> dyn_cast<VectorType>(L))
>> > Missing initial capital
>> Sorry. Fixed. Actually these
>> typos has been
>> cloned from
>> Type::canLosslesslyBitCastTo.
>>
>> >> + return
>> cmpNumbers((uint64_t)L,
>> (uint64_t)R);
>> >>
>> > Please assert on unknown
>> constant.
>> That's possible. But what if we
>> really got
>> unknown constant? New
>> comming
>> constant types merely depends on
>> MergeFunctions implementation.
>> So we
>> get crash if it will happen.
>> And we loose
>> nothing comparing them
>> just as
>> pointers. So, do we really
>> need an
>> assertion? Currently I kept it
>> as it
>> was. If your answer is "yes, we
>> need it", it
>> would easy to add it:
>>
>> case Value::FunctionVal:
>> case Value::GlobalVariableVal:
>> - case Value::GlobalAliasVal:
>> - default: // Unknown constant
>> - return cmpNumbers((uint64_t)L,
>> (uint64_t)R);
>> + case Value::GlobalAliasVal:
>> + return cmpNumbers((uint64_t)L,
>> (uint64_t)R);
>> + default:
>> + llvm_unreachable("Unknown
>> constant.");
>>
>> About variable names: Var1,
>> Var2 vs
>> VarL,VarR.
>> I think its better to use L/R
>> concept. Easer
>> to understand what to
>> return (-1, or 1) when you see
>> L/R rather
>> than Var1/Var2.
>> Var1/Var2 has been kept for
>> functions that
>> are to be removed till the
>> end of patch series.
>> F1 and F2 names were also kept
>> till the "top
>> level compare method"
>> patch.
>>
>> > Alternatively, any reason
>> not to have
>> cmpPointers(const void *L,
>> const void *R)?
>> I could do it. I just wanted to
>> show that
>> pointers are compared
>> like a
>> numbers in some situations.
>> Actually we do
>> it in cases when we have
>> absolutely no idea of smarter
>> comparison.
>> And back to cmpPonters. Its
>> rather about
>> what intuition tells. If
>> pointers are equal as numbers,
>> could they be
>> different somehow? Could
>> they be non-castable to numbers
>> on some
>> platforms? The last one is
>> big
>> trouble for DenseMapInfo
>> implementation..
>> If there is any (even very
>> small)
>> possibility of such cases - then
>> yes,
>> I vote for cmpPointers. The
>> last word is up
>> to you anyway.
>>
>> Attributes (0005):
>> > Attributes already have
>> operator< and
>> operator==. Please reuse
>> them.
>> Fixed. I used simple "if":
>>
>> if (LA < RA)
>> return -1;
>> if (RA < LA)
>> return 1;
>>
>> Though its possible to use:
>>
>> if (int Res = (LA < RA) ? -1 :
>> (RA < LA) ? 1
>> : 0)
>> return Res;
>>
>> Which one is more preferable?
>>
>> cmpGEP (0007):
>> > + int cmpGEP(const
>> GEPOperator *GEP1,
>> const GEPOperator *GEP2);
>> > + int cmpGEP(const
>> GetElementPtrInst
>> *GEP1,
>> > + const GetElementPtrInst
>> *GEP2) {
>> >
>> > Const members?
>> We can't make it constant,
>> since it compares
>> values (cmpValues, aka
>> enumerate). So, if we see Value
>> first time,
>> we have to add it into
>> sn_mapL/R.
>>
>> > I think you should put this
>> under if (DL)
>> { /* declare Offset1,
>> Offset2, etc. */ }
>> Fixed.
>>
>> About 0008:
>> > Did you mean "cmpOperation"?
>> Yes. We could keep it in one
>> place. I have
>> fixed this case and
>> removed
>> TODO.
>>
>> > "compare" still returns
>> "bool". I'm going
>> to assume this was
>> meant to
>> go in 0009.
>> Yes.
>>
>> About 0011:
>> > std::set is frowned upon, see
>>
>> http://llvm.org/docs/____ProgrammersManual.html#set
>> <http://llvm.org/docs/__ProgrammersManual.html#set>
>>
>> <http://llvm.org/docs/__ProgrammersManual.html#set
>> <http://llvm.org/docs/ProgrammersManual.html#set>>
>> . Do you actually
>> rely
>> on the stable iterator
>> guarantee? Or would
>> another set-like container
>> be a
>> better fit?
>> Huh.. I have looked for
>> alternatives.
>> Unfortunately SmallSet is not
>> suitable, since we need to
>> lookup existing
>> items, and SmallSet::find
>> method is not present. May be I
>> have missed
>> something.
>> Actually we need binary tree
>> implementation.
>> Do we have better
>> analogue?
>> I could implement new one.
>> Think it should
>> be just one more patch
>> afterwards.
>>
>> >No. An identifier with
>> underscore then
>> capital letter is reserved
>> for
>> the implementation. You can
>> just call them
>> "F" and "DL", then the
>> ctor
>> initializer list can be "F(F),
>> DL(DL)" and
>> that will work fine.
>> Fixed.
>>
>> 0013:
>> > Sorry, I'm not sure what
>> this sentence
>> means? The thing your
>> example
>> is talking about is showing is
>> that two
>> functions in an SCC may be
>> equivalent, and detecting them
>> requires
>> "blinding" (ignoring) certain
>> details of the function when
>> you do the
>> comparison. We blind
>> ourselves
>> to the pointee types, but not
>> to callees of
>> functions in the same
>> SCC.
>>
>> I meant generalising of
>> cross-reference case:
>> in old implementation, while
>> comparing F and
>> G functions, we treat as
>> equal case when F uses G, with
>> case when G
>> uses F (at the same
>> place).
>> It could happen that F uses G1,
>> while G1
>> uses G2, while G2 uses F.
>> So it
>> still the same. But currently
>> we have to
>> invent way how to detect
>> such
>> cases.
>>
>> Thanks for reviews!
>> -Stepan
>>
>> Stepan Dyatkovskiy wrote:
>>
>> Hi Nick,
>> I have committed 0001 as
>> r203788.
>> I'm working on fixes for
>> 0002 - 0014.
>>
>> > After reading through
>> this patch
>> series, I feel like I'm
>> missing
>> > something important.
>> Where's the sort
>> function? It looks like
>> we're
>> > still comparing all
>> functions to all
>> other functions.
>> When you insert functions
>> into std::set
>> or its analo g s it does all
>> the
>> job for you. Since
>> internally it builds
>> a binary tree using "less"
>> comparison, and each
>> insert/look-up
>> operation takes O(log(N))
>> time.
>>
>> -Stepan.
>>
>> Nick Lewycky wrote:
>>
>> On 27 February 2014
>> 08:23, Stepan
>> Dyatkovskiy
>> <stpworld at narod.ru <mailto:stpworld at narod.ru>
>>
>> <mailto:stpworld at narod.ru <mailto:stpworld at narod.ru>>
>>
>> <mailto:stpworld at narod.ru <mailto:stpworld at narod.ru>
>>
>> <mailto:stpworld at narod.ru <mailto:stpworld at narod.ru>>>>
>> wrote:
>>
>> Hi Nick,
>>
>> I tried to rework
>> changes as you
>> requested. One of
>> patches (0004
>> with extra assertions)
>> has been
>> removed.
>>
>>
>> > + bool
>> isEquivalentType(Type
>> *Ty1, Type *Ty2) const {
>> > + return
>> cmpType(Ty1, Ty2) == 0;
>> > + }
>> >
>> > Why do we still need
>> isEquivalentType? Can
>> we nuke this?
>> Yup. After applying all
>> the patches
>> isEquivalentType will be
>> totally
>> replaced with cmpType.
>> All isEqXXXX
>> and friends will be
>> removed in
>> 0011 (old 0012). No
>> traces left.
>> Old function wasn't
>> removed in 0001
>> just for keeping patches
>> without
>> extra noise like:
>>
>> - something that uses
>> isEquivalentType
>> + something that uses
>> cmpType
>>
>> The point is, that
>> "something" that
>> uses isEquivalentType,
>> will be
>> also replaced with one
>> of next
>> patches in this series.
>>
>>
>> >
>> > +static int
>> cmpNumbers(uint64_t
>> L, uint64_t R) {
>> > + if (L < R)
>> return -1;
>> > + if (L > R)
>> return 1;
>> > + return 0;
>> > +}
>> >
>> > At a high level, you
>> don't
>> actually need a <=>
>> operator to use a
>> sort. A
>> > strict ordering ( <
>> operator) is
>> sufficient. (Note
>> that for
>> mergefunc, a
>> > strict weak ordering
>> is not
>> sufficient, it must be
>> a total
>> ordering.)
>> That could be done
>> with int
>>
>> FunctionComparator::compare(). We can
>> replace it with bool
>>
>> FunctionComparator::less(). Though
>> for all
>> other cmp methods need
>> at least 2
>> bits of information as
>> result:
>> 1. Whether things are
>> equal.
>> 2. Whether left less
>> than right.
>>
>> As for
>>
>> FunctionComparator::compare(),
>> conversion into
>> less() will
>> increase time of sanity
>> check (patch
>> #0010).
>> Sanity check is just a
>> sweet bonus.
>> It checks that ordering
>> implemented properly
>> (checks order
>> relation properties).
>> Turning compare() into
>> less() mean,
>> that we'll have to run
>> comparison two times:
>> L.less(R) and
>> R.less(L). But may be
>> sanity
>> check is not a thing to
>> be published
>> at all.
>>
>>
>> >
>> > Consider hoisting
>> this inside the
>> FunctionComparator
>> class? That
>> class
>> > should have a
>> bunch of
>> implementations of
>> comparisons
>> between
>> various
>> > different things,
>> which can pass
>> down to other methods
>> in the
>> same class.
>> In new patch series
>> attached to this
>> post, I have moved all
>> static
>> methods into
>> FunctionComparator.
>>
>>
>> > + // Replacement for
>>
>> type::canLosslesslyBitCastTo, that
>> > + // establish order
>> relation on
>> this kind of properties.
>> > + int
>>
>> checkForLosslessBitcast(const Type
>> *L, const Type *R);
>> >
>> > Type:: not type:: .
>> Please make
>> this comment more
>> descriptive.
>> Done.
>> [new comment]
>> Replacement for
>>
>> Type::canLosslesslyBitCastTo, that
>>
>> establish order
>> relation on this
>> kind of properties
>> Returns 0, if L and R
>> types could be
>> converted to each other
>> without
>> reinterpretation of
>> bits.
>> Otherwise method
>> returns -1 or 1,
>> defining total ordering
>> between
>> types in context of
>> lossless
>> bitcastability trait.
>> E.g.: if L is less than
>> R (result is
>> -1), than every type
>> that
>> could be
>> losslessly bitcasted to
>> L is less
>> than R.
>> [/new comment]
>>
>>
>> >
>> > + /// Replacement
>> for C1 ==
>> getBitCast(C2, C1Ty)
>> > + /// Its more
>> controllable, and
>> supposed to be
>> simpler and
>> more
>> > predictionable.
>> > + /// As very
>> important
>> advantage: it allows to
>> introduce
>> order
>> relation on
>> > + /// constants set,
>> and thus use
>> it as trait in
>> refinement
>> routines.
>> >
>> > "Its" --> "It's".
>> "predictionable" -->
>> "predictable".
>> And how is
>> it more
>> > predictable? I think
>> this comment
>> would be better if it
>> described the
>> > function instead of
>> making
>> comparisons between it
>> and other
>> functions.
>> > Something like,
>> "Compare
>> constants under a
>> system where
>> pointer
>> to X and
>> > pointer to Y are
>> considered
>> equal" or whatever is
>> actually true
>> here.
>> Done.
>> [new comment]
>>
>> Replacement for C1 ==
>> getBitCast(C2,
>> C1Ty)
>> Parses constants
>> contents, assuming
>> that types are
>> losslessly
>> bitcasted between each
>> other. So
>> actually it ignores
>> types and only
>> compares bits from L
>> and R.
>> Returns 0, if L and R
>> has equivalent
>> content.
>> -1 or 1 if values are
>> different.
>> Maintaining total
>> ordering
>> requires
>> two values that
>> indicates
>> non-equivalence (L less
>> R, L
>> greater R).
>> [/new comment]
>>
>>
>> >
>> > +enum ConstantType {
>> > I'm not sure that
>> this buys you
>> much. All the "isa"
>> tests can be
>> broken
>> > down into a switch on
>> getValueID() with the
>> one
>> exception of
>> isNullValue().
>> Done.
>>
>>
>> > + assert(
>> > +
>>
>>
>> C1->getType()->______canLosslesslyBitCastTo(C2->______getType())
>> &&
>> > + "Pass is
>> healthless.
>> checkForLosslessBitcast
>> should be
>> twin of "
>> > +
>> "canLosslesslyBitCastTo method,
>> except the case when the
>> last one "
>> > + "returns false,
>> the first one
>> should return -1 or 1");
>> ...
>>
>> > I think we can skip
>> the asserts
>> here. They aren't
>> detecting a
>> specific
>> > bug, they're
>> checking whether the
>> new code does a certain
>> task
>> relative
>> > to the old code.
>> Drop the old
>> code, your new code is
>> the new
>> sheriff in
>> > town.
>> Asserts has been
>> removed.
>>
>>
>> >
>> > + DenseMap<const
>> Value*, int>
>> sn_map1, sn_map2;
>> >
>> > What is sn short for
>> ("seen")?
>> Why are there two of
>> these?
>> Serial number :-)
>>
>> >
>> > +
>> std::pair<DenseMap<const Value
>> *, int>::iterator, bool>
>> > + LeftSN =
>>
>> sn_map1.insert(std::make_pair(______V1,
>> sn_map1.size())),
>> > + RightSN =
>>
>> sn_map2.insert(std::make_pair(______V2,
>> sn_map2.size()));
>> >
>> > So I think I get it,
>> this is easy
>> to reason about. You
>> number
>> each value
>> > going down both
>> functions. But I
>> think you can eliminate
>> one of
>> these
>> > maps because you
>> know that if the
>> left and right were ever
>> different
>> > then we terminate
>> the comparison
>> immediately. You can at
>> least
>> share one
>> > map with both V1 and
>> V2, but I
>> think you can reduce
>> the number
>> of map
>> > insertions.
>> Not sure. Consider that
>> in left you
>> have:
>> %A = alloca i32
>> %B = alloca i32
>> store i32 1, i32* %A
>>
>> And in right:
>> %A = alloca i32
>> %B = alloca i32
>> store i32 1, i32* %B
>>
>> When we meet 'store'
>> instruction we
>> need to check in which
>> order %A
>> was allocated at left
>> and in which
>> order %B was allocated
>> at right.
>> So for each value in
>> function we
>> have to assign serial
>> number. Then
>> comparing to local
>> values from
>> different functions we
>> can just
>> check
>> their serial numbers.
>> Though, may be I missed
>> something?
>> May be there is another
>> way to
>> determine "who was
>> first".
>>
>>
>> > High-level
>> question, are
>> attributes ordered? Your
>> comparison is
>> ordered.
>> > So if I have
>> function F with
>> string attributes
>> "sse2",
>> "gc=hemisphere"
>> > and another function
>> G with
>> string attributes
>> "gc=hemisphere",
>> "sse2"
>> > then they would be
>> considered
>>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0006-top-level-comparisons.patch
Type: text/x-diff
Size: 9942 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140605/e8b7d2d0/attachment.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0007-sanity-check.patch
Type: text/x-diff
Size: 3933 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140605/e8b7d2d0/attachment-0001.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0008-removed-unused-methods.patch
Type: text/x-diff
Size: 2231 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140605/e8b7d2d0/attachment-0002.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0009-FnSet-replaced-with-FnTree.patch
Type: text/x-diff
Size: 6834 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140605/e8b7d2d0/attachment-0003.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0010-updated-comments.patch
Type: text/x-diff
Size: 3246 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140605/e8b7d2d0/attachment-0004.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0011-removed-DenseMap-helpers.patch
Type: text/x-diff
Size: 4629 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140605/e8b7d2d0/attachment-0005.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: apply-patches.sh
Type: application/x-sh
Size: 355 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140605/e8b7d2d0/attachment.sh>
More information about the llvm-commits
mailing list