<div dir="ltr"><div class="gmail_extra">0003:</div><div class="gmail_extra"><br></div><div class="gmail_extra">+  // Mostly cloned from BitcodeWriter, but simplified a bit.<br></div><div class="gmail_extra"><br></div><div class="gmail_extra">

Please remove that comment, it isn't adding anything for the reader.</div><div class="gmail_extra"><br></div><div class="gmail_extra">And submit.</div><div class="gmail_extra"><br></div><div class="gmail_extra">0004:</div>

<div class="gmail_extra"><br></div><div class="gmail_extra">I *really* like the comment block!! This is going to save developers so much time!</div><div class="gmail_extra"><br></div><div class="gmail_extra">No changes, please submit!</div>

<div class="gmail_extra"><br></div><div class="gmail_extra">0005:</div><div class="gmail_extra"><br></div><div class="gmail_extra">Perfect. Please submit.</div><div class="gmail_extra"><br></div><div class="gmail_extra">
0006:</div>
<div class="gmail_extra"><br></div><div class="gmail_extra">+        // TODO: Already checked in cmpOps<br></div><div class="gmail_extra"><br></div><div class="gmail_extra">cmpOps?</div><div class="gmail_extra"><br></div>

<div class="gmail_extra">Nick</div><div class="gmail_extra"><div><br></div><div class="gmail_quote">On 15 May 2014 06:32, Stepan Dyatkovskiy <span dir="ltr"><<a href="mailto:stpworld@narod.ru" target="_blank">stpworld@narod.ru</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Hi Nick,<br>
What about rest of patches? I have fixed 0004 and 0005, others seems to be OK for you.. Can I commit them?<br>
<br>
Thanks!<br>
-Stepan<br>
<br>
Stepan Dyatkovskiy wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
Hi Nick,<br>
Thanks you for reviews. I have corrected for of comments from FAQ to<br>
what seems to be regular in LLVM.<br>
Patches has been committed as r208173 and r208189.<br>
<br>
Thanks!<br>
-Stepan<br>
<br>
Nick Lewycky wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
On 29 April 2014 04:36, Stepan Dyatkovskiy <<a href="mailto:stpworld@narod.ru" target="_blank">stpworld@narod.ru</a><br>
<mailto:<a href="mailto:stpworld@narod.ru" target="_blank">stpworld@narod.ru</a>>> wrote:<br>
<br>
    Hi Nick.<br>
<br>
    I have fixed and reattached all the patches from 0001 till 0011.<br>
<br>
    0001, 0002:<br>
    I have added detailed order descriptions to the methods declaration.<br>
    I have also fixed everything as you asked.<br>
<br>
<br>
0001:<br>
<br>
+    for (uint64_t i = 0; i < NumElementsL; ++i) {<br>
vs.<br>
+    for (unsigned i = 0, e = NumElementsL; i != e; ++i) {<br>
<br>
This difference is necessary to avoid a warning? I don't actually care<br>
which form you use here given that NumElementsL is already an integer<br>
there's no recalculation we're avoiding like getting the end iterator<br>
each time, I just want it to look consistent. If it's working around a<br>
warning that's more important than consistent looks.<br>
<br>
0001 looks good. Please commit it.<br>
<br>
0002:<br>
<br>
One note, you use a writing style of asking a question, followed by<br>
answering it. For instance, "Will we still get the same numbering? As<br>
follows from FunctionComparator::compare(), we do CFG walk, ...". I<br>
don't think it's worth your time to rewrite it now, but we generally<br>
don't use that construction in llvm's comments, instead preferring to<br>
explain what the code does an including an argument that explains why<br>
the approach is correct. This amounts to the same thing, it's just a<br>
different style.<br>
<br>
+  ///          their functions. If right value was meet first during<br>
scanning,<br>
<br>
"was meet" --> "was met"<br>
<br>
Please commit it.<br>
<br>
Nick<br>
<br>
<br>
    0003, cmpStrings<br>
    StringRef::compare performs lexicographical comparison, and it isn't<br>
    the fastest one.<br>
    In case, when strings has different length, it still compares their<br>
    prefixes lexicographically.<br>
    We need ordering among strings set, but we don't need to compare<br>
    strings contents lexicographically all the time. So we can compare<br>
    their sizes first, and do lexicographical comparison only when we<br>
    really need it: only when strings are equal in size.<br>
<br>
    I have added short version of these explanations into cmpStrings. So<br>
    we could get really well documented methods based on our posts ;-)<br>
    Those were good questions and definitely the answers should be in<br>
    documentation somewhere.<br>
<br>
    0004: I have added ordering description.<br>
    0005: Fixed.<br>
    0007: Fixed.<br>
    0009:<br>
<br>
     > Okay, I'm confused. We do all this work to produce a 3-way<br>
    comparison,<br>
     > then we wrap it up in an operator< for a std::set that will only<br>
    look at<br>
     > less-than...<br>
    Actually, on top level we don't need 3-way comparison. But we do<br>
    need it to determine whether functions are equal on some stage, just<br>
    to proceed to the next "less significant" stage.<br>
    Consider two functions (where BB1 and BB11 are equal)<br>
<br>
    "F1: {BB1, BB2}" and<br>
    "F2: {BB11, BB3}".<br>
<br>
    First we have to realize that BB1 and BB11 are equal. And then<br>
    determine which one of BB2 and BB3 is less than another.<br>
    So finally we got 3-way comparison: less, equal and greater (in fact<br>
    just neither less nor equal).<br>
<br>
<br>
     >  When we start detecting equal functions, do we merge them<br>
    immediately? If so, can't that cause functions already in the set to<br>
    have their equality comparisons change<br>
    Yes we merge them immediately. And no, it won't cause situation you<br>
    have mentioned. Once we changed function X, we lookup for functions<br>
    that uses X, remove them from FnTree (former FnSet) and put them<br>
    into Deferred list. Then we fill up worklist with Deferred contents<br>
    and repeat comparison routine for these functions.<br>
<br>
<br>
    Thanks for reviews!<br>
<br>
    -Stepan<br>
<br>
    Stepan Dyatkovskiy wrote:<br>
<br>
        ping<br>
        Stepan Dyatkovskiy wrote:<br>
<br>
            Hi Nick,<br>
            Thank you for reviews! Here is my answers for first two<br>
patches.<br>
<br>
            Though as additional, I think its important answer to this<br>
            question<br>
            first:<br>
              > Okay, I'm confused. We do all this work to produce a<br>
            3-way comparison,<br>
              > then we wrap it up in an operator< for a std::set that<br>
            will only<br>
            look at<br>
              > less-than...<br>
            Actually, on top level we don't need 3-way comparison. But<br>
            we do need it<br>
            to determine whether functions are equal on some stage just<br>
            to proceed<br>
            to the next "less meaningful" stage.<br>
            Consider two functions<br>
            "F1: {BB1, BB2}" and<br>
            "F2: {BB-equal-to-BB1, BB3}".<br>
            First we have to realize that BB1 and BB-equal-to-BB1 are<br>
            equal. And<br>
            then determine which one of BB2 and BB3 is less.<br>
<br>
            First patch.<br>
<br>
              > +  /// 1. Bitcastability.<br>
              > +  /// Check whether L's type could be losslessly<br>
            bitcasted to R's<br>
            type.<br>
              > +  /// On this stage method, in case when lossless<br>
            bitcast is not<br>
            possible<br>
              > +  /// method returns -1 or 1, thus also defining which<br>
            type is<br>
            greater in<br>
              > +  /// context of bitcastability.<br>
            I have added detailed order description for this method. You<br>
            also can<br>
            find this description in cmpConstantsComment.txt.<br>
<br>
              > Optional: "Return its pointers comparison result" -><br>
            "Return the<br>
            result<br>
              > of comparing the types"?<br>
            Oops! Fixed.<br>
<br>
              > +    if (TyL->getTypeID() == Type::X86_MMXTyID)<br>
              > +      TyLWidth = 64;<br>
              > +    else if (const VectorType *VecTyL =<br>
            dyn_cast<VectorType>(TyL))<br>
              > +      TyLWidth = VecTyL->getBitWidth();<br>
              ><br>
              > Yikes! We never merge function A using a vector and<br>
            function B<br>
            using an<br>
              > x86_mmx type together, right?<br>
            Fixed!<br>
<br>
              > "Nor of TyL and TyR" -> "Neither Tyl nor TyR".<br>
              ><br>
              > Optional: "Return its pointers comparison result" -><br>
            "Return the<br>
            result<br>
              > of comparing the types"?<br>
            Fixed.<br>
<br>
              > +  case Value::UndefValueVal: return<br>
cmpType(L->getType(),<br>
            R->getType());<br>
              ><br>
              > You already have L->getType() and R->getType() in TyL<br>
            and TyR.<br>
            Fixed. Sorry.<br>
<br>
            Second patch.<br>
<br>
              > I'm curious whether we guarantee that<br>
              > we get the same numbering when comparing two functions<br>
            if I simply<br>
              > reorder the BasicBlocks in one of the functions, will we<br>
            still assign<br>
              > the same numberings or not?<br>
            ...<br>
              > Why not just start at the beginning and run to<br>
              > the end, relying on the SSA dominance property.<br>
            Good questions!<br>
<br>
            For first one I have added these notes:<br>
            [quote]<br>
            As follows from FunctionComparator::compare(), we do CFG<br>
            walk, we start<br>
            from entry, and then take each terminator. So it doesn't<br>
            matter how in<br>
            fact BBs are ordered in function. And since cmpValues are<br>
            called during<br>
            this walk, the numbering depends only on how BBs located<br>
            inside the CFG.<br>
            [/quote]<br>
<br>
            As for second question, I have added next notes:<br>
<br>
            [quote]<br>
            We also can't use dominance properties. Consider situation:<br>
            If we compare two instruction operands: first is usage of<br>
            local variable<br>
            AL from function FL, and second is usage of local variable<br>
            AR from FR.<br>
            We are still not able to compare operands of PHI, since<br>
            those could be<br>
            operands from further BBs we didn't scan yet.<br>
            [/quote]<br>
<br>
            You can find comments also in sn_mapLRComment.txt<br>
            I have also described lexicographical ordering for cmpValues<br>
            in method<br>
            comments. I have also put them in separated txt file:<br>
            cmpValuesComment.txt.<br>
<br>
              > +    if (L == R) return 0;<br>
              > +    // Compare C1 and the bitcast result.<br>
              > +    if (int Res = cmpConstants(ConstL, ConstR))<br>
              > +      return Res;<br>
              > [...]<br>
              > +    return 0;<br>
              ><br>
              > Isn't that just:<br>
              >    return cmpConstants(ConstL, ConstR);<br>
            OOOPS! Fixed.<br>
<br>
            I'm working on rest of patches.<br>
            Thanks!<br>
<br>
            -Stepan.<br>
<br>
            Nick Lewycky wrote:<br>
<br>
                Stepan Dyatkovskiy wrote:<br>
<br>
                    Hello Nick!<br>
<br>
                    Can we proceed to work on patch series?<br>
<br>
<br>
                    Just in case, I have reattached them to this post.<br>
                    Would you prefer to see them in<br>
                    <a href="http://reviews.llvm.org" target="_blank">http://reviews.llvm.org</a> ?<br>
<br>
<br>
                Oddly, I prefer attachments, but we can use<br>
                <a href="http://reviews.llvm.org" target="_blank">reviews.llvm.org</a> <<a href="http://reviews.llvm.org" target="_blank">http://reviews.llvm.org</a>> if you<br>
                prefer?<br>
<br>
                0001-constants-comparison.__<u></u>patch<br>
<br>
                +  /// 1. Bitcastability.<br>
                +  /// Check whether L's type could be losslessly<br>
                bitcasted to R's type.<br>
                +  /// On this stage method, in case when lossless<br>
                bitcast is not<br>
                possible<br>
                +  /// method returns -1 or 1, thus also defining which<br>
                type is<br>
                greater in<br>
                +  /// context of bitcastability.<br>
<br>
                Looking at the code, it looks like the order is:<br>
                    1. first class types, by typeid<br>
                    2. vectors (by element size? by number of elements?<br>
                the two<br>
                multiplied together?)<br>
<br>
                +      // Nor of TyL and TyR are values of first class<br>
                type. Return<br>
                +      // its pointers comparison result.<br>
<br>
                "Nor of TyL and TyR" -> "Neither Tyl nor TyR".<br>
<br>
                Optional: "Return its pointers comparison result" -><br>
                "Return the result<br>
                of comparing the types"?<br>
<br>
                +    if (TyL->getTypeID() == Type::X86_MMXTyID)<br>
                +      TyLWidth = 64;<br>
                +    else if (const VectorType *VecTyL =<br>
                dyn_cast<VectorType>(TyL))<br>
                +      TyLWidth = VecTyL->getBitWidth();<br>
<br>
                Yikes! We never merge function A using a vector and<br>
                function B using an<br>
                x86_mmx type together, right? Note what the LangRef says<br>
                about x86_mmx:<br>
                "The operations allowed on it are quite limited:<br>
                parameters and return<br>
                values, load and store, and bitcast."<br>
<br>
                +  case Value::UndefValueVal: return<br>
cmpType(L->getType(),<br>
                R->getType());<br>
<br>
                You already have L->getType() and R->getType() in TyL<br>
                and TyR. You also<br>
                already have cmpType(TyL, TyR) in TypesRes. Please just<br>
                reuse them in<br>
                the rest of this function.<br>
<br>
<br>
0002-values-comparison-__<u></u>replacement-for-enumerate.__<u></u>patch<br>
<br>
                +  // Assign serial numbers to values from left<br>
                function, and values<br>
                from<br>
                +  // right function.<br>
                +  // Explanation:<br>
                +  // Being comparing functions we need to compare<br>
                values we meet at<br>
                left and<br>
                +  // right sides.<br>
                +  // Its easy to sort things out for external values.<br>
                It just should be<br>
                +  // the same value at left and right.<br>
                +  // But for local values (those were introduced inside<br>
                function body)<br>
                +  // we have to ensure they were introduced at exactly<br>
                the same place,<br>
                +  // and plays the same role. How to do it?<br>
                +  // Just assign serial number to each value when we<br>
                meet it first<br>
                time.<br>
                +  // Values that were met at same place will be with<br>
                same serial<br>
                numbers.<br>
<br>
                This explanation is correct, but I'm curious whether we<br>
                guarantee that<br>
                we get the same numbering when comparing two functions<br>
                if I simply<br>
                reorder the BasicBlocks in one of the functions, will we<br>
                still assign<br>
                the same numberings or not?<br>
<br>
                Also, it's not clear from the comment why we need the<br>
                complexity of<br>
                numbering them anyways. Why not just start at the<br>
                beginning and run to<br>
                the end, relying on the SSA dominance property to<br>
                guarantee that we<br>
                visit definitions before uses, and assume we'll visit<br>
                each value in the<br>
                same order (look at each instruction and operand in<br>
                order and just<br>
                complain about differences)? The answer is that PHI<br>
                nodes mean we will<br>
                visit some uses before some definitions.<br>
<br>
                Anyways, changing this text is optional in my mind. I<br>
                think it could be<br>
                better, but it's not worth spending much time on it.<br>
<br>
                +int FunctionComparator::cmpValues(<u></u>__const Value *L,<br>
                const Value *R) {<br>
<br>
                I wouldn't mind an explanation of the lexographic<br>
                ordering on this one<br>
                either. Constants before non-constants, then InlineAsm<br>
                (ordered by<br>
                address), then visit order.<br>
<br>
                +    if (L == R) return 0;<br>
                +    // Compare C1 and the bitcast result.<br>
                +    if (int Res = cmpConstants(ConstL, ConstR))<br>
                +      return Res;<br>
                [...]<br>
                +    return 0;<br>
<br>
                Isn't that just:<br>
                    return cmpConstants(ConstL, ConstR);<br>
                ?<br>
<br>
                0003-attributes-comparison.__<u></u>patch<br>
<br>
                +int FunctionComparator::__<u></u>cmpStrings(StringRef L,<br>
                StringRef R) const {<br>
                +  // Prevent heavy comparison, compare sizes first.<br>
                +  if (int Res = cmpNumbers(L.size(), R.size()))<br>
                +    return Res;<br>
                +<br>
                +  return L.compare(R);<br>
                +}<br>
<br>
                Optional: I'm curious whether that's actually worth it?<br>
                I checked<br>
                StringRef and it does the length comparison last. It<br>
                could be that you<br>
                know that your strings tend to be longer than StringRef<br>
                which tries to<br>
                be efficient in general.<br>
<br>
                0004-operations-comparison.__<u></u>patch<br>
<br>
                +int FunctionComparator::__<u></u>cmpOperation(const<br>
                Instruction *L,<br>
<br>
                Optional: Again, a description of the ordering would be<br>
                interesting.<br>
                Opcode number, then number of operands, then type, then<br>
                'nsw/nuw/exact/volatile'.<br>
<br>
                0005-GEP-comparison.patch<br>
<br>
                +int FunctionComparator::cmpGEP(__<u></u>const GEPOperator<br>
*GEPL,<br>
                +                               const GEPOperator<br>
*GEPR) {<br>
<br>
                +  unsigned int ASL = GEPL->getPointerAddressSpace()<u></u>__;<br>
<br>
                Extra line after {<br>
<br>
                     if (DL) {<br>
                [...]<br>
                +    unsigned BitWidth = DL ?<br>
                DL->getPointerSizeInBits(ASL) : 1;<br>
<br>
                BitWidth doesn't need to test DL for null-ness.<br>
<br>
                0006-top-level-compare-method.<u></u>__patch<br>
<br>
                Perfect!<br>
<br>
                0007-sanity-check.patch<br>
<br>
                +          // Check symmetricity.<br>
<br>
                "symmetricity" -> "symmetry".<br>
<br>
                0008-removed-unused-methods.__<u></u>patch<br>
<br>
                Perfect<br>
<br>
                0009-FnSet-replaced-with-__<u></u>FnTree.patch<br>
<br>
                Okay, I'm confused. We do all this work to produce a<br>
                3-way comparison,<br>
                then we wrap it up in an operator< for a std::set that<br>
                will only look at<br>
                less-than, but never uses the equals result. We want to<br>
                do one<br>
                comparison between two functions, not two (ie., std::set<br>
                does<br>
                func1<func2 and func2<func1 to see whether they're equal<br>
                or less than,<br>
                and we don't need that since cmp(func1, func2) tells us<br>
                directly). Maybe<br>
                this comes in a future patch?<br>
<br>
                One other question, I should probably know the answer to<br>
                this but it's<br>
                been a while. :) When we start detecting equal<br>
                functions, do we merge<br>
                them immediately? If so, can't that cause functions<br>
                already in the set<br>
                to have their equality comparisons change (ie., A calls<br>
                X, B calls Y, A<br>
                and B are in the set, then we merge X and Y for being<br>
                equal, now we've<br>
                changed A and B to both called the same merged function,<br>
                so they start<br>
                to compare the same). Sets tend not to like having their<br>
                comparison<br>
                functions change on elements that they contain.<br>
<br>
                0010-updated-comments.patch<br>
<br>
                Okay.<br>
<br>
                0011-removed-DenseMap-helpers.<u></u>__patch<br>
<br>
                Yay deleting dead code!<br>
<br>
                Nick<br>
<br>
<br>
                    Thanks!<br>
                    -Stepan.<br>
<br>
                    Stepan Dyatkovskiy wrote:<br>
<br>
                        ping<br>
                        Stepan Dyatkovskiy wrote:<br>
<br>
                            Hi Nick,<br>
<br>
                            Please find the fixed patches in attachment.<br>
                            Series starts from "constants comparison".<br>
<br>
                            Below small report of what has been fixed,<br>
                            and answers on your<br>
                            questions.<br>
<br>
                            cmpTypes:<br>
                             > Please add a comment for this method.<br>
                            Include the meaning of the<br>
                            returned value. ("man strcmp" for<br>
inspiration.)<br>
                            Fixed and committed. So you can look in<br>
                            trunk, may be I forgot to do<br>
                            something (hope not :-) )<br>
<br>
                            checkForLosslessBitcast, cmpConstants:<br>
                             > Why isn't this using cmpType?<br>
                            Fixed.<br>
<br>
                             > Please put the else on the same line as<br>
                            the closing brace.<br>
                            Fixed.<br>
<br>
                             >> + else if (const VectorType *thisPTy =<br>
                            dyn_cast<VectorType>(L))<br>
                             > Missing initial capital<br>
                            Sorry. Fixed. Actually these typos has been<br>
                            cloned from<br>
                            Type::canLosslesslyBitCastTo.<br>
<br>
                             >> + return cmpNumbers((uint64_t)L,<br>
                            (uint64_t)R);<br>
                             >><br>
                             > Please assert on unknown constant.<br>
                            That's possible. But what if we really got<br>
                            unknown constant? New<br>
                            comming<br>
                            constant types merely depends on<br>
                            MergeFunctions implementation. So we<br>
                            get crash if it will happen. And we loose<br>
                            nothing comparing them<br>
                            just as<br>
                            pointers. So, do we really need an<br>
                            assertion? Currently I kept it<br>
                            as it<br>
                            was. If your answer is "yes, we need it", it<br>
                            would easy to add it:<br>
<br>
                            case Value::FunctionVal:<br>
                            case Value::GlobalVariableVal:<br>
                            - case Value::GlobalAliasVal:<br>
                            - default: // Unknown constant<br>
                            - return cmpNumbers((uint64_t)L,<br>
(uint64_t)R);<br>
                            + case Value::GlobalAliasVal:<br>
                            + return cmpNumbers((uint64_t)L,<br>
(uint64_t)R);<br>
                            + default:<br>
                            + llvm_unreachable("Unknown constant.");<br>
<br>
                            About variable names: Var1, Var2 vs<br>
VarL,VarR.<br>
                            I think its better to use L/R concept. Easer<br>
                            to understand what to<br>
                            return (-1, or 1) when you see L/R rather<br>
                            than Var1/Var2.<br>
                            Var1/Var2 has been kept for functions that<br>
                            are to be removed till the<br>
                            end of patch series.<br>
                            F1 and F2 names were also kept till the "top<br>
                            level compare method"<br>
                            patch.<br>
<br>
                             > Alternatively, any reason not to have<br>
                            cmpPointers(const void *L,<br>
                            const void *R)?<br>
                            I could do it. I just wanted to show that<br>
                            pointers are compared<br>
                            like a<br>
                            numbers in some situations. Actually we do<br>
                            it in cases when we have<br>
                            absolutely no idea of smarter comparison.<br>
                            And back to cmpPonters. Its rather about<br>
                            what intuition tells. If<br>
                            pointers are equal as numbers, could they be<br>
                            different somehow? Could<br>
                            they be non-castable to numbers on some<br>
                            platforms? The last one is<br>
                            big<br>
                            trouble for DenseMapInfo implementation..<br>
                            If there is any (even very small)<br>
                            possibility of such cases - then<br>
                            yes,<br>
                            I vote for cmpPointers. The last word is up<br>
                            to you anyway.<br>
<br>
                            Attributes (0005):<br>
                             > Attributes already have operator< and<br>
                            operator==. Please reuse<br>
                            them.<br>
                            Fixed. I used simple "if":<br>
<br>
                            if (LA < RA)<br>
                            return -1;<br>
                            if (RA < LA)<br>
                            return 1;<br>
<br>
                            Though its possible to use:<br>
<br>
                            if (int Res = (LA < RA) ? -1 : (RA < LA) ? 1<br>
                            : 0)<br>
                            return Res;<br>
<br>
                            Which one is more preferable?<br>
<br>
                            cmpGEP (0007):<br>
                             > + int cmpGEP(const GEPOperator *GEP1,<br>
                            const GEPOperator *GEP2);<br>
                             > + int cmpGEP(const GetElementPtrInst<br>
*GEP1,<br>
                             > + const GetElementPtrInst *GEP2) {<br>
                             ><br>
                             > Const members?<br>
                            We can't make it constant, since it compares<br>
                            values (cmpValues, aka<br>
                            enumerate). So, if we see Value first time,<br>
                            we have to add it into<br>
                            sn_mapL/R.<br>
<br>
                             > I think you should put this under if (DL)<br>
                            { /* declare Offset1,<br>
                            Offset2, etc. */ }<br>
                            Fixed.<br>
<br>
                            About 0008:<br>
                             > Did you mean "cmpOperation"?<br>
                            Yes. We could keep it in one place. I have<br>
                            fixed this case and<br>
                            removed<br>
                            TODO.<br>
<br>
                             > "compare" still returns "bool". I'm going<br>
                            to assume this was<br>
                            meant to<br>
                            go in 0009.<br>
                            Yes.<br>
<br>
                            About 0011:<br>
                             > std::set is frowned upon, see<br>
<br>
<a href="http://llvm.org/docs/__ProgrammersManual.html#set" target="_blank">http://llvm.org/docs/__<u></u>ProgrammersManual.html#set</a><br>
<br>
<<a href="http://llvm.org/docs/ProgrammersManual.html#set" target="_blank">http://llvm.org/docs/<u></u>ProgrammersManual.html#set</a>><br>
                            . Do you actually<br>
                            rely<br>
                            on the stable iterator guarantee? Or would<br>
                            another set-like container<br>
                            be a<br>
                            better fit?<br>
                            Huh.. I have looked for alternatives.<br>
                            Unfortunately SmallSet is not<br>
                            suitable, since we need to lookup existing<br>
                            items, and SmallSet::find<br>
                            method is not present. May be I have missed<br>
                            something.<br>
                            Actually we need binary tree implementation.<br>
                            Do we have better<br>
                            analogue?<br>
                            I could implement new one. Think it should<br>
                            be just one more patch<br>
                            afterwards.<br>
<br>
                             >No. An identifier with underscore then<br>
                            capital letter is reserved<br>
                            for<br>
                            the implementation. You can just call them<br>
                            "F" and "DL", then the<br>
                            ctor<br>
                            initializer list can be "F(F), DL(DL)" and<br>
                            that will work fine.<br>
                            Fixed.<br>
<br>
                            0013:<br>
                             > Sorry, I'm not sure what this sentence<br>
                            means? The thing your<br>
                            example<br>
                            is talking about is showing is that two<br>
                            functions in an SCC may be<br>
                            equivalent, and detecting them requires<br>
                            "blinding" (ignoring) certain<br>
                            details of the function when you do the<br>
                            comparison. We blind<br>
                            ourselves<br>
                            to the pointee types, but not to callees of<br>
                            functions in the same<br>
                            SCC.<br>
<br>
                            I meant generalising of cross-reference case:<br>
                            in old implementation, while comparing F and<br>
                            G functions, we treat as<br>
                            equal case when F uses G, with case when G<br>
                            uses F (at the same<br>
                            place).<br>
                            It could happen that F uses G1, while G1<br>
                            uses G2, while G2 uses F.<br>
                            So it<br>
                            still the same. But currently we have to<br>
                            invent way how to detect<br>
                            such<br>
                            cases.<br>
<br>
                            Thanks for reviews!<br>
                            -Stepan<br>
<br>
                            Stepan Dyatkovskiy wrote:<br>
<br>
                                Hi Nick,<br>
                                I have committed 0001 as r203788.<br>
                                I'm working on fixes for 0002 - 0014.<br>
<br>
                                 > After reading through this patch<br>
                                series, I feel like I'm missing<br>
                                 > something important. Where's the sort<br>
                                function? It looks like<br>
                                we're<br>
                                 > still comparing all functions to all<br>
                                other functions.<br>
                                When you insert functions into std::set<br>
                                or its analo g s it does all<br>
                                the<br>
                                job for you. Since internally it builds<br>
                                a binary tree using "less"<br>
                                comparison, and each insert/look-up<br>
                                operation takes O(log(N)) time.<br>
<br>
                                -Stepan.<br>
<br>
                                Nick Lewycky wrote:<br>
<br>
                                    On 27 February 2014 08:23, Stepan<br>
                                    Dyatkovskiy <<a href="mailto:stpworld@narod.ru" target="_blank">stpworld@narod.ru</a><br>
                                    <mailto:<a href="mailto:stpworld@narod.ru" target="_blank">stpworld@narod.ru</a>><br>
                                    <mailto:<a href="mailto:stpworld@narod.ru" target="_blank">stpworld@narod.ru</a><br>
                                    <mailto:<a href="mailto:stpworld@narod.ru" target="_blank">stpworld@narod.ru</a>>>> wrote:<br>
<br>
                                    Hi Nick,<br>
<br>
                                    I tried to rework changes as you<br>
                                    requested. One of patches (0004<br>
                                    with extra assertions) has been<br>
removed.<br>
<br>
<br>
                                     > + bool isEquivalentType(Type<br>
                                    *Ty1, Type *Ty2) const {<br>
                                     > + return cmpType(Ty1, Ty2) == 0;<br>
                                     > + }<br>
                                     ><br>
                                     > Why do we still need<br>
                                    isEquivalentType? Can we nuke this?<br>
                                    Yup. After applying all the patches<br>
                                    isEquivalentType will be<br>
                                    totally<br>
                                    replaced with cmpType. All isEqXXXX<br>
                                    and friends will be removed in<br>
                                    0011 (old 0012). No traces left.<br>
                                    Old function wasn't removed in 0001<br>
                                    just for keeping patches<br>
                                    without<br>
                                    extra noise like:<br>
<br>
                                    - something that uses<br>
isEquivalentType<br>
                                    + something that uses cmpType<br>
<br>
                                    The point is, that "something" that<br>
                                    uses isEquivalentType, will be<br>
                                    also replaced with one of next<br>
                                    patches in this series.<br>
<br>
<br>
                                     ><br>
                                     > +static int cmpNumbers(uint64_t<br>
                                    L, uint64_t R) {<br>
                                     > + if (L < R) return -1;<br>
                                     > + if (L > R) return 1;<br>
                                     > + return 0;<br>
                                     > +}<br>
                                     ><br>
                                     > At a high level, you don't<br>
                                    actually need a <=> operator to use a<br>
                                    sort. A<br>
                                     > strict ordering ( < operator) is<br>
                                    sufficient. (Note that for<br>
                                    mergefunc, a<br>
                                     > strict weak ordering is not<br>
                                    sufficient, it must be a total<br>
                                    ordering.)<br>
                                    That could be done with int<br>
                                    FunctionComparator::compare(). We can<br>
                                    replace it with bool<br>
                                    FunctionComparator::less(). Though<br>
                                    for all<br>
                                    other cmp methods need at least 2<br>
                                    bits of information as result:<br>
                                    1. Whether things are equal.<br>
                                    2. Whether left less than right.<br>
<br>
                                    As for<br>
                                    FunctionComparator::compare(),<br>
                                    conversion into less() will<br>
                                    increase time of sanity check (patch<br>
                                    #0010).<br>
                                    Sanity check is just a sweet bonus.<br>
                                    It checks that ordering<br>
                                    implemented properly (checks order<br>
                                    relation properties).<br>
                                    Turning compare() into less() mean,<br>
                                    that we'll have to run<br>
                                    comparison two times: L.less(R) and<br>
                                    R.less(L). But may be sanity<br>
                                    check is not a thing to be published<br>
                                    at all.<br>
<br>
<br>
                                     ><br>
                                     > Consider hoisting this inside the<br>
                                    FunctionComparator class? That<br>
                                    class<br>
                                     > should have a bunch of<br>
                                    implementations of comparisons<br>
between<br>
                                    various<br>
                                     > different things, which can pass<br>
                                    down to other methods in the<br>
                                    same class.<br>
                                    In new patch series attached to this<br>
                                    post, I have moved all static<br>
                                    methods into FunctionComparator.<br>
<br>
<br>
                                     > + // Replacement for<br>
                                    type::canLosslesslyBitCastTo, that<br>
                                     > + // establish order relation on<br>
                                    this kind of properties.<br>
                                     > + int<br>
                                    checkForLosslessBitcast(const Type<br>
                                    *L, const Type *R);<br>
                                     ><br>
                                     > Type:: not type:: . Please make<br>
                                    this comment more descriptive.<br>
                                    Done.<br>
                                    [new comment]<br>
                                    Replacement for<br>
                                    Type::canLosslesslyBitCastTo, that<br>
<br>
                                    establish order relation on this<br>
                                    kind of properties<br>
                                    Returns 0, if L and R types could be<br>
                                    converted to each other<br>
                                    without<br>
                                    reinterpretation of bits.<br>
                                    Otherwise method returns -1 or 1,<br>
                                    defining total ordering between<br>
                                    types in context of lossless<br>
                                    bitcastability trait.<br>
                                    E.g.: if L is less than R (result is<br>
                                    -1), than every type that<br>
                                    could be<br>
                                    losslessly bitcasted to L is less<br>
                                    than R.<br>
                                    [/new comment]<br>
<br>
<br>
                                     ><br>
                                     > + /// Replacement for C1 ==<br>
                                    getBitCast(C2, C1Ty)<br>
                                     > + /// Its more controllable, and<br>
                                    supposed to be simpler and<br>
                                    more<br>
                                     > predictionable.<br>
                                     > + /// As very important<br>
                                    advantage: it allows to introduce<br>
order<br>
                                    relation on<br>
                                     > + /// constants set, and thus use<br>
                                    it as trait in refinement<br>
                                    routines.<br>
                                     ><br>
                                     > "Its" --> "It's".<br>
                                    "predictionable" --> "predictable".<br>
                                    And how is<br>
                                    it more<br>
                                     > predictable? I think this comment<br>
                                    would be better if it<br>
                                    described the<br>
                                     > function instead of making<br>
                                    comparisons between it and other<br>
                                    functions.<br>
                                     > Something like, "Compare<br>
                                    constants under a system where<br>
pointer<br>
                                    to X and<br>
                                     > pointer to Y are considered<br>
                                    equal" or whatever is actually true<br>
                                    here.<br>
                                    Done.<br>
                                    [new comment]<br>
<br>
                                    Replacement for C1 == getBitCast(C2,<br>
                                    C1Ty)<br>
                                    Parses constants contents, assuming<br>
                                    that types are losslessly<br>
                                    bitcasted between each other. So<br>
                                    actually it ignores types and only<br>
                                    compares bits from L and R.<br>
                                    Returns 0, if L and R has equivalent<br>
                                    content.<br>
                                    -1 or 1 if values are different.<br>
                                    Maintaining total ordering<br>
                                    requires<br>
                                    two values that indicates<br>
                                    non-equivalence (L less R, L<br>
greater R).<br>
                                    [/new comment]<br>
<br>
<br>
                                     ><br>
                                     > +enum ConstantType {<br>
                                     > I'm not sure that this buys you<br>
                                    much. All the "isa" tests can be<br>
                                    broken<br>
                                     > down into a switch on<br>
                                    getValueID() with the one<br>
exception of<br>
                                    isNullValue().<br>
                                    Done.<br>
<br>
<br>
                                     > + assert(<br>
                                     > +<br>
<br>
C1->getType()->____<u></u>canLosslesslyBitCastTo(C2->___<u></u>_getType())<br>
                                    &&<br>
                                     > + "Pass is healthless.<br>
                                    checkForLosslessBitcast should be<br>
                                    twin of "<br>
                                     > + "canLosslesslyBitCastTo method,<br>
                                    except the case when the<br>
                                    last one "<br>
                                     > + "returns false, the first one<br>
                                    should return -1 or 1");<br>
                                    ...<br>
<br>
                                     > I think we can skip the asserts<br>
                                    here. They aren't detecting a<br>
                                    specific<br>
                                     > bug, they're checking whether the<br>
                                    new code does a certain task<br>
                                    relative<br>
                                     > to the old code. Drop the old<br>
                                    code, your new code is the new<br>
                                    sheriff in<br>
                                     > town.<br>
                                    Asserts has been removed.<br>
<br>
<br>
                                     ><br>
                                     > + DenseMap<const Value*, int><br>
                                    sn_map1, sn_map2;<br>
                                     ><br>
                                     > What is sn short for ("seen")?<br>
                                    Why are there two of these?<br>
                                    Serial number :-)<br>
<br>
                                     ><br>
                                     > + std::pair<DenseMap<const Value<br>
                                    *, int>::iterator, bool><br>
                                     > + LeftSN =<br>
                                    sn_map1.insert(std::make_pair(<u></u>____V1,<br>
                                    sn_map1.size())),<br>
                                     > + RightSN =<br>
                                    sn_map2.insert(std::make_pair(<u></u>____V2,<br>
                                    sn_map2.size()));<br>
                                     ><br>
                                     > So I think I get it, this is easy<br>
                                    to reason about. You number<br>
                                    each value<br>
                                     > going down both functions. But I<br>
                                    think you can eliminate one of<br>
                                    these<br>
                                     > maps because you know that if the<br>
                                    left and right were ever<br>
                                    different<br>
                                     > then we terminate the comparison<br>
                                    immediately. You can at least<br>
                                    share one<br>
                                     > map with both V1 and V2, but I<br>
                                    think you can reduce the number<br>
                                    of map<br>
                                     > insertions.<br>
                                    Not sure. Consider that in left you<br>
                                    have:<br>
                                    %A = alloca i32<br>
                                    %B = alloca i32<br>
                                    store i32 1, i32* %A<br>
<br>
                                    And in right:<br>
                                    %A = alloca i32<br>
                                    %B = alloca i32<br>
                                    store i32 1, i32* %B<br>
<br>
                                    When we meet 'store' instruction we<br>
                                    need to check in which order %A<br>
                                    was allocated at left and in which<br>
                                    order %B was allocated at right.<br>
                                    So for each value in function we<br>
                                    have to assign serial number. Then<br>
                                    comparing to local values from<br>
                                    different functions we can just<br>
                                    check<br>
                                    their serial numbers.<br>
                                    Though, may be I missed something?<br>
                                    May be there is another way to<br>
                                    determine "who was first".<br>
<br>
<br>
                                     > High-level question, are<br>
                                    attributes ordered? Your<br>
comparison is<br>
                                    ordered.<br>
                                     > So if I have function F with<br>
                                    string attributes "sse2",<br>
                                    "gc=hemisphere"<br>
                                     > and another function G with<br>
                                    string attributes "gc=hemisphere",<br>
                                    "sse2"<br>
                                     > then they would be considered</blockquote></blockquote>
</blockquote></div><br></div></div>