[LLVMdev] MergeFunctions: reduce complexity to O(log(N))

Stepan Dyatkovskiy stpworld at narod.ru
Fri Jun 13 15:47:45 PDT 2014


ping
Stepan Dyatkovskiy wrote:
> Hi Nick,
>
> What about rest of patches? I rescanned remained patches, updated them
> (according to trunk). I have also added few more comments in 0010 about
> how finally we introduce binary tree and N*log(N) time.
> All the patches are reattached to this post.
>
> Thanks!
>
> -Stepan.
>
> Stepan Dyatkovskiy wrote:
>> Hi Nick,
>> Thanks for reviews!
>>
>> I have committed 0003 - 0005 as 208953,    208973 and 208976.
>>
>>  > +        // TODO: Already checked in cmpOps
>> Yup. I have fixed it with "Already checked in cmpOperation".
>> The reason I put this TODO, is because we don't need this check,
>> operation types were already checked in cmpOperation. But I didn't
>> remove this code, because its not mine (it was in old code as well), and
>> actually it is subject of additional cosmetic commit.
>>
>> Thanks!
>> -Stepan
>>
>> Nick Lewycky wrote:
>>> 0003:
>>>
>>> +  // Mostly cloned from BitcodeWriter, but simplified a bit.
>>>
>>> Please remove that comment, it isn't adding anything for the reader.
>>>
>>> And submit.
>>>
>>> 0004:
>>>
>>> I *really* like the comment block!! This is going to save developers so
>>> much time!
>>>
>>> No changes, please submit!
>>>
>>> 0005:
>>>
>>> Perfect. Please submit.
>>>
>>> 0006:
>>>
>>> +        // TODO: Already checked in cmpOps
>>>
>>> cmpOps?
>>>
>>> Nick
>>>
>>> On 15 May 2014 06:32, Stepan Dyatkovskiy <stpworld at narod.ru
>>> <mailto:stpworld at narod.ru>> wrote:
>>>
>>>     Hi Nick,
>>>     What about rest of patches? I have fixed 0004 and 0005, others seems
>>>     to be OK for you.. Can I commit them?
>>>
>>>     Thanks!
>>>     -Stepan
>>>
>>>     Stepan Dyatkovskiy wrote:
>>>
>>>         Hi Nick,
>>>         Thanks you for reviews. I have corrected for of comments from
>>> FAQ to
>>>         what seems to be regular in LLVM.
>>>         Patches has been committed as r208173 and r208189.
>>>
>>>         Thanks!
>>>         -Stepan
>>>
>>>         Nick Lewycky wrote:
>>>
>>>             On 29 April 2014 04:36, Stepan Dyatkovskiy
>>>             <stpworld at narod.ru <mailto:stpworld at narod.ru>
>>>             <mailto:stpworld at narod.ru <mailto:stpworld at narod.ru>>>
>>> wrote:
>>>
>>>                  Hi Nick.
>>>
>>>                  I have fixed and reattached all the patches from 0001
>>>             till 0011.
>>>
>>>                  0001, 0002:
>>>                  I have added detailed order descriptions to the methods
>>>             declaration.
>>>                  I have also fixed everything as you asked.
>>>
>>>
>>>             0001:
>>>
>>>             +    for (uint64_t i = 0; i < NumElementsL; ++i) {
>>>             vs.
>>>             +    for (unsigned i = 0, e = NumElementsL; i != e; ++i) {
>>>
>>>             This difference is necessary to avoid a warning? I don't
>>>             actually care
>>>             which form you use here given that NumElementsL is already
>>>             an integer
>>>             there's no recalculation we're avoiding like getting the end
>>>             iterator
>>>             each time, I just want it to look consistent. If it's
>>>             working around a
>>>             warning that's more important than consistent looks.
>>>
>>>             0001 looks good. Please commit it.
>>>
>>>             0002:
>>>
>>>             One note, you use a writing style of asking a question,
>>>             followed by
>>>             answering it. For instance, "Will we still get the same
>>>             numbering? As
>>>             follows from FunctionComparator::compare(), we do CFG walk,
>>>             ...". I
>>>             don't think it's worth your time to rewrite it now, but we
>>>             generally
>>>             don't use that construction in llvm's comments, instead
>>>             preferring to
>>>             explain what the code does an including an argument that
>>>             explains why
>>>             the approach is correct. This amounts to the same thing,
>>>             it's just a
>>>             different style.
>>>
>>>             +  ///          their functions. If right value was meet
>>>             first during
>>>             scanning,
>>>
>>>             "was meet" --> "was met"
>>>
>>>             Please commit it.
>>>
>>>             Nick
>>>
>>>
>>>                  0003, cmpStrings
>>>                  StringRef::compare performs lexicographical comparison,
>>>             and it isn't
>>>                  the fastest one.
>>>                  In case, when strings has different length, it still
>>>             compares their
>>>                  prefixes lexicographically.
>>>                  We need ordering among strings set, but we don't need
>>>             to compare
>>>                  strings contents lexicographically all the time. So we
>>>             can compare
>>>                  their sizes first, and do lexicographical comparison
>>>             only when we
>>>                  really need it: only when strings are equal in size.
>>>
>>>                  I have added short version of these explanations into
>>>             cmpStrings. So
>>>                  we could get really well documented methods based on
>>>             our posts ;-)
>>>                  Those were good questions and definitely the answers
>>>             should be in
>>>                  documentation somewhere.
>>>
>>>                  0004: I have added ordering description.
>>>                  0005: Fixed.
>>>                  0007: Fixed.
>>>                  0009:
>>>
>>>                   > Okay, I'm confused. We do all this work to produce a
>>>             3-way
>>>                  comparison,
>>>                   > then we wrap it up in an operator< for a std::set
>>>             that will only
>>>                  look at
>>>                   > less-than...
>>>                  Actually, on top level we don't need 3-way comparison.
>>>             But we do
>>>                  need it to determine whether functions are equal on
>>>             some stage, just
>>>                  to proceed to the next "less significant" stage.
>>>                  Consider two functions (where BB1 and BB11 are equal)
>>>
>>>                  "F1: {BB1, BB2}" and
>>>                  "F2: {BB11, BB3}".
>>>
>>>                  First we have to realize that BB1 and BB11 are equal.
>>>             And then
>>>                  determine which one of BB2 and BB3 is less than
>>> another.
>>>                  So finally we got 3-way comparison: less, equal and
>>>             greater (in fact
>>>                  just neither less nor equal).
>>>
>>>
>>>                   >  When we start detecting equal functions, do we
>>>             merge them
>>>                  immediately? If so, can't that cause functions already
>>>             in the set to
>>>                  have their equality comparisons change
>>>                  Yes we merge them immediately. And no, it won't cause
>>>             situation you
>>>                  have mentioned. Once we changed function X, we lookup
>>>             for functions
>>>                  that uses X, remove them from FnTree (former FnSet) and
>>>             put them
>>>                  into Deferred list. Then we fill up worklist with
>>>             Deferred contents
>>>                  and repeat comparison routine for these functions.
>>>
>>>
>>>                  Thanks for reviews!
>>>
>>>                  -Stepan
>>>
>>>                  Stepan Dyatkovskiy wrote:
>>>
>>>                      ping
>>>                      Stepan Dyatkovskiy wrote:
>>>
>>>                          Hi Nick,
>>>                          Thank you for reviews! Here is my answers for
>>>             first two
>>>             patches.
>>>
>>>                          Though as additional, I think its important
>>>             answer to this
>>>                          question
>>>                          first:
>>>                            > Okay, I'm confused. We do all this work to
>>>             produce a
>>>                          3-way comparison,
>>>                            > then we wrap it up in an operator< for a
>>>             std::set that
>>>                          will only
>>>                          look at
>>>                            > less-than...
>>>                          Actually, on top level we don't need 3-way
>>>             comparison. But
>>>                          we do need it
>>>                          to determine whether functions are equal on
>>>             some stage just
>>>                          to proceed
>>>                          to the next "less meaningful" stage.
>>>                          Consider two functions
>>>                          "F1: {BB1, BB2}" and
>>>                          "F2: {BB-equal-to-BB1, BB3}".
>>>                          First we have to realize that BB1 and
>>>             BB-equal-to-BB1 are
>>>                          equal. And
>>>                          then determine which one of BB2 and BB3 is
>>> less.
>>>
>>>                          First patch.
>>>
>>>                            > +  /// 1. Bitcastability.
>>>                            > +  /// Check whether L's type could be
>>>             losslessly
>>>                          bitcasted to R's
>>>                          type.
>>>                            > +  /// On this stage method, in case when
>>>             lossless
>>>                          bitcast is not
>>>                          possible
>>>                            > +  /// method returns -1 or 1, thus also
>>>             defining which
>>>                          type is
>>>                          greater in
>>>                            > +  /// context of bitcastability.
>>>                          I have added detailed order description for
>>>             this method. You
>>>                          also can
>>>                          find this description in
>>> cmpConstantsComment.txt.
>>>
>>>                            > Optional: "Return its pointers comparison
>>>             result" ->
>>>                          "Return the
>>>                          result
>>>                            > of comparing the types"?
>>>                          Oops! Fixed.
>>>
>>>                            > +    if (TyL->getTypeID() ==
>>> Type::X86_MMXTyID)
>>>                            > +      TyLWidth = 64;
>>>                            > +    else if (const VectorType *VecTyL =
>>>                          dyn_cast<VectorType>(TyL))
>>>                            > +      TyLWidth = VecTyL->getBitWidth();
>>>                            >
>>>                            > Yikes! We never merge function A using a
>>>             vector and
>>>                          function B
>>>                          using an
>>>                            > x86_mmx type together, right?
>>>                          Fixed!
>>>
>>>                            > "Nor of TyL and TyR" -> "Neither Tyl nor
>>> TyR".
>>>                            >
>>>                            > Optional: "Return its pointers comparison
>>>             result" ->
>>>                          "Return the
>>>                          result
>>>                            > of comparing the types"?
>>>                          Fixed.
>>>
>>>                            > +  case Value::UndefValueVal: return
>>>             cmpType(L->getType(),
>>>                          R->getType());
>>>                            >
>>>                            > You already have L->getType() and
>>>             R->getType() in TyL
>>>                          and TyR.
>>>                          Fixed. Sorry.
>>>
>>>                          Second patch.
>>>
>>>                            > I'm curious whether we guarantee that
>>>                            > we get the same numbering when comparing
>>>             two functions
>>>                          if I simply
>>>                            > reorder the BasicBlocks in one of the
>>>             functions, will we
>>>                          still assign
>>>                            > the same numberings or not?
>>>                          ...
>>>                            > Why not just start at the beginning and
>>> run to
>>>                            > the end, relying on the SSA dominance
>>> property.
>>>                          Good questions!
>>>
>>>                          For first one I have added these notes:
>>>                          [quote]
>>>                          As follows from FunctionComparator::compare(),
>>>             we do CFG
>>>                          walk, we start
>>>                          from entry, and then take each terminator. So
>>>             it doesn't
>>>                          matter how in
>>>                          fact BBs are ordered in function. And since
>>>             cmpValues are
>>>                          called during
>>>                          this walk, the numbering depends only on how
>>>             BBs located
>>>                          inside the CFG.
>>>                          [/quote]
>>>
>>>                          As for second question, I have added next
>>> notes:
>>>
>>>                          [quote]
>>>                          We also can't use dominance properties.
>>>             Consider situation:
>>>                          If we compare two instruction operands: first
>>>             is usage of
>>>                          local variable
>>>                          AL from function FL, and second is usage of
>>>             local variable
>>>                          AR from FR.
>>>                          We are still not able to compare operands of
>>>             PHI, since
>>>                          those could be
>>>                          operands from further BBs we didn't scan yet.
>>>                          [/quote]
>>>
>>>                          You can find comments also in
>>> sn_mapLRComment.txt
>>>                          I have also described lexicographical ordering
>>>             for cmpValues
>>>                          in method
>>>                          comments. I have also put them in separated txt
>>>             file:
>>>                          cmpValuesComment.txt.
>>>
>>>                            > +    if (L == R) return 0;
>>>                            > +    // Compare C1 and the bitcast result.
>>>                            > +    if (int Res = cmpConstants(ConstL,
>>>             ConstR))
>>>                            > +      return Res;
>>>                            > [...]
>>>                            > +    return 0;
>>>                            >
>>>                            > Isn't that just:
>>>                            >    return cmpConstants(ConstL, ConstR);
>>>                          OOOPS! Fixed.
>>>
>>>                          I'm working on rest of patches.
>>>                          Thanks!
>>>
>>>                          -Stepan.
>>>
>>>                          Nick Lewycky wrote:
>>>
>>>                              Stepan Dyatkovskiy wrote:
>>>
>>>                                  Hello Nick!
>>>
>>>                                  Can we proceed to work on patch series?
>>>
>>>
>>>                                  Just in case, I have reattached them to
>>>             this post.
>>>                                  Would you prefer to see them in
>>>             http://reviews.llvm.org ?
>>>
>>>
>>>                              Oddly, I prefer attachments, but we can use
>>>             reviews.llvm.org <http://reviews.llvm.org>
>>>             <http://reviews.llvm.org> if you
>>>                              prefer?
>>>
>>>                              0001-constants-comparison.____patch
>>>
>>>                              +  /// 1. Bitcastability.
>>>                              +  /// Check whether L's type could be
>>>             losslessly
>>>                              bitcasted to R's type.
>>>                              +  /// On this stage method, in case when
>>>             lossless
>>>                              bitcast is not
>>>                              possible
>>>                              +  /// method returns -1 or 1, thus also
>>>             defining which
>>>                              type is
>>>                              greater in
>>>                              +  /// context of bitcastability.
>>>
>>>                              Looking at the code, it looks like the
>>>             order is:
>>>                                  1. first class types, by typeid
>>>                                  2. vectors (by element size? by number
>>>             of elements?
>>>                              the two
>>>                              multiplied together?)
>>>
>>>                              +      // Nor of TyL and TyR are values of
>>>             first class
>>>                              type. Return
>>>                              +      // its pointers comparison result.
>>>
>>>                              "Nor of TyL and TyR" -> "Neither Tyl nor
>>> TyR".
>>>
>>>                              Optional: "Return its pointers comparison
>>>             result" ->
>>>                              "Return the result
>>>                              of comparing the types"?
>>>
>>>                              +    if (TyL->getTypeID() ==
>>> Type::X86_MMXTyID)
>>>                              +      TyLWidth = 64;
>>>                              +    else if (const VectorType *VecTyL =
>>>                              dyn_cast<VectorType>(TyL))
>>>                              +      TyLWidth = VecTyL->getBitWidth();
>>>
>>>                              Yikes! We never merge function A using a
>>>             vector and
>>>                              function B using an
>>>                              x86_mmx type together, right? Note what the
>>>             LangRef says
>>>                              about x86_mmx:
>>>                              "The operations allowed on it are quite
>>>             limited:
>>>                              parameters and return
>>>                              values, load and store, and bitcast."
>>>
>>>                              +  case Value::UndefValueVal: return
>>>             cmpType(L->getType(),
>>>                              R->getType());
>>>
>>>                              You already have L->getType() and
>>>             R->getType() in TyL
>>>                              and TyR. You also
>>>                              already have cmpType(TyL, TyR) in TypesRes.
>>>             Please just
>>>                              reuse them in
>>>                              the rest of this function.
>>>
>>>
>>>
>>> 0002-values-comparison-____replacement-for-enumerate.____patch
>>>
>>>                              +  // Assign serial numbers to values
>>> from left
>>>                              function, and values
>>>                              from
>>>                              +  // right function.
>>>                              +  // Explanation:
>>>                              +  // Being comparing functions we need to
>>>             compare
>>>                              values we meet at
>>>                              left and
>>>                              +  // right sides.
>>>                              +  // Its easy to sort things out for
>>>             external values.
>>>                              It just should be
>>>                              +  // the same value at left and right.
>>>                              +  // But for local values (those were
>>>             introduced inside
>>>                              function body)
>>>                              +  // we have to ensure they were
>>>             introduced at exactly
>>>                              the same place,
>>>                              +  // and plays the same role. How to do
>>> it?
>>>                              +  // Just assign serial number to each
>>>             value when we
>>>                              meet it first
>>>                              time.
>>>                              +  // Values that were met at same place
>>>             will be with
>>>                              same serial
>>>                              numbers.
>>>
>>>                              This explanation is correct, but I'm
>>>             curious whether we
>>>                              guarantee that
>>>                              we get the same numbering when comparing
>>>             two functions
>>>                              if I simply
>>>                              reorder the BasicBlocks in one of the
>>>             functions, will we
>>>                              still assign
>>>                              the same numberings or not?
>>>
>>>                              Also, it's not clear from the comment why
>>>             we need the
>>>                              complexity of
>>>                              numbering them anyways. Why not just start
>>>             at the
>>>                              beginning and run to
>>>                              the end, relying on the SSA dominance
>>>             property to
>>>                              guarantee that we
>>>                              visit definitions before uses, and assume
>>>             we'll visit
>>>                              each value in the
>>>                              same order (look at each instruction and
>>>             operand in
>>>                              order and just
>>>                              complain about differences)? The answer is
>>>             that PHI
>>>                              nodes mean we will
>>>                              visit some uses before some definitions.
>>>
>>>                              Anyways, changing this text is optional in
>>>             my mind. I
>>>                              think it could be
>>>                              better, but it's not worth spending much
>>>             time on it.
>>>
>>>                              +int
>>>             FunctionComparator::cmpValues(____const Value *L,
>>>                              const Value *R) {
>>>
>>>                              I wouldn't mind an explanation of the
>>>             lexographic
>>>                              ordering on this one
>>>                              either. Constants before non-constants,
>>>             then InlineAsm
>>>                              (ordered by
>>>                              address), then visit order.
>>>
>>>                              +    if (L == R) return 0;
>>>                              +    // Compare C1 and the bitcast result.
>>>                              +    if (int Res = cmpConstants(ConstL,
>>>             ConstR))
>>>                              +      return Res;
>>>                              [...]
>>>                              +    return 0;
>>>
>>>                              Isn't that just:
>>>                                  return cmpConstants(ConstL, ConstR);
>>>                              ?
>>>
>>>                              0003-attributes-comparison.____patch
>>>
>>>                              +int
>>>             FunctionComparator::____cmpStrings(StringRef L,
>>>                              StringRef R) const {
>>>                              +  // Prevent heavy comparison, compare
>>>             sizes first.
>>>                              +  if (int Res = cmpNumbers(L.size(),
>>>             R.size()))
>>>                              +    return Res;
>>>                              +
>>>                              +  return L.compare(R);
>>>                              +}
>>>
>>>                              Optional: I'm curious whether that's
>>>             actually worth it?
>>>                              I checked
>>>                              StringRef and it does the length comparison
>>>             last. It
>>>                              could be that you
>>>                              know that your strings tend to be longer
>>>             than StringRef
>>>                              which tries to
>>>                              be efficient in general.
>>>
>>>                              0004-operations-comparison.____patch
>>>
>>>                              +int
>>> FunctionComparator::____cmpOperation(const
>>>                              Instruction *L,
>>>
>>>                              Optional: Again, a description of the
>>>             ordering would be
>>>                              interesting.
>>>                              Opcode number, then number of operands,
>>>             then type, then
>>>                              'nsw/nuw/exact/volatile'.
>>>
>>>                              0005-GEP-comparison.patch
>>>
>>>                              +int FunctionComparator::cmpGEP(____const
>>>             GEPOperator
>>>             *GEPL,
>>>                              +                               const
>>>             GEPOperator
>>>             *GEPR) {
>>>
>>>                              +  unsigned int ASL =
>>>             GEPL->getPointerAddressSpace()____;
>>>
>>>                              Extra line after {
>>>
>>>                                   if (DL) {
>>>                              [...]
>>>                              +    unsigned BitWidth = DL ?
>>>                              DL->getPointerSizeInBits(ASL) : 1;
>>>
>>>                              BitWidth doesn't need to test DL for
>>> null-ness.
>>>
>>>                              0006-top-level-compare-method.____patch
>>>
>>>                              Perfect!
>>>
>>>                              0007-sanity-check.patch
>>>
>>>                              +          // Check symmetricity.
>>>
>>>                              "symmetricity" -> "symmetry".
>>>
>>>                              0008-removed-unused-methods.____patch
>>>
>>>                              Perfect
>>>
>>>                              0009-FnSet-replaced-with-____FnTree.patch
>>>
>>>                              Okay, I'm confused. We do all this work to
>>>             produce a
>>>                              3-way comparison,
>>>                              then we wrap it up in an operator< for a
>>>             std::set that
>>>                              will only look at
>>>                              less-than, but never uses the equals
>>>             result. We want to
>>>                              do one
>>>                              comparison between two functions, not two
>>>             (ie., std::set
>>>                              does
>>>                              func1<func2 and func2<func1 to see whether
>>>             they're equal
>>>                              or less than,
>>>                              and we don't need that since cmp(func1,
>>>             func2) tells us
>>>                              directly). Maybe
>>>                              this comes in a future patch?
>>>
>>>                              One other question, I should probably know
>>>             the answer to
>>>                              this but it's
>>>                              been a while. :) When we start detecting
>>> equal
>>>                              functions, do we merge
>>>                              them immediately? If so, can't that cause
>>>             functions
>>>                              already in the set
>>>                              to have their equality comparisons change
>>>             (ie., A calls
>>>                              X, B calls Y, A
>>>                              and B are in the set, then we merge X and Y
>>>             for being
>>>                              equal, now we've
>>>                              changed A and B to both called the same
>>>             merged function,
>>>                              so they start
>>>                              to compare the same). Sets tend not to like
>>>             having their
>>>                              comparison
>>>                              functions change on elements that they
>>> contain.
>>>
>>>                              0010-updated-comments.patch
>>>
>>>                              Okay.
>>>
>>>                              0011-removed-DenseMap-helpers.____patch
>>>
>>>                              Yay deleting dead code!
>>>
>>>                              Nick
>>>
>>>
>>>                                  Thanks!
>>>                                  -Stepan.
>>>
>>>                                  Stepan Dyatkovskiy wrote:
>>>
>>>                                      ping
>>>                                      Stepan Dyatkovskiy wrote:
>>>
>>>                                          Hi Nick,
>>>
>>>                                          Please find the fixed patches
>>>             in attachment.
>>>                                          Series starts from "constants
>>>             comparison".
>>>
>>>                                          Below small report of what has
>>>             been fixed,
>>>                                          and answers on your
>>>                                          questions.
>>>
>>>                                          cmpTypes:
>>>                                           > Please add a comment for
>>>             this method.
>>>                                          Include the meaning of the
>>>                                          returned value. ("man strcmp"
>>> for
>>>             inspiration.)
>>>                                          Fixed and committed. So you can
>>>             look in
>>>                                          trunk, may be I forgot to do
>>>                                          something (hope not :-) )
>>>
>>>                                          checkForLosslessBitcast,
>>>             cmpConstants:
>>>                                           > Why isn't this using
>>> cmpType?
>>>                                          Fixed.
>>>
>>>                                           > Please put the else on the
>>>             same line as
>>>                                          the closing brace.
>>>                                          Fixed.
>>>
>>>                                           >> + else if (const VectorType
>>>             *thisPTy =
>>>                                          dyn_cast<VectorType>(L))
>>>                                           > Missing initial capital
>>>                                          Sorry. Fixed. Actually these
>>>             typos has been
>>>                                          cloned from
>>>                                          Type::canLosslesslyBitCastTo.
>>>
>>>                                           >> + return
>>>             cmpNumbers((uint64_t)L,
>>>                                          (uint64_t)R);
>>>                                           >>
>>>                                           > Please assert on unknown
>>>             constant.
>>>                                          That's possible. But what if we
>>>             really got
>>>                                          unknown constant? New
>>>                                          comming
>>>                                          constant types merely
>>> depends on
>>>                                          MergeFunctions implementation.
>>>             So we
>>>                                          get crash if it will happen.
>>>             And we loose
>>>                                          nothing comparing them
>>>                                          just as
>>>                                          pointers. So, do we really
>>> need an
>>>                                          assertion? Currently I kept it
>>>                                          as it
>>>                                          was. If your answer is "yes, we
>>>             need it", it
>>>                                          would easy to add it:
>>>
>>>                                          case Value::FunctionVal:
>>>                                          case Value::GlobalVariableVal:
>>>                                          - case Value::GlobalAliasVal:
>>>                                          - default: // Unknown constant
>>>                                          - return
>>> cmpNumbers((uint64_t)L,
>>>             (uint64_t)R);
>>>                                          + case Value::GlobalAliasVal:
>>>                                          + return
>>> cmpNumbers((uint64_t)L,
>>>             (uint64_t)R);
>>>                                          + default:
>>>                                          + llvm_unreachable("Unknown
>>>             constant.");
>>>
>>>                                          About variable names: Var1,
>>> Var2 vs
>>>             VarL,VarR.
>>>                                          I think its better to use L/R
>>>             concept. Easer
>>>                                          to understand what to
>>>                                          return (-1, or 1) when you see
>>>             L/R rather
>>>                                          than Var1/Var2.
>>>                                          Var1/Var2 has been kept for
>>>             functions that
>>>                                          are to be removed till the
>>>                                          end of patch series.
>>>                                          F1 and F2 names were also kept
>>>             till the "top
>>>                                          level compare method"
>>>                                          patch.
>>>
>>>                                           > Alternatively, any reason
>>>             not to have
>>>                                          cmpPointers(const void *L,
>>>                                          const void *R)?
>>>                                          I could do it. I just wanted to
>>>             show that
>>>                                          pointers are compared
>>>                                          like a
>>>                                          numbers in some situations.
>>>             Actually we do
>>>                                          it in cases when we have
>>>                                          absolutely no idea of smarter
>>>             comparison.
>>>                                          And back to cmpPonters. Its
>>>             rather about
>>>                                          what intuition tells. If
>>>                                          pointers are equal as numbers,
>>>             could they be
>>>                                          different somehow? Could
>>>                                          they be non-castable to numbers
>>>             on some
>>>                                          platforms? The last one is
>>>                                          big
>>>                                          trouble for DenseMapInfo
>>>             implementation..
>>>                                          If there is any (even very
>>> small)
>>>                                          possibility of such cases -
>>> then
>>>                                          yes,
>>>                                          I vote for cmpPointers. The
>>>             last word is up
>>>                                          to you anyway.
>>>
>>>                                          Attributes (0005):
>>>                                           > Attributes already have
>>>             operator< and
>>>                                          operator==. Please reuse
>>>                                          them.
>>>                                          Fixed. I used simple "if":
>>>
>>>                                          if (LA < RA)
>>>                                          return -1;
>>>                                          if (RA < LA)
>>>                                          return 1;
>>>
>>>                                          Though its possible to use:
>>>
>>>                                          if (int Res = (LA < RA) ? -1 :
>>>             (RA < LA) ? 1
>>>                                          : 0)
>>>                                          return Res;
>>>
>>>                                          Which one is more preferable?
>>>
>>>                                          cmpGEP (0007):
>>>                                           > + int cmpGEP(const
>>>             GEPOperator *GEP1,
>>>                                          const GEPOperator *GEP2);
>>>                                           > + int cmpGEP(const
>>>             GetElementPtrInst
>>>             *GEP1,
>>>                                           > + const GetElementPtrInst
>>>             *GEP2) {
>>>                                           >
>>>                                           > Const members?
>>>                                          We can't make it constant,
>>>             since it compares
>>>                                          values (cmpValues, aka
>>>                                          enumerate). So, if we see Value
>>>             first time,
>>>                                          we have to add it into
>>>                                          sn_mapL/R.
>>>
>>>                                           > I think you should put this
>>>             under if (DL)
>>>                                          { /* declare Offset1,
>>>                                          Offset2, etc. */ }
>>>                                          Fixed.
>>>
>>>                                          About 0008:
>>>                                           > Did you mean "cmpOperation"?
>>>                                          Yes. We could keep it in one
>>>             place. I have
>>>                                          fixed this case and
>>>                                          removed
>>>                                          TODO.
>>>
>>>                                           > "compare" still returns
>>>             "bool". I'm going
>>>                                          to assume this was
>>>                                          meant to
>>>                                          go in 0009.
>>>                                          Yes.
>>>
>>>                                          About 0011:
>>>                                           > std::set is frowned upon,
>>> see
>>>
>>>             http://llvm.org/docs/____ProgrammersManual.html#set
>>>             <http://llvm.org/docs/__ProgrammersManual.html#set>
>>>
>>>             <http://llvm.org/docs/__ProgrammersManual.html#set
>>>             <http://llvm.org/docs/ProgrammersManual.html#set>>
>>>                                          . Do you actually
>>>                                          rely
>>>                                          on the stable iterator
>>>             guarantee? Or would
>>>                                          another set-like container
>>>                                          be a
>>>                                          better fit?
>>>                                          Huh.. I have looked for
>>>             alternatives.
>>>                                          Unfortunately SmallSet is not
>>>                                          suitable, since we need to
>>>             lookup existing
>>>                                          items, and SmallSet::find
>>>                                          method is not present. May be I
>>>             have missed
>>>                                          something.
>>>                                          Actually we need binary tree
>>>             implementation.
>>>                                          Do we have better
>>>                                          analogue?
>>>                                          I could implement new one.
>>>             Think it should
>>>                                          be just one more patch
>>>                                          afterwards.
>>>
>>>                                           >No. An identifier with
>>>             underscore then
>>>                                          capital letter is reserved
>>>                                          for
>>>                                          the implementation. You can
>>>             just call them
>>>                                          "F" and "DL", then the
>>>                                          ctor
>>>                                          initializer list can be "F(F),
>>>             DL(DL)" and
>>>                                          that will work fine.
>>>                                          Fixed.
>>>
>>>                                          0013:
>>>                                           > Sorry, I'm not sure what
>>>             this sentence
>>>                                          means? The thing your
>>>                                          example
>>>                                          is talking about is showing is
>>>             that two
>>>                                          functions in an SCC may be
>>>                                          equivalent, and detecting them
>>>             requires
>>>                                          "blinding" (ignoring) certain
>>>                                          details of the function when
>>>             you do the
>>>                                          comparison. We blind
>>>                                          ourselves
>>>                                          to the pointee types, but not
>>>             to callees of
>>>                                          functions in the same
>>>                                          SCC.
>>>
>>>                                          I meant generalising of
>>>             cross-reference case:
>>>                                          in old implementation, while
>>>             comparing F and
>>>                                          G functions, we treat as
>>>                                          equal case when F uses G, with
>>>             case when G
>>>                                          uses F (at the same
>>>                                          place).
>>>                                          It could happen that F uses G1,
>>>             while G1
>>>                                          uses G2, while G2 uses F.
>>>                                          So it
>>>                                          still the same. But currently
>>>             we have to
>>>                                          invent way how to detect
>>>                                          such
>>>                                          cases.
>>>
>>>                                          Thanks for reviews!
>>>                                          -Stepan
>>>
>>>                                          Stepan Dyatkovskiy wrote:
>>>
>>>                                              Hi Nick,
>>>                                              I have committed 0001 as
>>>             r203788.
>>>                                              I'm working on fixes for
>>>             0002 - 0014.
>>>
>>>                                               > After reading through
>>>             this patch
>>>                                              series, I feel like I'm
>>> missing
>>>                                               > something important.
>>>             Where's the sort
>>>                                              function? It looks like
>>>                                              we're
>>>                                               > still comparing all
>>>             functions to all
>>>                                              other functions.
>>>                                              When you insert functions
>>>             into std::set
>>>                                              or its analo g s it does
>>> all
>>>                                              the
>>>                                              job for you. Since
>>>             internally it builds
>>>                                              a binary tree using "less"
>>>                                              comparison, and each
>>>             insert/look-up
>>>                                              operation takes O(log(N))
>>> time.
>>>
>>>                                              -Stepan.
>>>
>>>                                              Nick Lewycky wrote:
>>>
>>>                                                  On 27 February 2014
>>>             08:23, Stepan
>>>                                                  Dyatkovskiy
>>>             <stpworld at narod.ru <mailto:stpworld at narod.ru>
>>>
>>>             <mailto:stpworld at narod.ru <mailto:stpworld at narod.ru>>
>>>
>>>             <mailto:stpworld at narod.ru <mailto:stpworld at narod.ru>
>>>
>>>             <mailto:stpworld at narod.ru <mailto:stpworld at narod.ru>>>>
>>> wrote:
>>>
>>>                                                  Hi Nick,
>>>
>>>                                                  I tried to rework
>>>             changes as you
>>>                                                  requested. One of
>>>             patches (0004
>>>                                                  with extra assertions)
>>>             has been
>>>             removed.
>>>
>>>
>>>                                                   > + bool
>>>             isEquivalentType(Type
>>>                                                  *Ty1, Type *Ty2)
>>> const {
>>>                                                   > + return
>>>             cmpType(Ty1, Ty2) == 0;
>>>                                                   > + }
>>>                                                   >
>>>                                                   > Why do we still need
>>>                                                  isEquivalentType? Can
>>>             we nuke this?
>>>                                                  Yup. After applying all
>>>             the patches
>>>                                                  isEquivalentType
>>> will be
>>>                                                  totally
>>>                                                  replaced with cmpType.
>>>             All isEqXXXX
>>>                                                  and friends will be
>>>             removed in
>>>                                                  0011 (old 0012). No
>>>             traces left.
>>>                                                  Old function wasn't
>>>             removed in 0001
>>>                                                  just for keeping
>>> patches
>>>                                                  without
>>>                                                  extra noise like:
>>>
>>>                                                  - something that uses
>>>             isEquivalentType
>>>                                                  + something that uses
>>>             cmpType
>>>
>>>                                                  The point is, that
>>>             "something" that
>>>                                                  uses isEquivalentType,
>>>             will be
>>>                                                  also replaced with one
>>>             of next
>>>                                                  patches in this series.
>>>
>>>
>>>                                                   >
>>>                                                   > +static int
>>>             cmpNumbers(uint64_t
>>>                                                  L, uint64_t R) {
>>>                                                   > + if (L < R)
>>> return -1;
>>>                                                   > + if (L > R)
>>> return 1;
>>>                                                   > + return 0;
>>>                                                   > +}
>>>                                                   >
>>>                                                   > At a high level, you
>>>             don't
>>>                                                  actually need a <=>
>>>             operator to use a
>>>                                                  sort. A
>>>                                                   > strict ordering ( <
>>>             operator) is
>>>                                                  sufficient. (Note
>>> that for
>>>                                                  mergefunc, a
>>>                                                   > strict weak ordering
>>>             is not
>>>                                                  sufficient, it must be
>>>             a total
>>>                                                  ordering.)
>>>                                                  That could be done
>>> with int
>>>
>>>             FunctionComparator::compare(). We can
>>>                                                  replace it with bool
>>>
>>>             FunctionComparator::less(). Though
>>>                                                  for all
>>>                                                  other cmp methods need
>>>             at least 2
>>>                                                  bits of information as
>>>             result:
>>>                                                  1. Whether things are
>>>             equal.
>>>                                                  2. Whether left less
>>>             than right.
>>>
>>>                                                  As for
>>>
>>>             FunctionComparator::compare(),
>>>                                                  conversion into
>>> less() will
>>>                                                  increase time of sanity
>>>             check (patch
>>>                                                  #0010).
>>>                                                  Sanity check is just a
>>>             sweet bonus.
>>>                                                  It checks that ordering
>>>                                                  implemented properly
>>>             (checks order
>>>                                                  relation properties).
>>>                                                  Turning compare() into
>>>             less() mean,
>>>                                                  that we'll have to run
>>>                                                  comparison two times:
>>>             L.less(R) and
>>>                                                  R.less(L). But may be
>>>             sanity
>>>                                                  check is not a thing to
>>>             be published
>>>                                                  at all.
>>>
>>>
>>>                                                   >
>>>                                                   > Consider hoisting
>>>             this inside the
>>>                                                  FunctionComparator
>>>             class? That
>>>                                                  class
>>>                                                   > should have a
>>> bunch of
>>>                                                  implementations of
>>>             comparisons
>>>             between
>>>                                                  various
>>>                                                   > different things,
>>>             which can pass
>>>                                                  down to other methods
>>>             in the
>>>                                                  same class.
>>>                                                  In new patch series
>>>             attached to this
>>>                                                  post, I have moved all
>>>             static
>>>                                                  methods into
>>>             FunctionComparator.
>>>
>>>
>>>                                                   > + // Replacement for
>>>
>>>             type::canLosslesslyBitCastTo, that
>>>                                                   > + // establish order
>>>             relation on
>>>                                                  this kind of
>>> properties.
>>>                                                   > + int
>>>
>>>             checkForLosslessBitcast(const Type
>>>                                                  *L, const Type *R);
>>>                                                   >
>>>                                                   > Type:: not type:: .
>>>             Please make
>>>                                                  this comment more
>>>             descriptive.
>>>                                                  Done.
>>>                                                  [new comment]
>>>                                                  Replacement for
>>>
>>>             Type::canLosslesslyBitCastTo, that
>>>
>>>                                                  establish order
>>>             relation on this
>>>                                                  kind of properties
>>>                                                  Returns 0, if L and R
>>>             types could be
>>>                                                  converted to each other
>>>                                                  without
>>>                                                  reinterpretation of
>>> bits.
>>>                                                  Otherwise method
>>>             returns -1 or 1,
>>>                                                  defining total ordering
>>>             between
>>>                                                  types in context of
>>>             lossless
>>>                                                  bitcastability trait.
>>>                                                  E.g.: if L is less than
>>>             R (result is
>>>                                                  -1), than every type
>>> that
>>>                                                  could be
>>>                                                  losslessly bitcasted to
>>>             L is less
>>>                                                  than R.
>>>                                                  [/new comment]
>>>
>>>
>>>                                                   >
>>>                                                   > + /// Replacement
>>>             for C1 ==
>>>                                                  getBitCast(C2, C1Ty)
>>>                                                   > + /// Its more
>>>             controllable, and
>>>                                                  supposed to be
>>> simpler and
>>>                                                  more
>>>                                                   > predictionable.
>>>                                                   > + /// As very
>>> important
>>>                                                  advantage: it allows to
>>>             introduce
>>>             order
>>>                                                  relation on
>>>                                                   > + /// constants set,
>>>             and thus use
>>>                                                  it as trait in
>>> refinement
>>>                                                  routines.
>>>                                                   >
>>>                                                   > "Its" --> "It's".
>>>                                                  "predictionable" -->
>>>             "predictable".
>>>                                                  And how is
>>>                                                  it more
>>>                                                   > predictable? I think
>>>             this comment
>>>                                                  would be better if it
>>>                                                  described the
>>>                                                   > function instead of
>>>             making
>>>                                                  comparisons between it
>>>             and other
>>>                                                  functions.
>>>                                                   > Something like,
>>> "Compare
>>>                                                  constants under a
>>>             system where
>>>             pointer
>>>                                                  to X and
>>>                                                   > pointer to Y are
>>>             considered
>>>                                                  equal" or whatever is
>>>             actually true
>>>                                                  here.
>>>                                                  Done.
>>>                                                  [new comment]
>>>
>>>                                                  Replacement for C1 ==
>>>             getBitCast(C2,
>>>                                                  C1Ty)
>>>                                                  Parses constants
>>>             contents, assuming
>>>                                                  that types are
>>> losslessly
>>>                                                  bitcasted between each
>>>             other. So
>>>                                                  actually it ignores
>>>             types and only
>>>                                                  compares bits from L
>>> and R.
>>>                                                  Returns 0, if L and R
>>>             has equivalent
>>>                                                  content.
>>>                                                  -1 or 1 if values are
>>>             different.
>>>                                                  Maintaining total
>>> ordering
>>>                                                  requires
>>>                                                  two values that
>>> indicates
>>>                                                  non-equivalence (L less
>>>             R, L
>>>             greater R).
>>>                                                  [/new comment]
>>>
>>>
>>>                                                   >
>>>                                                   > +enum ConstantType {
>>>                                                   > I'm not sure that
>>>             this buys you
>>>                                                  much. All the "isa"
>>>             tests can be
>>>                                                  broken
>>>                                                   > down into a
>>> switch on
>>>                                                  getValueID() with the
>>> one
>>>             exception of
>>>                                                  isNullValue().
>>>                                                  Done.
>>>
>>>
>>>                                                   > + assert(
>>>                                                   > +
>>>
>>>
>>> C1->getType()->______canLosslesslyBitCastTo(C2->______getType())
>>>                                                  &&
>>>                                                   > + "Pass is
>>> healthless.
>>>                                                  checkForLosslessBitcast
>>>             should be
>>>                                                  twin of "
>>>                                                   > +
>>>             "canLosslesslyBitCastTo method,
>>>                                                  except the case when
>>> the
>>>                                                  last one "
>>>                                                   > + "returns false,
>>>             the first one
>>>                                                  should return -1 or
>>> 1");
>>>                                                  ...
>>>
>>>                                                   > I think we can skip
>>>             the asserts
>>>                                                  here. They aren't
>>>             detecting a
>>>                                                  specific
>>>                                                   > bug, they're
>>>             checking whether the
>>>                                                  new code does a certain
>>>             task
>>>                                                  relative
>>>                                                   > to the old code.
>>>             Drop the old
>>>                                                  code, your new code is
>>>             the new
>>>                                                  sheriff in
>>>                                                   > town.
>>>                                                  Asserts has been
>>> removed.
>>>
>>>
>>>                                                   >
>>>                                                   > + DenseMap<const
>>>             Value*, int>
>>>                                                  sn_map1, sn_map2;
>>>                                                   >
>>>                                                   > What is sn short for
>>>             ("seen")?
>>>                                                  Why are there two of
>>> these?
>>>                                                  Serial number :-)
>>>
>>>                                                   >
>>>                                                   > +
>>>             std::pair<DenseMap<const Value
>>>                                                  *, int>::iterator,
>>> bool>
>>>                                                   > + LeftSN =
>>>
>>>             sn_map1.insert(std::make_pair(______V1,
>>>                                                  sn_map1.size())),
>>>                                                   > + RightSN =
>>>
>>>             sn_map2.insert(std::make_pair(______V2,
>>>                                                  sn_map2.size()));
>>>                                                   >
>>>                                                   > So I think I get it,
>>>             this is easy
>>>                                                  to reason about. You
>>> number
>>>                                                  each value
>>>                                                   > going down both
>>>             functions. But I
>>>                                                  think you can eliminate
>>>             one of
>>>                                                  these
>>>                                                   > maps because you
>>>             know that if the
>>>                                                  left and right were
>>> ever
>>>                                                  different
>>>                                                   > then we terminate
>>>             the comparison
>>>                                                  immediately. You can at
>>>             least
>>>                                                  share one
>>>                                                   > map with both V1 and
>>>             V2, but I
>>>                                                  think you can reduce
>>>             the number
>>>                                                  of map
>>>                                                   > insertions.
>>>                                                  Not sure. Consider that
>>>             in left you
>>>                                                  have:
>>>                                                  %A = alloca i32
>>>                                                  %B = alloca i32
>>>                                                  store i32 1, i32* %A
>>>
>>>                                                  And in right:
>>>                                                  %A = alloca i32
>>>                                                  %B = alloca i32
>>>                                                  store i32 1, i32* %B
>>>
>>>                                                  When we meet 'store'
>>>             instruction we
>>>                                                  need to check in which
>>>             order %A
>>>                                                  was allocated at left
>>>             and in which
>>>                                                  order %B was allocated
>>>             at right.
>>>                                                  So for each value in
>>>             function we
>>>                                                  have to assign serial
>>>             number. Then
>>>                                                  comparing to local
>>>             values from
>>>                                                  different functions we
>>>             can just
>>>                                                  check
>>>                                                  their serial numbers.
>>>                                                  Though, may be I missed
>>>             something?
>>>                                                  May be there is another
>>>             way to
>>>                                                  determine "who was
>>> first".
>>>
>>>
>>>                                                   > High-level
>>> question, are
>>>                                                  attributes ordered?
>>> Your
>>>             comparison is
>>>                                                  ordered.
>>>                                                   > So if I have
>>>             function F with
>>>                                                  string attributes
>>> "sse2",
>>>                                                  "gc=hemisphere"
>>>                                                   > and another function
>>>             G with
>>>                                                  string attributes
>>>             "gc=hemisphere",
>>>                                                  "sse2"
>>>                                                   > then they would be
>>>             considered
>>>
>>>
>>
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>




More information about the llvm-commits mailing list