[llvm] r222935 - Simplify some more ownership using forward_list<T> rather than vector<unique_ptr<T>>

Duncan P. N. Exon Smith dexonsmith at apple.com
Mon Feb 9 16:52:50 PST 2015


> On 2015-Feb-09, at 15:21, David Blaikie <dblaikie at gmail.com> wrote:
> 
> 
> 
> On Mon, Feb 9, 2015 at 2:27 PM, David Blaikie <dblaikie at gmail.com> wrote:
> 
> 
> On Thu, Feb 5, 2015 at 11:50 AM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> 
> > On 2015-Feb-04, at 22:21, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> >
> >>
> >> On 2015 Feb 2, at 11:38, David Blaikie <dblaikie at gmail.com> wrote:
> >>
> >>
> >>
> >> On Mon, Feb 2, 2015 at 11:28 AM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> >>
> >>> On 2015-Feb-02, at 11:00, David Blaikie <dblaikie at gmail.com> wrote:
> >>>
> >>>
> >>>
> >>> On Mon, Jan 19, 2015 at 1:00 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> >>>
> >>>> On 2014-Dec-22, at 14:28, David Blaikie <dblaikie at gmail.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On Fri, Dec 19, 2014 at 4:07 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> >>>>
> >>>>> On 2014-Dec-19, at 16:02, David Blaikie <dblaikie at gmail.com> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Fri, Dec 19, 2014 at 2:22 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> >>>>>
> >>>>>> On 2014-Dec-19, at 14:11, David Blaikie <dblaikie at gmail.com> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Dec 3, 2014 at 3:57 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> >>>>>> I looked into test/CodeGen/X86/compact-unwind.ll.  llvm-mc was giving
> >>>>>> different results.  Given that you changed `tablegen`, I frankly gave
> >>>>>> up on figuring out exactly why.
> >>>>>>
> >>>>>> I went back and reviewed the patch for changed semantics, and took a
> >>>>>> shot in the dark.  The `sort()` at the end is different from the
> >>>>>> old `sort()`, because items have been front-loaded instead of back-
> >>>>>> loaded.
> >>>>>>
> >>>>>> The fix is to sort it backwards and then reverse the list.  This passes
> >>>>>> `check` for me.
> >>>>>>
> >>>>>> Any preferences on this approach, versus using std::list (paying an extra pointer per element) and pushing onto the end instead? (hmm, or maybe we can keep an iterator to the tail end of the list and just use that as an insert hint while still using std::forward_list...  - not sure if that's worth it)
> >>>>>
> >>>>> I'm not sure which approach makes sense here.  How memory constrained
> >>>>> does `tablegen` get?  `std::list<>` seems easiest if the extra pointer
> >>>>> doesn't matter.
> >>>>>
> >>>>> Welp... figured out why I wasn't seeing those failures you saw. std::forward_list<>::sort in libstdc++ (even in 4.9) is unstable (& in fact exactly inverted - which was exactly what was required). So your fix actually caused exactly the same failures for me as you'd seen from my original patch...
> >>>>>
> >>>>> In any case, that means we can't rely on its behavior in LLVM and I'll just use std::list<> instead. Lame.
> >>>>>
> >>>>
> >>>> That's a shame :(.
> >>>>
> >>>> 'tis rather.
> >>>>
> >>>> Now I have another mystery, in case anyone's interested in having a guess.
> >>>>
> >>>> This change:
> >>>>
> >>>> $ git diff
> >>>> diff --git utils/TableGen/AsmMatcherEmitter.cpp utils/TableGen/AsmMatcherEmitter.cpp
> >>>> index 3663de7..cb1f178 100644
> >>>> --- utils/TableGen/AsmMatcherEmitter.cpp
> >>>> +++ utils/TableGen/AsmMatcherEmitter.cpp
> >>>> @@ -613,7 +613,7 @@ public:
> >>>>   std::forward_list<ClassInfo> Classes;
> >>>>
> >>>>   /// The information on the matchables to match.
> >>>> -  std::vector<std::unique_ptr<MatchableInfo>> Matchables;
> >>>> +  std::list<std::unique_ptr<MatchableInfo>> Matchables;
> >>>>
> >>>>   /// Info for custom matching operands by user defined methods.
> >>>>   std::vector<OperandMatchEntry> OperandMatchInfo;
> >>>> @@ -1674,7 +1674,7 @@ static unsigned getConverterOperandID(const std::string &Name,
> >>>>
> >>>>
> >>>> static void emitConvertFuncs(CodeGenTarget &Target, StringRef ClassName,
> >>>> -                             std::vector<std::unique_ptr<MatchableInfo>> &Infos,
> >>>> +                             std::list<std::unique_ptr<MatchableInfo>> &Infos,
> >>>>                              raw_ostream &OS) {
> >>>>   SetVector<std::string> OperandConversionKinds;
> >>>>   SetVector<std::string> InstructionConversionKinds;
> >>>> @@ -2593,10 +2593,9 @@ void AsmMatcherEmitter::run(raw_ostream &OS) {
> >>>>   // Sort the instruction table using the partial order on classes. We use
> >>>>   // stable_sort to ensure that ambiguous instructions are still
> >>>>   // deterministically ordered.
> >>>> -  std::stable_sort(Info.Matchables.begin(), Info.Matchables.end(),
> >>>> -                   [](const std::unique_ptr<MatchableInfo> &a,
> >>>> -                      const std::unique_ptr<MatchableInfo> &b){
> >>>> -                     return *a < *b;});
> >>>> +  Info.Matchables.sort(
> >>>> +      [](const std::unique_ptr<MatchableInfo> &a,
> >>>> +         const std::unique_ptr<MatchableInfo> &b) { return *a < *b; });
> >>>>
> >>>>   DEBUG_WITH_TYPE("instruction_info", {
> >>>>       for (const auto &MI : Info.Matchables)
> >>>>
> >>>>
> >>>> Causes observable differences (no test failures, but non-empty diffs) in the .inc files in an LLVM build.
> >>>>
> >>>> Anyone want to guess what's going on here? Maybe some partial instability... (unlike std::forward_list::sort's completely inverted instability)
> >>>
> >>> I can't think of any other explanation, unless there's UB lurking
> >>> somewhere.
> >>>
> >>> Sort of.
> >>>
> >>> I have a theory with some evidence:
> >>>
> >>> If I put this code:
> >>>
> >>>  for (auto I = Info.Matchables.begin(), E = Info.Matchables.end(); I != E; ++I) {
> >>>    for (auto J = I; J != E; ++J) {
> >>>      bool greater = **J < **I;
> >>>      assert(!greater);
> >>>    }
> >>>  }
> >>>
> >>> immediately after the call to std::stable_sort, it does actually fail (even in the existing code using std::vector). (wouldn't mind someone verifying this, just for a sanity check - that I got the comparisons around the right way and that the assertion failure reproduces for them too)
> >>>
> >>> So it seems that the MatchableInfo operator< might not provide a strict weak ordering.
> >>>
> >>> Do you/does anyone know an easy way for me to find a specific contradiction in the ordering so I can understand/fix it?
> >>>
> >>> The assertion only gives me the elements that, somehow, somewhere along the way, std::stable_sort thought were <= but they're actually >. I want to find the set of elements (preferably the minimal set of elements) that indirectly show they are <= so I can understand which part of the comparison is buggy.
> >>>
> >
> > Okay, so back to your previous question: it sounds like you're looking
> > for the shortest cycle in a directed graph (where the nodes are
> > `MatchableInfo*` with edges `L -> R` when `MatchableInfo::Less(L, R)`).
> 
> The `findShortestCycle()` code in the attached patch should get you what
> you want (although as you can see I've hardly tested it).  Get the
> shortest cycle like this:
> 
>     std::vector<unsigned> Cycle =
>         findShortestCycle(Matchables.size(), [&](unsigned T, unsigned H) {
>           return MatchableInfo::Less()(Matchables[T], Matchables[H]);
>         });
> 
> Neat - thanks!
> 
> Though, as it turns out, within the first attempt or two I couldn't get an answer from this (maybe the process died before it found the answer).
> 
> But since I knew the result was meant to be sorted, I just kept track of the shortest subsequence with one of the contradictions (Y < X) and, as it happens, I found a minimal example: X < X + 1 < X + 2 && X + 2 < X.
> 
> I'm trying to understand the comparison further now:
> 
> X:
> Mnemonic: vshl
> AsmOperands.size(): 4
> AsmOperands[0].Class: MCK_CondCode
> AsmOperands[1].Class: MCK__DOT_u64
> 
> X + 1:
> Mnemonic: vshl
> AsmOperands.size(): 4
> AsmOperands[0].Class: MCK_CondCode
> AsmOperands[1].Class: MCK__DOT_i64
> 
> X + 2:
> Mnemonic: vshl
> AsmOperands.size(): 4
> AsmOperands[0].Class.MCK_CondCode
> AsmOperands[1].Class.MCK_s8
> 
> (MCK__DOT_u64 < MCK__DOT_i64)
> (MCK_s8 < MCK__DOT_u64)
> (MCK__DOT_i64 < MCK__DOT_s8)
> 
> Next, trying to understand the Class ordering cycle here... 
> 
> u64 < i64: u64 is a subset of i64
> i64 < s8: i64 and s8 are not subsets of each other, so order by ValueName ('.i64' < '.s8')
> s8 < u64: s8 and u64 are not subsets of each other, so order by ValueName ('.u64' !< '.s8')
> 
> This is about where I'm stuck again. Don't really know enough about the tblgen requirements, or the mathematics of ordering constraints to see an obvious answer to 'how to provide a sound ordering' here... 
> 
> - David
>  

This is single inheritance, right?  `SuperClasses.front()` is the
only immediate super class?  In that case, `compare()` should be what
you want:

    int compareImpl(ClassInfo *L, ClassInfo *R) {
      return L->ValueName.compare(R->ValueName);
    }
    int compare(ClassInfo *L, ClassInfo *R) {
      if (L == R)
        return 0;

      typedef SmallVector<ClassInfo *, 8> VecTy;
      auto visit = [](VecTy &V, ClassInfo *CI) {
        V.push_back(CI);
        while (!V.back().SuperClasses.empty()) {
          V.push_back(V.back().SuperClasses.front());
      };
      VecTy LV;
      VecTy RV;
      visit(LV, L);
      visit(RV, R);
      auto pop = [](VecTy &V) {
        auto *T = V.back();
        V.pop_back();
        return T;
      };
      while (!LV.empty() && !RV.empty()) {
        auto *LI = pop(LV);
        auto *RI = pop(RV);
        if (LI != RI)
          return compareImpl(LI, RI);
      }
      if (!RV.empty())
        return -1;
      if (!LV.empty())
        return 1;
      return compareImpl(L, R);
    }

This structurally compares two inheritance lists from the root, and
compares the first difference based on the names.

(If indeed this is single inheritance) I think the right thing to do
is to add all the non hierarchical logic to `compareImpl()`, and stop
calling `isRelatedTo()` or `isSubsetOf()` entirely.

You can avoid unnecessary work by hacking visit:

      auto visit = [](VecTy &V, ClassInfo *CI, ClassInfo *Other) {
        V.push_back(CI);
        while (!V.back().SuperClasses.empty()) {
          auto *Super = V.back().SuperClasses.front();
          if (Super == Other)
            return true;
          V.push_back(Super);
        }
        return false;
      };
      VecTy LV;
      VecTy RV;
      if (visit(LV, L, R)) // R is a superclass of L.
        return 1;
      if (visit(RV, R, L)) // L is a superclass of R.
        return -1;

This version returns early if L or R is the other's superclass.  You
can change the final logic to:

      assert(RV.empty());
      assert(LV.empty());
      return compareImpl(L, R);

======

It seems harder if this is multiple inheritance.  In that case, the
following should give *a* total ordering:

    int compare(ClassInfo *L, ClassInfo *R) {
      unsigned LE = LV.size();
      unsigned RE = RV.size();
      for (unsigned I = 0; I != LE && I != RE; ++I)
        if (L->SuperClasses[I] != R->SuperClasses[I])
          return compare(L, R);
      if (LE < RE)
        return -1;
      if (LE > RE)
        return 1;
      return compareImpl(L, R);
    }

However, it doesn't guarantee that superclasses precede subclasses.
And if you run `isSubsetOf()` checks externally to hack that in, I think
you'll hit the same types of problems you already have.

To fix it, you need to compare at the level closest to the root that has
a difference.  Something like:

    int visit(SmallVector<ClassInfo *, 8> &V,
              DenseMap<ClassInfo *, int> &Levels,
              ClassInfo *CI) {
      if (int Level = Levels.lookup(CI))
        return Level;
      int SuperLevel = 0;
      for (auto *Super : CI->SuperClasses)
        SuperLevel = std::max(SuperLevel, visit(V, Levels, Super));
      V.emplace_back(CI);
      return Levels[CI] = SuperLevel + 1;
    }

gets you the right information.

I *think* if you `stable_sort()` LV and RV by Levels, you can use similar
logic to the previous `compare()` code.  I haven't entirely thought this
through though.

>  
> 
> 
> 
> >
> >>> Any ideas?
> >>
> >> I just had a look at the code.  `MatchableInfo::operator<()` depends
> >> eventually on `ClassInfo::operator<()`, which includes this logic:
> >>
> >>      // This class precedes the RHS if it is a proper subset of the RHS.
> >>      if (isSubsetOf(RHS))
> >>        return true;
> >>
> >> But the comment doesn't match the implementation of `isSubsetOf()`, which
> >> returns `true` even on equal sets.  Is the "equal set" case possible,
> >> despite the pointer comparison (`this == &RHS`) above?
> >>
> >> Was worth a shot, but I added an "assert(!RHS.isSubsetOf(*this))" in the if before the return and it didn't fire, so that may not be it.
> >>
> >> I don't think there's an immediate asymmetry (where something returns true for both A < B and B < A) but something indirect (A < B, B < C but !(A < C) - though it could have multiple intermediates... )
> >>
> >>
> >> I audited the rest of the code (but just quickly) and nothing else popped
> >> out.
> >
> >
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits





More information about the llvm-commits mailing list