[llvm] r222935 - Simplify some more ownership using forward_list<T> rather than vector<unique_ptr<T>>

Fri Feb 27 13:47:38 PST 2015

On Mon, Feb 9, 2015 at 4:52 PM, Duncan P. N. Exon Smith <
dexonsmith at apple.com> wrote:

>
> > On 2015-Feb-09, at 15:21, David Blaikie <dblaikie at gmail.com> wrote:
> >
> >
> >
> > On Mon, Feb 9, 2015 at 2:27 PM, David Blaikie <dblaikie at gmail.com>
> wrote:
> >
> >
> > On Thu, Feb 5, 2015 at 11:50 AM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> >
> > > On 2015-Feb-04, at 22:21, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> > >
> > >>
> > >> On 2015 Feb 2, at 11:38, David Blaikie <dblaikie at gmail.com> wrote:
> > >>
> > >>
> > >>
> > >> On Mon, Feb 2, 2015 at 11:28 AM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> > >>
> > >>> On 2015-Feb-02, at 11:00, David Blaikie <dblaikie at gmail.com> wrote:
> > >>>
> > >>>
> > >>>
> > >>> On Mon, Jan 19, 2015 at 1:00 PM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> > >>>
> > >>>> On 2014-Dec-22, at 14:28, David Blaikie <dblaikie at gmail.com> wrote:
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Fri, Dec 19, 2014 at 4:07 PM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> > >>>>
> > >>>>> On 2014-Dec-19, at 16:02, David Blaikie <dblaikie at gmail.com>
> wrote:
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> On Fri, Dec 19, 2014 at 2:22 PM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> > >>>>>
> > >>>>>> On 2014-Dec-19, at 14:11, David Blaikie <dblaikie at gmail.com>
> wrote:
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> On Wed, Dec 3, 2014 at 3:57 PM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> > >>>>>> I looked into test/CodeGen/X86/compact-unwind.ll.  llvm-mc was
> giving
> > >>>>>> different results.  Given that you changed `tablegen`, I frankly
> gave
> > >>>>>> up on figuring out exactly why.
> > >>>>>>
> > >>>>>> I went back and reviewed the patch for changed semantics, and
> took a
> > >>>>>> shot in the dark.  The `sort()` at the end is different from the
> > >>>>>> old `sort()`, because items have been front-loaded instead of
> back-
> > >>>>>> loaded.
> > >>>>>>
> > >>>>>> The fix is to sort it backwards and then reverse the list.  This
> passes
> > >>>>>> `check` for me.
> > >>>>>>
> > >>>>>> Any preferences on this approach, versus using std::list (paying
> an extra pointer per element) and pushing onto the end instead? (hmm, or
> maybe we can keep an iterator to the tail end of the list and just use that
> as an insert hint while still using std::forward_list...  - not sure if
> that's worth it)
> > >>>>>
> > >>>>> I'm not sure which approach makes sense here.  How memory
> constrained
> > >>>>> does `tablegen` get?  `std::list<>` seems easiest if the extra
> pointer
> > >>>>> doesn't matter.
> > >>>>>
> > >>>>> Welp... figured out why I wasn't seeing those failures you saw.
> std::forward_list<>::sort in libstdc++ (even in 4.9) is unstable (& in fact
> exactly inverted - which was exactly what was required). So your fix
> actually caused exactly the same failures for me as you'd seen from my
> original patch...
> > >>>>>
> > >>>>> In any case, that means we can't rely on its behavior in LLVM and
> I'll just use std::list<> instead. Lame.
> > >>>>>
> > >>>>
> > >>>> That's a shame :(.
> > >>>>
> > >>>> 'tis rather.
> > >>>>
> > >>>> Now I have another mystery, in case anyone's interested in having a
> guess.
> > >>>>
> > >>>> This change:
> > >>>>
> > >>>> $ git diff
> > >>>> diff --git utils/TableGen/AsmMatcherEmitter.cpp
> utils/TableGen/AsmMatcherEmitter.cpp
> > >>>> index 3663de7..cb1f178 100644
> > >>>> --- utils/TableGen/AsmMatcherEmitter.cpp
> > >>>> +++ utils/TableGen/AsmMatcherEmitter.cpp
> > >>>> @@ -613,7 +613,7 @@ public:
> > >>>>   std::forward_list<ClassInfo> Classes;
> > >>>>
> > >>>>   /// The information on the matchables to match.
> > >>>> -  std::vector<std::unique_ptr<MatchableInfo>> Matchables;
> > >>>> +  std::list<std::unique_ptr<MatchableInfo>> Matchables;
> > >>>>
> > >>>>   /// Info for custom matching operands by user defined methods.
> > >>>>   std::vector<OperandMatchEntry> OperandMatchInfo;
> > >>>> @@ -1674,7 +1674,7 @@ static unsigned getConverterOperandID(const
> std::string &Name,
> > >>>>
> > >>>>
> > >>>> static void emitConvertFuncs(CodeGenTarget &Target, StringRef
> ClassName,
> > >>>> -
>  std::vector<std::unique_ptr<MatchableInfo>> &Infos,
> > >>>> +
>  std::list<std::unique_ptr<MatchableInfo>> &Infos,
> > >>>>                              raw_ostream &OS) {
> > >>>>   SetVector<std::string> OperandConversionKinds;
> > >>>>   SetVector<std::string> InstructionConversionKinds;
> > >>>> @@ -2593,10 +2593,9 @@ void AsmMatcherEmitter::run(raw_ostream &OS)
> {
> > >>>>   // Sort the instruction table using the partial order on classes.
> We use
> > >>>>   // stable_sort to ensure that ambiguous instructions are still
> > >>>>   // deterministically ordered.
> > >>>> -  std::stable_sort(Info.Matchables.begin(), Info.Matchables.end(),
> > >>>> -                   [](const std::unique_ptr<MatchableInfo> &a,
> > >>>> -                      const std::unique_ptr<MatchableInfo> &b){
> > >>>> -                     return *a < *b;});
> > >>>> +  Info.Matchables.sort(
> > >>>> +      [](const std::unique_ptr<MatchableInfo> &a,
> > >>>> +         const std::unique_ptr<MatchableInfo> &b) { return *a <
> *b; });
> > >>>>
> > >>>>   DEBUG_WITH_TYPE("instruction_info", {
> > >>>>       for (const auto &MI : Info.Matchables)
> > >>>>
> > >>>>
> > >>>> Causes observable differences (no test failures, but non-empty
> diffs) in the .inc files in an LLVM build.
> > >>>>
> > >>>> Anyone want to guess what's going on here? Maybe some partial
> instability... (unlike std::forward_list::sort's completely inverted
> instability)
> > >>>
> > >>> I can't think of any other explanation, unless there's UB lurking
> > >>> somewhere.
> > >>>
> > >>> Sort of.
> > >>>
> > >>> I have a theory with some evidence:
> > >>>
> > >>> If I put this code:
> > >>>
> > >>>  for (auto I = Info.Matchables.begin(), E = Info.Matchables.end(); I
> != E; ++I) {
> > >>>    for (auto J = I; J != E; ++J) {
> > >>>      bool greater = **J < **I;
> > >>>      assert(!greater);
> > >>>    }
> > >>>  }
> > >>>
> > >>> immediately after the call to std::stable_sort, it does actually
> fail (even in the existing code using std::vector). (wouldn't mind someone
> verifying this, just for a sanity check - that I got the comparisons around
> the right way and that the assertion failure reproduces for them too)
> > >>>
> > >>> So it seems that the MatchableInfo operator< might not provide a
> strict weak ordering.
> > >>>
> > >>> Do you/does anyone know an easy way for me to find a specific
> contradiction in the ordering so I can understand/fix it?
> > >>>
> > >>> The assertion only gives me the elements that, somehow, somewhere
> along the way, std::stable_sort thought were <= but they're actually >. I
> want to find the set of elements (preferably the minimal set of elements)
> that indirectly show they are <= so I can understand which part of the
> comparison is buggy.
> > >>>
> > >
> > > Okay, so back to your previous question: it sounds like you're looking
> > > for the shortest cycle in a directed graph (where the nodes are
> > > `MatchableInfo*` with edges `L -> R` when `MatchableInfo::Less(L, R)`).
> >
> > The `findShortestCycle()` code in the attached patch should get you what
> > you want (although as you can see I've hardly tested it).  Get the
> > shortest cycle like this:
> >
> >     std::vector<unsigned> Cycle =
> >         findShortestCycle(Matchables.size(), [&](unsigned T, unsigned H)
> {
> >           return MatchableInfo::Less()(Matchables[T], Matchables[H]);
> >         });
> >
> > Neat - thanks!
> >
> > Though, as it turns out, within the first attempt or two I couldn't get
> an answer from this (maybe the process died before it found the answer).
> >
> > But since I knew the result was meant to be sorted, I just kept track of
> the shortest subsequence with one of the contradictions (Y < X) and, as it
> happens, I found a minimal example: X < X + 1 < X + 2 && X + 2 < X.
> >
> > I'm trying to understand the comparison further now:
> >
> > X:
> > Mnemonic: vshl
> > AsmOperands.size(): 4
> > AsmOperands[0].Class: MCK_CondCode
> > AsmOperands[1].Class: MCK__DOT_u64
> >
> > X + 1:
> > Mnemonic: vshl
> > AsmOperands.size(): 4
> > AsmOperands[0].Class: MCK_CondCode
> > AsmOperands[1].Class: MCK__DOT_i64
> >
> > X + 2:
> > Mnemonic: vshl
> > AsmOperands.size(): 4
> > AsmOperands[0].Class.MCK_CondCode
> > AsmOperands[1].Class.MCK_s8
> >
> > (MCK__DOT_u64 < MCK__DOT_i64)
> > (MCK_s8 < MCK__DOT_u64)
> > (MCK__DOT_i64 < MCK__DOT_s8)
> >
> > Next, trying to understand the Class ordering cycle here...
> >
> > u64 < i64: u64 is a subset of i64
> > i64 < s8: i64 and s8 are not subsets of each other, so order by
> ValueName ('.i64' < '.s8')
> > s8 < u64: s8 and u64 are not subsets of each other, so order by
> ValueName ('.u64' !< '.s8')
> >
> > This is about where I'm stuck again. Don't really know enough about the
> tblgen requirements, or the mathematics of ordering constraints to see an
> obvious answer to 'how to provide a sound ordering' here...
> >
> > - David
> >
>
> This is single inheritance, right?  `SuperClasses.front()` is the
> only immediate super class?  In that case, `compare()` should be what
> you want:
>
>     int compareImpl(ClassInfo *L, ClassInfo *R) {
>       return L->ValueName.compare(R->ValueName);
>     }
>     int compare(ClassInfo *L, ClassInfo *R) {
>       if (L == R)
>         return 0;
>
>       typedef SmallVector<ClassInfo *, 8> VecTy;
>       auto visit = [](VecTy &V, ClassInfo *CI) {
>         V.push_back(CI);
>         while (!V.back().SuperClasses.empty()) {
>           V.push_back(V.back().SuperClasses.front());
>       };
>       VecTy LV;
>       VecTy RV;
>       visit(LV, L);
>       visit(RV, R);
>       auto pop = [](VecTy &V) {
>         auto *T = V.back();
>         V.pop_back();
>         return T;
>       };
>       while (!LV.empty() && !RV.empty()) {
>         auto *LI = pop(LV);
>         auto *RI = pop(RV);
>         if (LI != RI)
>           return compareImpl(LI, RI);
>       }
>       if (!RV.empty())
>         return -1;
>       if (!LV.empty())
>         return 1;
>       return compareImpl(L, R);
>     }
>
> This structurally compares two inheritance lists from the root, and
> compares the first difference based on the names.
>
> (If indeed this is single inheritance) I think the right thing to do
> is to add all the non hierarchical logic to `compareImpl()`, and stop
> calling `isRelatedTo()` or `isSubsetOf()` entirely.
>
> You can avoid unnecessary work by hacking visit:
>
>       auto visit = [](VecTy &V, ClassInfo *CI, ClassInfo *Other) {
>         V.push_back(CI);
>         while (!V.back().SuperClasses.empty()) {
>           auto *Super = V.back().SuperClasses.front();
>           if (Super == Other)
>             return true;
>           V.push_back(Super);
>         }
>         return false;
>       };
>       VecTy LV;
>       VecTy RV;
>       if (visit(LV, L, R)) // R is a superclass of L.
>         return 1;
>       if (visit(RV, R, L)) // L is a superclass of R.
>         return -1;
>
> This version returns early if L or R is the other's superclass.  You
> can change the final logic to:
>
>       assert(RV.empty());
>       assert(LV.empty());
>       return compareImpl(L, R);
>
> ======
>
> It seems harder if this is multiple inheritance.  In that case, the
> following should give *a* total ordering:
>
>     int compare(ClassInfo *L, ClassInfo *R) {
>       unsigned LE = LV.size();
>       unsigned RE = RV.size();
>       for (unsigned I = 0; I != LE && I != RE; ++I)
>         if (L->SuperClasses[I] != R->SuperClasses[I])
>           return compare(L, R);
>       if (LE < RE)
>         return -1;
>       if (LE > RE)
>         return 1;
>       return compareImpl(L, R);
>     }
>
> However, it doesn't guarantee that superclasses precede subclasses.
> And if you run `isSubsetOf()` checks externally to hack that in, I think
> you'll hit the same types of problems you already have.
>
> To fix it, you need to compare at the level closest to the root that has
> a difference.  Something like:
>
>     int visit(SmallVector<ClassInfo *, 8> &V,
>               DenseMap<ClassInfo *, int> &Levels,
>               ClassInfo *CI) {
>       if (int Level = Levels.lookup(CI))
>         return Level;
>       int SuperLevel = 0;
>       for (auto *Super : CI->SuperClasses)
>         SuperLevel = std::max(SuperLevel, visit(V, Levels, Super));
>       V.emplace_back(CI);
>       return Levels[CI] = SuperLevel + 1;
>     }
>
> gets you the right information.
>
> I *think* if you `stable_sort()` LV and RV by Levels, you can use similar
> logic to the previous `compare()` code.  I haven't entirely thought this
> through though.
>

I really appreciate all your help on this Duncan - but I think this is
getting a little out of my depth (& honestly, the original motivation - to
change from vector<unique_ptr<T>> to list<T> is questionable given that
it'll add an extra pointer-per-node cost to this data structure)

Unless you/someone thinks this sort issue is something that'll bite someone
one day (I'm not sure if the inconsistency of the sort function is benign
or whether it could produce a wrong ordering?) in which case I'll push on
it for a while longer. If not, maybe it's best to at least leave a
comment/FIXME/warning that this is slightly broken?

- David

>
> >
> >
> >
> >
> > >
> > >>> Any ideas?
> > >>
> > >> I just had a look at the code.  `MatchableInfo::operator<()` depends
> > >> eventually on `ClassInfo::operator<()`, which includes this logic:
> > >>
> > >>      // This class precedes the RHS if it is a proper subset of the
> RHS.
> > >>      if (isSubsetOf(RHS))
> > >>        return true;
> > >>
> > >> But the comment doesn't match the implementation of `isSubsetOf()`,
> which
> > >> returns `true` even on equal sets.  Is the "equal set" case possible,
> > >> despite the pointer comparison (`this == &RHS`) above?
> > >>
> > >> Was worth a shot, but I added an "assert(!RHS.isSubsetOf(*this))" in
> the if before the return and it didn't fire, so that may not be it.
> > >>
> > >> I don't think there's an immediate asymmetry (where something returns
> true for both A < B and B < A) but something indirect (A < B, B < C but !(A
> < C) - though it could have multiple intermediates... )
> > >>
> > >>
> > >> I audited the rest of the code (but just quickly) and nothing else
> popped
> > >> out.
> > >
> > >
> > > _______________________________________________
> > > llvm-commits mailing list
> > > llvm-commits at cs.uiuc.edu
> > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150227/5f5e0582/attachment.html>