[cfe-dev] libc++ red-black tree: halving the branches
Philip Ashmore
contact at philipashmore.com
Sat Apr 7 05:03:35 PDT 2012
On 07/04/12 03:42, Sean Silva wrote:
> I've got a little bit for everybody:
>
> Howard: I completely understand your points. By no means did I intend
> this code in its current form to get into libc++, it was mostly just
> an experiment. I would not myself advocate this code change without
> clear evidence of a performance win.
>
> Chandler: perf is a really good tool for monitoring this stuff. There
> are a lot of low-level performance considerations, and measuring is
> the only way to really evaluate this kind of change. Also I think
> there are more "macro" optimizations to be made to the rebalancing
> code to reduce branches and branch mispredicts, bringing me to...
>
> Joerg: wow, that code is insane. the author really groks the red-black
> tree algorithms like nothing that I've ever seen before (not even
> CLRS). For example in the insert rebalancing, they put the Case 1
> recoloring in the loop by itself. Case 2 and 3 for are put *outside*
> the loop, which makes so much sense since Case 2 and 3 only execute
> exactly once (if at all), since the tree is balanced after that code
> runs. Also, their use of an array to hold the left and right nodes is
> beautiful! I though I was clever with the pointer-to-members, but the
> array version is simpler, and to boot they can use the result of
> comparisons like (father == grandpa->rb_right) directly to index which
> one to use.
>
> Yet another case in the field of computing where a more-complicated
> solution (pointer to members to determine left and right) has a
> better, simpler solution.
>
> Dave: from the point of view of code understandability, the current
> code is probably about as good as it gets, since it is essentially a
> mirror image in code in the CLRS algorithms book, which to my
> knowledge is where pretty much everybody learns how red-black trees
> work (and uses as a reference for the rb-tree algorithms).
> Unfortunately, the CLRS code is more theoretically oriented and their
> "programming language" for the algorithms lacks many of the more
> powerful capabilities that real computers have, such as dynamically
> choosing offsets for member access. Actually, my original inspiration
> for this was the "else" branch in RB-Insert-Fixup, which simply says
> "(same as then branch but with left and right reversed)".
>
> I have to say, that netbsd red-black tree has really piqued my
> interest. Two other red-black trees I have seen "in the wild"---the
> linux one and the nginx one---also follow the more to-the-book
> implementation like libc++'s; perhaps this is because the netbsd-like
> code has no significant advantage? Or maybe it is simply obscurity of
> this approach? idk, but I think I'm going to end up writing at least a
> couple red-black trees in order to test this.... fun...
If that's not enough fun, have you considered comparing your red-black
tree implementations with other tree implementations, say avl-trees?
I'm on Debian amd64 if you need to run your tests on different hardware.
I'll even give writing the avl-tree performance tests a whirl, once you
show me how you run the perf tests with red-black trees - a tar ball I
can unpack and "make check" with is all I need.
Yes, I've got a vested interest in this - my treedb project in
SourceForge uses avl-trees and I'd like to add red-black-trees in at
some stage :)
>
> On Fri, Apr 6, 2012 at 1:01 PM, Dave Abrahams <dave at boostpro.com
> <mailto:dave at boostpro.com>> wrote:
>
>
> on Thu Apr 05 2012, Howard Hinnant <hhinnant-AT-apple.com> wrote:
>
> > On Apr 5, 2012, at 2:12 PM, Dave Abrahams wrote:
> >
> >>
> >> on Thu Apr 05 2012, Howard Hinnant <hhinnant-AT-apple.com> wrote:
> >>
> >>> On Apr 5, 2012, at 12:00 PM, Dave Abrahams wrote:
> >>>
> >>>>
> >>>> on Wed Apr 04 2012, Howard Hinnant
> <hhinnant-2kanFRK1NckAvxtiuMwx3w-AT-public.gmane.org
> <http://hhinnant-2kanFRK1NckAvxtiuMwx3w-AT-public.gmane.org>> wrote:
> >>>>
> >>>>> I don't believe code size is going to be a driver on this
> one as the
> >>>>> potential savings tops out in the neighborhood of 30
> instructions
> >>>>> (which is small compared to the entire R/B tree container
> code). The
> >>>>> bottom line is going to be performance. Your rewrite is
> interesting,
> >>>>> but for me, significantly harder to read. So it will have
> to motivate
> >>>>> with significant performance wins. And if it does, I think
> >>>>> __tree_remove would benefit from the same treatment.
> However if we're
> >>>>> looking at zero or marginal performance wins, then I favor
> the current
> >>>>> code: it is easier to reason about and maintain.
> >>>>
> >>>> Really, even with the duplication of left/right logic? Seems
> to me that
> >>>> if the code size reduction isn't a win and there are
> performance costs
> >>>> you could still improve maintainability while maintaining the
> current
> >>>> generated code structure by passing the member pointers as
> template arguments to a
> >>>> helper function template.
> >>>
> >>> The libc++ code base is open source and I am happy to review
> patches.
> >>
> >> Awww... I'm not trying to get you to change anything. My
> question is
> >> whether you think it's better for maintainability (in this
> case?) to
> >> write out two totally symmetric branches with duplicated logic
> than it
> >> is to eliminate that duplication, and if so, why. I could also
> take
> >> your response to mean "I really don't have time to deal with
> questions
> >> like that," which would be fine, too.
> >
> > To properly answer your question I would have to write an
> alternative
> > version of the code in question. So I'm asking you to pose your
> > question in terms of alternative code I can study, instead of taking
> > on the job myself of crafting that alternative.
>
> Well, I wasn't looking for an answer that's any more "proper" than
> your
> statement above that the current code is easier to reason about and
> maintain. It's OK with me if you don't feel like sticking your
> neck out
> further on this one without something more solid to look at, but while
> that project does sound like a lot of fun, it was a mostly-idle
> question
> and I don't have time to take it on at the moment.
>
> Cheers,
>
> --
> Dave Abrahams
> BoostPro Computing
> http://www.boostpro.com
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu <mailto:cfe-dev at cs.uiuc.edu>
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>
>
>
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
More information about the cfe-dev
mailing list