[cfe-dev] libc++ red-black tree: halving the branches

Sat Apr 7 05:03:35 PDT 2012

On 07/04/12 03:42, Sean Silva wrote:
> I've got a little bit for everybody:
>
> Howard: I completely understand your points. By no means did I intend 
> this code in its current form to get into libc++, it was mostly just 
> an experiment. I would not myself advocate this code change without 
> clear evidence of a performance win.
>
> Chandler: perf is a really good tool for monitoring this stuff. There 
> are a lot of low-level performance considerations, and measuring is 
> the only way to really evaluate this kind of change. Also I think 
> there are more "macro" optimizations to be made to the rebalancing 
> code to reduce branches and branch mispredicts, bringing me to...
>
> Joerg: wow, that code is insane. the author really groks the red-black 
> tree algorithms like nothing that I've ever seen before (not even 
> CLRS). For example in the insert rebalancing, they put the Case 1 
> recoloring in the loop by itself. Case 2 and 3 for are put *outside* 
> the loop, which makes so much sense since Case 2 and 3 only execute 
> exactly once (if at all), since the tree is balanced after that code 
> runs. Also, their use of an array to hold the left and right nodes is 
> beautiful! I though I was clever with the pointer-to-members, but the 
> array version is simpler, and to boot they can use the result of 
> comparisons like (father == grandpa->rb_right) directly to index which 
> one to use.
>
> Yet another case in the field of computing where a more-complicated 
> solution (pointer to members to determine left and right) has a 
> better, simpler solution.
>
> Dave: from the point of view of code understandability, the current 
> code is probably about as good as it gets, since it is essentially a 
> mirror image in code in the CLRS algorithms book, which to my 
> knowledge is where pretty much everybody learns how red-black trees 
> work (and uses as a reference for the rb-tree algorithms). 
> Unfortunately, the CLRS code is more theoretically oriented and their 
> "programming language" for the algorithms lacks many of the more 
> powerful capabilities that real computers have, such as dynamically 
> choosing offsets for member access. Actually, my original inspiration 
> for this was the "else" branch in RB-Insert-Fixup, which simply says 
> "(same as then branch but with left and right reversed)".
>
> I have to say, that netbsd red-black tree has really piqued my 
> interest. Two other red-black trees I have seen "in the wild"---the 
> linux one and the nginx one---also follow the more to-the-book 
> implementation like libc++'s; perhaps this is because the netbsd-like 
> code has no significant advantage? Or maybe it is simply obscurity of 
> this approach? idk, but I think I'm going to end up writing at least a 
> couple red-black trees in order to test this.... fun...
If that's not enough fun, have you considered comparing your red-black 
tree implementations with other tree implementations, say avl-trees?

I'm on Debian amd64 if you need to run your tests on different hardware.

I'll even give writing the avl-tree performance tests a whirl, once you 
show me how you run the perf tests with red-black trees - a tar ball I 
can unpack and "make check" with is all I need.

Yes, I've got a vested interest in this - my treedb project in 
SourceForge uses avl-trees and I'd like to add red-black-trees in at 
some stage :)
>
> On Fri, Apr 6, 2012 at 1:01 PM, Dave Abrahams <dave at boostpro.com 
> <mailto:dave at boostpro.com>> wrote:
>
>
>     on Thu Apr 05 2012, Howard Hinnant <hhinnant-AT-apple.com> wrote:
>
>     > On Apr 5, 2012, at 2:12 PM, Dave Abrahams wrote:
>     >
>     >>
>     >> on Thu Apr 05 2012, Howard Hinnant <hhinnant-AT-apple.com> wrote:
>     >>
>     >>> On Apr 5, 2012, at 12:00 PM, Dave Abrahams wrote:
>     >>>
>     >>>>
>     >>>> on Wed Apr 04 2012, Howard Hinnant
>     <hhinnant-2kanFRK1NckAvxtiuMwx3w-AT-public.gmane.org
>     <http://hhinnant-2kanFRK1NckAvxtiuMwx3w-AT-public.gmane.org>> wrote:
>     >>>>
>     >>>>> I don't believe code size is going to be a driver on this
>     one as the
>     >>>>> potential savings tops out in the neighborhood of 30
>     instructions
>     >>>>> (which is small compared to the entire R/B tree container
>     code).  The
>     >>>>> bottom line is going to be performance.  Your rewrite is
>     interesting,
>     >>>>> but for me, significantly harder to read.  So it will have
>     to motivate
>     >>>>> with significant performance wins.  And if it does, I think
>     >>>>> __tree_remove would benefit from the same treatment.
>      However if we're
>     >>>>> looking at zero or marginal performance wins, then I favor
>     the current
>     >>>>> code: it is easier to reason about and maintain.
>     >>>>
>     >>>> Really, even with the duplication of left/right logic?  Seems
>     to me that
>     >>>> if the code size reduction isn't a win and there are
>     performance costs
>     >>>> you could still improve maintainability while maintaining the
>     current
>     >>>> generated code structure by passing the member pointers as
>     template arguments to a
>     >>>> helper function template.
>     >>>
>     >>> The libc++ code base is open source and I am happy to review
>     patches.
>     >>
>     >> Awww... I'm not trying to get you to change anything.  My
>     question is
>     >> whether you think it's better for maintainability (in this
>     case?) to
>     >> write out two totally symmetric branches with duplicated logic
>     than it
>     >> is to eliminate that duplication, and if so, why.  I could also
>     take
>     >> your response to mean "I really don't have time to deal with
>     questions
>     >> like that," which would be fine, too.
>     >
>     > To properly answer your question I would have to write an
>     alternative
>     > version of the code in question.  So I'm asking you to pose your
>     > question in terms of alternative code I can study, instead of taking
>     > on the job myself of crafting that alternative.
>
>     Well, I wasn't looking for an answer that's any more "proper" than
>     your
>     statement above that the current code is easier to reason about and
>     maintain.  It's OK with me if you don't feel like sticking your
>     neck out
>     further on this one without something more solid to look at, but while
>     that project does sound like a lot of fun, it was a mostly-idle
>     question
>     and I don't have time to take it on at the moment.
>
>     Cheers,
>
>     --
>     Dave Abrahams
>     BoostPro Computing
>     http://www.boostpro.com
>     _______________________________________________
>     cfe-dev mailing list
>     cfe-dev at cs.uiuc.edu <mailto:cfe-dev at cs.uiuc.edu>
>     http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>
>
>
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev