r230255 - Only lower __builtin_setjmp / __builtin_longjmp to

Hal Finkel hfinkel at anl.gov
Tue Mar 3 11:00:30 PST 2015


----- Original Message -----
> From: "Joerg Sonnenberger" <joerg at britannica.bec.de>
> To: cfe-commits at cs.uiuc.edu
> Sent: Tuesday, March 3, 2015 12:29:12 PM
> Subject: Re: r230255 - Only lower __builtin_setjmp / __builtin_longjmp to
> 
> On Tue, Mar 03, 2015 at 11:59:53AM -0600, Hal Finkel wrote:
> > Having implemented this, I assure you that the potential is not small.
> > Eliminating the unnecessary spilling, the overhead of the function call,
> > and better scheduling of the spills/restores, I've seen 10x speedups
> > (even on modern OOO cores). Please also remember that small functions
> > often don't use all available registers, especially vector registers
> > (which tend to be expensive to save and restore), and so you can just
> > ignore the caller-saved register entirely (you don't need to save them
> > in the prologue or in setjmp call if you don't use them -- it is a pure
> > savings).
> 
> Huh? How do you know that the intermediate functions haven't clobbered a
> register? Without unwinding, which we explicitly do not want to do here,
> you can't. As such you *can't* avoid the spilling.

You can because, for a caller-saved register, the caller has already saved it (if it, indeed, needed to do so). That's the nice thing about the builtins: they're call-site specific. So when you call __builtin_setjmp, you don't need to save caller-saved registers that you don't need afterward. When the function that called __builtin_setjmp returns, its caller will restore those registers as needed (as it would have anyway). You do need to save callee-saved registers (along with any other registers the function is actually using).

I think it is also worth noting that, as we implement them, the job of restoring registers is shifted compared to the library calls. So, when using library setjmp/longjmp, setjmp saves all of the necessary state into the jump buffer, and longjmp restores all necessary state and jumps to the designated location. With the builtins, __builtin_setjmp itself saves very little state into the jump buffer (only the address and some reserved registers), but causes the function calling it to spill and restore only necessary state around it. __builtin_longjmp restores only the reserved registers necessary to make the jump, nothing more (the spill/restore code around the __builtin_setjmp takes care of the rest).

So, for example, suppose we have some register, v1, which is caller-saved:

void bar(jmp_buf &jb) {
  // v1 is not used in this function, and the caller saved it if necessary,
  // so v1 is not spilled here
  __builtin_setjmp(&jb);
  // and v1 is not restored here (the caller saved it if necessary, and
  // will restore it if necessary, when we return)
}

void foo() {
  jmp_buf jb;
  // v1 is saved here if necessary
  bar(jb);
  ...
  // the value of v1 saved above is loaded here somewhere
  ...
  // this does not explicitly restore v1 here, although a library call would need to
  __builtin_longjmp(&jb, 1);
}
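
For anyone who wants to try the two flavors side by side, here is a minimal runnable sketch. It is an illustration, not code from the patch. It assumes a GCC- or Clang-compatible compiler, the five-word buffer that GCC documents for the builtins, and hypothetical helper names (leave_lib, leave_builtin, attempt_lib, attempt_builtin). Unlike the schematic above, the frame that calls setjmp is still live when the jump happens, and __builtin_longjmp is kept in a separate function, as GCC's documentation asks.

```cpp
#include <cassert>
#include <csetjmp>

// Library flavor: setjmp fills the whole jmp_buf and longjmp restores from it.
static std::jmp_buf lib_env;

// Builtin flavor (GCC/Clang extension): a five-word buffer is assumed to be
// enough, because the compiler spills and reloads live values around the
// __builtin_setjmp call site itself.
static void *builtin_env[5];

// Hypothetical helpers that perform the non-local jump. They are kept out of
// line so that each longjmp sits in a different function from its setjmp.
static __attribute__((noinline)) void leave_lib() {
  std::longjmp(lib_env, 1);
}

static __attribute__((noinline)) void leave_builtin() {
  __builtin_longjmp(builtin_env, 1);  // the second argument must be 1
}

// Each helper returns 0 on the straight-line path and 1 when control came
// back through the jump. The frame that called setjmp is still live.
__attribute__((noinline)) int attempt_lib(bool jump) {
  if (setjmp(lib_env) == 0) {
    if (jump) leave_lib();
    return 0;
  }
  return 1;
}

__attribute__((noinline)) int attempt_builtin(bool jump) {
  if (__builtin_setjmp(builtin_env) == 0) {
    if (jump) leave_builtin();
    return 0;
  }
  return 1;
}
```

Calling attempt_lib(true) or attempt_builtin(true) should return 1, and passing false should return 0. Comparing the generated assembly of the two attempt functions is one way to see where the spill/restore code ends up in each case.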
 
 -Hal

> 
> Joerg
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory


