[LLVMdev] Advice on implementing fast per-thread data
Brian Hurt
bhurt at spnz.org
Tue Feb 5 16:24:26 PST 2008
On Tue, 5 Feb 2008, Chris Lattner wrote:
> On Mon, 4 Feb 2008, Brian Hurt wrote:
>> Another possibility, and I'm not sure how to do this in LLVM, would be to
>> sacrifice a register to hold the pointer to the unique per-thread
>> structure. This would be worthwhile to me even on the register-starved
>> x86-32. I suppose I could also just add a "hidden" (compiler-added and
>> -maintained) argument to every function which is the pointer to the
>> per-thread data.
>
> Thread local storage (TLS) on Linux is better than this. Instead of
> sacrificing a GPR, it uses a segment register to reach the TLS area,
> making it very very cheap.
>
>> Using the normal thread-local storage scares me, because I don't know the
>> performance implications.
>
> You should read up about it then. :)
> Start here: http://people.redhat.com/drepper/tls.pdf
>
Thank you. You've just made my life about 3000% easier. Somehow I've
missed __thread; I was thinking of the clunky POSIX threads
implementation.
Playing around a little bit with this, I find that:
    static __thread int i;

    int foo(void) {
        i += 1;
        return i;
    }
compiles to:
    foo:
            pushl   %ebp
            movl    %esp, %ebp
            movl    %gs:i@NTPOFF, %eax
            addl    $1, %eax
            movl    %eax, %gs:i@NTPOFF
            popl    %ebp
            ret
So, other than the segment override, this is no different from accessing a
global variable, which means I don't have to give up a clock cycle on
allocation speed for the common case (actually doing a collection is a
little bit trickier).
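
To illustrate the allocation fast path, here is a minimal sketch of a
per-thread bump allocator built on __thread. It is only an assumption
about how such an allocator might look: the names alloc, alloc_slow,
alloc_ptr, alloc_limit and CHUNK_SIZE are hypothetical, and the slow path
simply calls malloc where a real collector would start a collection.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed chunk size; hypothetical, chosen only for illustration. */
#define CHUNK_SIZE 4096

/* Per-thread allocation state. Each thread gets its own copy, reached
 * through the TLS segment register, so no locking is needed. */
static __thread char *alloc_ptr;   /* next free byte in this thread's chunk */
static __thread char *alloc_limit; /* one past the end of this thread's chunk */

/* Slow path: abandon the current chunk and take a fresh one.
 * A real collector would trigger a collection here instead of
 * calling malloc and leaving the old chunk behind. */
static void *alloc_slow(size_t size)
{
    assert(size <= CHUNK_SIZE);
    char *chunk = malloc(CHUNK_SIZE);
    if (chunk == NULL)
        return NULL;
    alloc_ptr = chunk + size;
    alloc_limit = chunk + CHUNK_SIZE;
    return chunk;
}

/* Fast path: round the size up to 8 bytes, then bump the per-thread
 * pointer if the request fits in the current chunk. */
static void *alloc(size_t size)
{
    size = (size + 7) & ~(size_t)7;
    if (alloc_ptr != NULL && size <= (size_t)(alloc_limit - alloc_ptr)) {
        char *p = alloc_ptr;
        alloc_ptr += size;
        return p;
    }
    return alloc_slow(size);
}
```

In the common case alloc touches only the two thread-local variables, so
with the local-exec TLS model it should cost about the same as bumping a
plain global pointer.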
Brian