[LLVMdev] About thread_local in 3.0
Duncan Sands
baldrick at free.fr
Wed Jul 4 06:58:32 PDT 2012
Hi Jianzhou Zhao, I think it is simple: the JIT doesn't support thread local
storage, and thus aborts when it sees you trying to use a thread local variable.
It would be nicer if it printed a helpful error message though. On my machine
I get
$ lli tl.ll
Cannot allocate thread local storage on this arch!
UNREACHABLE executed at llvm/lib/Target/X86/X86JITInfo.cpp:576!
The new MC-JIT (pass -use-mcjit to lli) should support this one day, but I'm
not sure it does yet. As for the interpreter, it should output an error
message saying it doesn't support thread local storage rather than producing
a random wrong result.
You could open a bug report about the poor error messages. As for thread local
support I think the MC-JIT developers have this on their list of things to do
already.
Ciao, Duncan.
>
> I am using LLVM 3.0, and I have a question about __thread in C and
> thread_local in LLVM IR: the -O1 and -O4 outputs and the final
> link-optimized code behave differently.
>
> ////////////////////////////////////
> The following C code is from the LLVM testcase
> (SingleSource/UnitTests/Threads/2010-12-08-tls.c)
>
> #include <stdio.h>
>
> __thread int a = 4;
>
> int foo (void)
> {
> return a;
> }
>
> int main (void) {
> printf("a is %d\n", foo());
> return 0;
> }
>
> It contains a __thread attribute. Both GCC's output program and
> clang's produce "a is 4", and then return. The command line options
> are the default.
> gcc 2010-12-08-tls.c
> clang 2010-12-08-tls.c
>
> ////////////////////////////////////
> The following code (2010-12-08-tls.O1.ll) is generated by
> clang -c -O1 -emit-llvm 2010-12-08-tls.c -o 2010-12-08-tls.O1.bc
> llvm-dis 2010-12-08-tls.O1.bc
>
> @a = thread_local global i32 4, align 4
> @.str = private unnamed_addr constant [9 x i8] c"a is %d\0A\00", align 1
>
> define i32 @foo() nounwind readonly {
> %1 = load i32* @a, align 4, !tbaa !0
> ret i32 %1
> }
>
> define i32 @main() nounwind {
> %1 = tail call i32 @foo()
> %2 = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds
> ([9 x i8]* @.str, i32 0, i32 0), i32 %1) nounwind
> ret i32 0
> }
>
> declare i32 @printf(i8* nocapture, ...) nounwind
>
> !0 = metadata !{metadata !"int", metadata !1}
> !1 = metadata !{metadata !"omnipotent char", metadata !2}
> !2 = metadata !{metadata !"Simple C/C++ TBAA", null}
>
> If I JIT the code with "lli 2010-12-08-tls.O1.bc", I get
>
> 0 lli 0x091ac1a8
> 1 lli 0x091ac7e7
> 2 0xffffe400 + 0
> 3 lli 0x08f7a103
> llvm::ExecutionEngine::runFunctionAsMain(llvm::Function*,
> std::vector<std::string, std::allocator<std::string> > const&, char
> const* const*) + 1459
> 4 lli 0x0861df9e main + 3374
> 5 libc.so.6 0xb75b7ace __libc_start_main + 254
> Stack dump:
> 0. Program arguments: lli 2010-12-08-tls.O4.bc
> Segmentation fault
>
> If I run the code in the interpreter with "lli --force-interpreter=true
> 2010-12-08-tls.O1.bc", I get
> "a is 24"
>
> ////////////////////////////////////
> The following code (2010-12-08-tls.O4.ll) is generated by
> clang -c -O4 -emit-llvm 2010-12-08-tls.c -o 2010-12-08-tls.O4.bc
> llvm-dis 2010-12-08-tls.O4.bc
>
> @a = thread_local global i32 4, align 4
> @.str = private unnamed_addr constant [9 x i8] c"a is %d\0A\00", align 1
>
> define i32 @foo() nounwind readonly {
> %1 = load i32* @a, align 4, !tbaa !0
> ret i32 %1
> }
>
> define i32 @main() nounwind {
> %1 = load i32* @a, align 4, !tbaa !0
> %2 = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds
> ([9 x i8]* @.str, i32 0, i32 0), i32 %1) nounwind
> ret i32 0
> }
>
> declare i32 @printf(i8* nocapture, ...) nounwind
>
> !0 = metadata !{metadata !"int", metadata !1}
> !1 = metadata !{metadata !"omnipotent char", metadata !2}
> !2 = metadata !{metadata !"Simple C/C++ TBAA", null}
>
> We can see that O4 does inlining, but otherwise its output is the same as
> O1's. I got the same results from the JIT and the interpreter.
>
> ////////////////////////////////////
> The following code (2010-12-08-tls.ld.ll) is generated by
> opt -std-link-opts 2010-12-08-tls.O1.bc -o 2010-12-08-tls.ld.bc
> llvm-dis 2010-12-08-tls.ld.bc
>
> @.str = private unnamed_addr constant [9 x i8] c"a is %d\0A\00", align 1
>
> define i32 @main() nounwind {
> %1 = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds
> ([9 x i8]* @.str, i32 0, i32 0), i32 4) nounwind
> ret i32 0
> }
>
> declare i32 @printf(i8* nocapture, ...) nounwind
>
> We can see that link-opt folded %1 to the constant 4. Now the JIT and the
> interpreter both produce "a is 4", which is the same as the native output
> from clang and gcc. I am not familiar with thread_local or __thread in C,
> and was wondering if this is the expected behavior of thread_local.
> Thanks.
>