[LLVMbugs] [Bug 14309] New: Incorrect optimization of thread_local variables

Fri Nov 9 17:54:46 PST 2012

http://llvm.org/bugs/show_bug.cgi?id=14309

             Bug #: 14309
           Summary: Incorrect optimization of thread_local variables
           Product: new-bugs
           Version: trunk
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: new bugs
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: tbergan at cs.washington.edu
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified

Created attachment 9516
  --> http://llvm.org/bugs/attachment.cgi?id=9516
Minimal C program that demonstrates the bug

I believe there is a bug in the way the optimizer deals with thread_local
variables.  The attached program, test.c, has a thread-local variable "int Foo"
and a global variable "int *Ptr".  The program takes the following steps:

1) The main thread spawns a new thread and waits
2) The new thread writes Foo = 50 and Ptr = &Foo, then signals the main thread
and waits
3) The main thread prints *Ptr, releases the new thread, and exits

The crux of this example is that the main thread obtains a pointer to the new
thread's TLS via "Ptr".  When I compile with gcc, the program prints "50" as
expected.  When I compile with LLVM, the program prints "0".  To demonstrate
the bug, run the following commands with the attached "test.c" (verified with
revision 167568):
$ clang -O3 -lpthread test.c -o test
$ ./test   # prints "Foo: 0"

I'm attaching the following files:
* test.c
* test.0.ll, which was built with "clang -emit-llvm -S -O0 test.c -o test.0.ll"
* test.3.ll, which was built with "clang -emit-llvm -S -O3 test.c -o test.3.ll"

It is pretty clear that "test.3.ll" is an incorrect optimization of
"test.0.ll".  You can see the bug in main(), where LLVM has optimized the load
"*Ptr" into the following instructions:

  ; main() loads its own @Foo.b, not the @Foo.b written by run()
  %.b = load i1* @Foo.b, align 1
  %0 = select i1 %.b, i32 50, i32 0

My guess is that the optimizer does not realize that thread_local addresses are
not constant in the same way that global addresses are constant, since each
thread_local variable actually names N variables, one for each of N running
threads.  Thus, it's not safe to optimize across two accesses of a thread_local
variable unless it can be proven that both accesses will be performed by the
same thread.

In terms of LLVM's design, I've noticed that thread_local variables are
represented in the same way as ordinary global variables (via
llvm::GlobalVariable) except that the "isThreadLocal" flag is true.  This
strikes me as a likely source of confusion, because thread_locals are the one
corner case in which an "llvm::Constant" is not really a "constant" in the
same way as other constants.  This might be related to
http://llvm.org/bugs/show_bug.cgi?id=13720, and perhaps a few other bugs.

-Tom
