Summary: invalid TLS with -relocation-model=pic -regalloc=local
@a = thread_local global i32 0          ; <i32*> [#uses=2]

define void @f(i32* nocapture %c, i32* nocapture %d) nounwind optsize {
        %0 = load i32* @a, align 4              ; <i32> [#uses=1]
        store i32 %0, i32* %c, align 4
        %1 = load i32* @a, align 4              ; <i32> [#uses=1]
        store i32 %1, i32* %d, align 4
        ret void
}

Compiling this with llc -relocation-model=pic -regalloc=local produces

        .byte   0x66; leaq      a@TLSGD(%rip), %rdi; .word      0x6666; rex64
        movq    %rax, -8(%rsp)
        movq    %rcx, -16(%rsp)
        call    __tls_get_addr@PLT

This is invalid. The call to __tls_get_addr must come immediately after the
rex64 prefix, but here two spills have been inserted between them.
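
For comparison, a valid sequence would presumably look something like the
following. The placement of the two spills ahead of the sequence is an
assumption made for illustration. What matters is that nothing separates the
rex64 from the call:

        movq    %rax, -8(%rsp)          # spills moved ahead (assumed placement)
        movq    %rcx, -16(%rsp)
        .byte   0x66; leaq      a@TLSGD(%rip), %rdi; .word      0x6666; rex64
        call    __tls_get_addr@PLT      # directly after the rex64 prefix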

I think this is a more general problem with the local register allocator. It is
inserting spills between two instructions linked by a flag.

