[llvm-dev] [RFC] lld: Dropping TLS relaxations in favor of TLSDESC

Tue Nov 7 18:59:10 PST 2017

Rui Ueyama via llvm-dev <llvm-dev at lists.llvm.org> writes:

> tl;dr: TLSDESC has solved most of the problems with the formerly inefficient
> TLS access models, so I think we can drop TLS relaxation support from lld.
>
> lld's code to handle relocations is a mess; the code consists of a lot of
> cascading "if"s and needs a lot of prior knowledge to understand what it is
> doing. Honestly it is head-scratching and needs serious refactoring. I'm
> trying to simplify it to make it manageable again, and I'm now focusing on
> the TLS relaxations.
>
> Thread-local variables in ELF are complicated. The ELF TLS specification [1]
> defines 4 different access models: General Dynamic, Local Dynamic, Initial
> Exec and Local Exec.
>
> I'm not going into the details of the spec here, but the reason we have
> so many different models for the same feature is that they differ in
> speed, and we have to use the (formerly) slow models when we know
> less about the variables' run-time memory layout at compile-time or
> link-time. So,
> there was a trade-off between generality and performance. For example, if
> you want to use thread-local variables in a dlopen(2)'able DSO, you need to
> choose the slowest model. If a linker knows at link-time that a more
> restricted access model is applicable (e.g. if it is linking a main
> executable, it knows for sure that it is not creating a DSO that will be
> used via dlopen), the linker is allowed to rewrite instructions to load
> thread-local variables to use a faster access model.
>
> What makes the situation more complicated is the presence of a new method
> of accessing thread-local variables. After the ELF TLS spec was defined,
> TLSDESC [2] was proposed and implemented. With that method, General Dynamic
> and Local Dynamic models (that were pretty slow in the original spec) are
> as fast as the much faster Initial Exec model. TLSDESC doesn't have a
> trade-off between dlopen'ability and access speed. According to [2], it
> also reduces the
> size of generated DSOs. So it seems like TLSDESC is strictly a better way
> of accessing thread-local variables than the old way, and the thread-local
> variable's performance problem (that the TLS ELF spec was trying to address
> by defining four different access models and relaxations in between)
> doesn't seem to be a real issue anymore.
>
> lld supports all TLS relaxations as defined by the ELF TLS spec. I accepted
> the patches to implement all these features without thinking hard enough
> about it, but on second thought, that was likely a wrong decision. Being a
> new linker, we don't need to trace the history of the evolution of the ELF
> spec. Instead, we should have implemented whatever makes sense now.
>
> So, I'd like to propose we drop TLS relaxations from lld, including Initial
> Exec → Local Exec. Dropping IE→LE is strictly speaking a degradation, but I
> don't think that is important. We don't have optimizations for much more
> frequent variable access patterns such as locally-accessed variables that
> have GOT slots (for which we can in theory skip the GOT access because the
> GOT slot values are known at link-time), so it is odd that we are only
> serious about
> TLS variables, which are usually much less important. Even if it would turn
> out that we want it after implementing more important relaxations, I'd like
> to drop it for now and reimplement it in a different way later.
>
> This should greatly simplify the code because it not only reduces the
> complexity and amount of the existing code, but also reduces the amount of
> knowledge you need to have to read the code, without sacrificing
> performance of lld-generated files in practice.
>
> Thoughts?

I don't think we can do it.

The main thing we have to keep in mind is that not everyone is using
TLSDESC. In fact, clang doesn't even support -mtls-dialect=gnu2.

If everyone switches to TLSDESC, then I am OK with dropping
optimizations for the old model.

But even with TLSDESC we still need linker relaxations. The TLSDESC idea
solves some of the GD -> IE cost in the case where the .so is not
dlopened, but that is it. Note that AArch64, which is TLSDESC-only, still
has relaxations.
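
To make the relaxations under discussion concrete, here is a rough sketch in
Python of the decision a linker makes when it is allowed to rewrite an access
to a faster model. This is a simplified illustration of the rules in the ELF
TLS spec, not lld's actual code; the names Model, relax, linking_executable
and defined_in_executable are made up for this example.

```python
from enum import Enum


class Model(Enum):
    """The four access models named in the ELF TLS spec."""
    GD = "general-dynamic"
    LD = "local-dynamic"
    IE = "initial-exec"
    LE = "local-exec"


def relax(model, linking_executable, defined_in_executable):
    """Return the model a linker may rewrite `model` to (hypothetical helper).

    linking_executable: True when the output is a main executable, so it
        cannot be loaded later via dlopen.
    defined_in_executable: True when the TLS symbol is defined in the
        executable itself, so its offset from the thread pointer is known
        at link time.
    """
    if not linking_executable:
        # Assumption of this sketch: a DSO may be dlopen'ed, so keep the
        # model the compiler chose.
        return model
    if model is Model.LD:
        # Local Dynamic only refers to the module's own TLS block, which in
        # an executable sits at a fixed offset.
        return Model.LE
    if model is Model.GD:
        # Known offset: LE. Otherwise the symbol lives in a DSO loaded at
        # startup, so IE (one GOT load) is enough.
        return Model.LE if defined_in_executable else Model.IE
    if model is Model.IE and defined_in_executable:
        return Model.LE
    return model
```

For example, relax(Model.GD, True, False) gives Model.IE, and the same call
with linking_executable=False leaves the access as Model.GD. The same kind of
rewrite (to IE or LE) is what a TLSDESC sequence is relaxed to when linking
an executable.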

So I am strongly against removing either non-TLSDESC support or support
for the relaxations.

Cheers,
Rafael