[Lldb-commits] [PATCH] D65282: ObjectFileELF: permit thread-local sections with overlapping file addresses

Fri Jul 26 03:06:37 PDT 2019

labath marked 2 inline comments as done.
labath added a comment.

In D65282#1602244 <https://reviews.llvm.org/D65282#1602244>, @MaskRay wrote:

> > Are you referring to the "image lookup" command specifically, or is it a more general question about the internals of lldb too?
>
> Both:) This patch doesn't change the `Address: a.o[0x00001010] (a.o.PT_LOAD[0]..tdata + 0)` output so I was puzzled what this patch intends to do.

What do you mean by "doesn't change"? After this patch the addresses always resolve to the .data section..

> 
> 
>> However, I can see how it might be interesting to be able to see the initial value of a thread local variable much like we can display the initial value of a global variable without launching a process. For this case, a flag to Section::ContainsFileAddress saying "yes, I want to look up in thread-local sections now" would suffice, but I don't know if this is the only use case...
> 
> Yes, inspecting the initial value of a thread-local variable is a use case. To that end, can this be done by introducing another member variable instead of overloading `m_sections_up` with a new purpose (adding `PT_TLS`)? If PT_TLS is recorded in a different variable, the change below can be deleted.

I think that would be pretty significant departure from the current design of lldb. Lldb expects that the "section list" of a module will contain all of the module's sections (and before I started messing with these functions, it did). This includes non-loadable sections like .debug_info et al. While one could concieve a world where tls sections are in a special "tls" section list, I am not sure this is actually useful -- if we're going to think of the tls addresses as address spaces, then its reasonable to have more than two address spaces one day (there are people interested in that), and so we couldn't have a fixed set of section lists.

> 
> 
>    bool Section::ContainsFileAddress(addr_t vm_addr) const {
>      const addr_t file_addr = GetFileAddress();
>   -  if (file_addr != LLDB_INVALID_ADDRESS) {
>   +  if (file_addr != LLDB_INVALID_ADDRESS && !IsThreadSpecific()) {
> 
> 
> (An adjacent pair of PT_LOAD segments can load the same file contents, e.g. PT_LOAD `[0x150, 0x1234)` and `[0x1234, 0x1800)` will transform to mmap calls with ranges `[0, 0x2000)` and `[0x1000, 0x2000)` at runtime if the runtime page size = 0x1000. They share one page in the file. If you ask what a specific offset in the file is mapped to, there can be multiple PT_LOAD segments (physical -> VMA is not unique). Fortunately the reverse mapping VMA -> physical offset can be treated as unique in practice (`[p_vaddr,p_vaddr+p_memsz)` ranges do not overlap).)

I am not 100% what you mean by this, but I think there's some confusion about names of things here. In lldb terms, a "file address" is the "load address, as it is written in the file. It is not the "physical offset within the file", which lldb calls "file offset". Unfortunately, this terminology has caused a lot of confusion in the past, but I don't know what would be the best way to resolve this. How does lld call these things? I guess there's less confusion there as lld does not have to care about real, memory, load addresses...

================
Comment at: lit/Modules/ELF/PT_LOAD-overlap-PT_TLS.yaml:68
+    Flags:           [ SHF_ALLOC, SHF_WRITE, SHF_TLS ]
+    Address:         0x1010
+    AddressAlign:    0x4
----------------
MaskRay wrote:
> labath wrote:
> > MaskRay wrote:
> > > > .tbss 0x1000 NOBITS
> > > >
> > > > .tdata 0x1010 PROGBITS
> > > 
> > > Move .tdata before .tbss (0xff0) to make the example more realistic?
> > > 
> > > .tdata has a larger address than .tbss. I think this is impossible in ld.bfd, but you can make .tbss go before .tdata with a broken lld linker script.
> > I'll change the order here. The thing I was trying to test here is that addresses in .data are found regardless of whether it comes before or after a tls section. I think already having two TLS segments is somewhat unrealistic, and I could make it more real by splitting this into two tests, but it did not seem necessary, as lldb does not care about details like this.
> Multiple PT_TLS is unrealistic. None of glibc/musl/FreeBSD rtld supports more than 1 PT_TLS (they will just pick the last one and ignore the others).
Ok, so let's go for two tests then.

================
Comment at: lit/Modules/ELF/PT_LOAD-overlap-PT_TLS.yaml:62
+    Flags:           [ SHF_ALLOC, SHF_WRITE ]
+    Address:         0x1000
+    AddressAlign:    0x4
----------------
MaskRay wrote:
> `.data = .tbss = 0x1010` is a more realistic scenario.
> 
> Normally, a SHT_PROGBITS section may overlap with a SHT_NOBITS section, but two SHT_PROGBITS sections do not overlap (ld has an on-by-default check `ld --check-sections`). Linkers allocate file bytes for SHT_PROGBITS sections so their occupied bytes cannot be reused by other sections (without fixing addresses with a linker script).
Interesting. I can that easily, but I'm wondering, do you know the reason for that? Is it just how it falls out of the default linker processing of things, or would something actually break if I assigned identical addresses to two SHT_PROGBITS sections?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65282/new/

https://reviews.llvm.org/D65282