[PATCH] D67390: [LLDB][ELF] Load both, .symtab and .dynsym sections

Konrad Wilhelm Kleine via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 30 04:26:50 PDT 2019


kwk marked 8 inline comments as done.
kwk added a comment.

In D67390#1685484 <https://reviews.llvm.org/D67390#1685484>, @labath wrote:

> This looks fairly ok to me, but it could use a more explicit test of the symbol uniqueing code. Right, now I believe the two tests you added check that the symbols are _not_ uniqued. (They're also a bit hard to follow due to the whole
>
>   objcopy business). Could you create a test with a yaml file which will contain various edge cases relevant to this code. I'm thinking of stuff like "a dynsym and a symtab symbol at the same address, but a different name", "a dynsym and symtab symbols with identical names, but different addresses", etc. Then just run "image dump symtab" on that file to check what we have parsed?


I'll give my best to implement this today.

> I am also surprised that you weren't able to just use a Section* (instead of the name) for the uniqueing. I'd expect that all symbols (even those from the separate object file) should be resolved to the sections in the main object. I see that this isn't the case, but I am surprised that this isn't causing any problems. Anyway, as things seem to be working as they are now, we can leave that for another day.

Okay.

> In D67390#1685313 <https://reviews.llvm.org/D67390#1685313>, @kwk wrote:
> 
>> Before I used the bare symbol name with stripped `@VERSION` suffix. Now I've changed the implementation of `NamedELFSymbol` to include the `@VERSION` suffix and the tests pass.
> 
> 
> Interesting. I'm pretty sure that the symbol count is irrelevant for that test (it just wants to know if it is there), so we can change that if needed. However, having the uniqueing include the `@VERSION` sounds right to me, so if that makes the test happy too then, great.

Yes, I hoped so. Thank you. Please await another revision of this patch with the tests requested.



================
Comment at: lldb/source/Plugins/ObjectFile/ELF/ELFHeader.cpp:371-373
+  r.st_name = st_name;
+  return elf::ELFSymbol::operator==(r) &&
+         st_name_string == rhs.st_name_string;
----------------
labath wrote:
> kwk wrote:
> > clayborg wrote:
> > > I would almost just manually compare only the things we care about here. Again, what about st_shndx when comparing a symbol from the main symbol table and one from the .gnu_debugdata symbol table. Are those section indexes guaranteed to match? I would think not. 
> > @clayborg I explicitly only compare what we care about and therefore always set the name index to  be the same.
> I'll echo @clayborg here. This business with copying the ELFSymbol and clearing some fields is confusing. Do you even need the ELFSymbol::operator== for anything? If not I'd just delete that, and have the derived version compare all fields.
> 
> Also, it would be nice if the operator== and hash function definitions were next to each other. Can you just forward declare the std::hash stuff in the header, and have the implementation be next to this?
> I'll echo @clayborg here. This business with copying the ELFSymbol and clearing some fields is confusing.

I've cleared up the documentation now and it is exactly the way I like it. Every entity deals with it's own business (respects its own fields when comparing). I find it pretty dirty to compare fields from the base struct in a derived one. The way I compare fields from the base struct is minimally invasive.

> Do you even need the ELFSymbol::operator== for anything?

Yes, when you want to compare ELFSymbols. I know that I don't do that explicitly but I the function only deals with fields from the entity itself and they don't spill into any derived structure (with the exception of explicitly ignoring fields).

> If not I'd just delete that, and have the derived version compare all fields.

No because I call it explcitly from the derived NamedELFSymbol.

> Also, it would be nice if the operator== and hash function definitions were next to each other. Can you just forward declare the std::hash stuff in the header, and have the implementation be next to this?

I've found a compromise that is even more appealing to me. The ELFSymbol and NamedELFSymbol structs now have a `hash` function which contains the implementation next to the one of `operator==()`. This `hash` is called in the specialization which remains in the same location as before; the reason being that I didn't find a way do define something in the `std::` namespace when I'm in the `elf::` namespace.




================
Comment at: lldb/source/Plugins/ObjectFile/ELF/ELFHeader.h:446
+    std::size_t h2 = std::hash<const char *>()(s.st_name_string.AsCString());
+    std::size_t h3 = std::hash<const char *>()(s.st_section_name_string.AsCString());
+    return llvm::hash_combine(h1, h2, h3);
----------------
jankratochvil wrote:
> I find better to rather define `std::hash<ConstString>` (or provide `ConstString::Hasher` which I do in my DWZ patchset).
Once your DWZ patchset arrives, please let me know and we can change it.


================
Comment at: lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp:2205
+        needle.st_section_name_string = ConstString(symbol_section_sp->GetName());
+    if (unique_elf_symbols.find(needle) == unique_elf_symbols.end()) {
+      symtab->AddSymbol(dc_symbol);
----------------
labath wrote:
> jankratochvil wrote:
> > labath wrote:
> > > something like `if (unique_elf_symbols.insert(needle).second)` would be more efficient, as you don't need to mess with the map twice.
> > I would unconditionally add all the symbols to `std::vector<NamedElfSymbol> unique_elf_symbols` (with `unique_elf_symbols.reserve` in advance for the sym of `.symtab`+`.dynsym` sizes) and then after processing both `.symtab` and `.dynsym` and can `llvm::sort(unique_elf_symbols)` and add to `symtab` only those which are unique. I believe it will be much faster, one could benchmark it.
> sounds like a good idea.
@jankratochvil @labath yes, this sound like a good idea for *performance* improvement but honestly, I need to get this patch done first in order to make any progress with minidebuginfo. I hope you don't mind when I take this task to another patch, okay?


================
Comment at: lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp:2650
+    // A unique set of ELF symbols added to the symtab
+    UniqueElfSymbolColl unique_elf_symbols({});
     SectionList *section_list = module_sp->GetSectionList();
----------------
labath wrote:
> what's wrong with the old-fashioned `UniqueElfSymbolColl unique_elf_symbols;` ?
Apparently nothing now :) . But before I had troubles because of some deleted default constructor. Thanks for spotting this and bringing it back to my attention.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67390/new/

https://reviews.llvm.org/D67390





More information about the llvm-commits mailing list