[Lldb-commits] [PATCH] D76758: Augment lldb's symbol table with external symbols in Mach-O's dyld trie

Greg Clayton via Phabricator via lldb-commits lldb-commits at lists.llvm.org
Thu Mar 26 12:29:31 PDT 2020


clayborg added a comment.

So I know the mach-o symbol table parsing code is a mess already, but it seems like this patch can be simpler if we make a std::set<lldb:addr_t> at the top of ObjectFileMachO::ParseSymtab() and every time we add a symbol that has a valid address value, add the file addr to this set. This will avoid the need to do any sorting. This std::set can be used to not add LC_FUNCTION_START entries that already have a symbol at the address, and it can be used to only add TrieEntry values that have symbols. Thoughts?



================
Comment at: lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp:1924
+  // symbol table, and this entry should not be added.
+  void SymbolAlreadyPresent() { flags = LLDB_INVALID_ADDRESS; }
+  bool IsSymbolAlreadyPresent() {
----------------
 The only bits that are used in this field are:

```
enum {
  EXPORT_SYMBOL_FLAGS_KIND_MASK = 0x03u,
  EXPORT_SYMBOL_FLAGS_WEAK_DEFINITION = 0x04u,
  EXPORT_SYMBOL_FLAGS_REEXPORT = 0x08u,
  EXPORT_SYMBOL_FLAGS_STUB_AND_RESOLVER = 0x10u
};

enum ExportSymbolKind {
  EXPORT_SYMBOL_FLAGS_KIND_REGULAR = 0x00u,
  EXPORT_SYMBOL_FLAGS_KIND_THREAD_LOCAL = 0x01u,
  EXPORT_SYMBOL_FLAGS_KIND_ABSOLUTE = 0x02u
};
```

So why not just set the highest bit and avoid clobbering all of the other flags?:

```
#define EXPORT_SYMBOL_FLAGS_LLDB_ALREADY_PRESENT 1ull << 63
void SymbolAlreadyPresent() { flags |= EXPORT_SYMBOL_FLAGS_LLDB_ALREADY_PRESENT; }
```


================
Comment at: lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp:1926
+  bool IsSymbolAlreadyPresent() {
+    if (flags == LLDB_INVALID_ADDRESS)
+      return true;
----------------
jingham wrote:
> JDevlieghere wrote:
> > `return flags == LLDB_INVALID_ADDRESS`
> This sort of change has the downside that you can't break when flags == LLDB_INVALID_ADDRESS without adding a condition.  That seems in this case like a condition you are likely to want to break on, and given this looks like a function that gets called a lot, it's probably good to pay the cost of a condition.
> 
> I'm not sure I'm in favor of this sort of rewrite.  It just saves one line, and isn't particularly easier to read.  Does it have some benefit I'm missing?  
So is "flags" initially used just after parsing, but before we mark TrieEntry values as already present? These flags mean something when we first parse them. I would rather add an extra bool entry to this structure, since they don't stay around for long.


================
Comment at: lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp:1926
+  bool IsSymbolAlreadyPresent() {
+    if (flags == LLDB_INVALID_ADDRESS)
+      return true;
----------------
clayborg wrote:
> jingham wrote:
> > JDevlieghere wrote:
> > > `return flags == LLDB_INVALID_ADDRESS`
> > This sort of change has the downside that you can't break when flags == LLDB_INVALID_ADDRESS without adding a condition.  That seems in this case like a condition you are likely to want to break on, and given this looks like a function that gets called a lot, it's probably good to pay the cost of a condition.
> > 
> > I'm not sure I'm in favor of this sort of rewrite.  It just saves one line, and isn't particularly easier to read.  Does it have some benefit I'm missing?  
> So is "flags" initially used just after parsing, but before we mark TrieEntry values as already present? These flags mean something when we first parse them. I would rather add an extra bool entry to this structure, since they don't stay around for long.
```
return (flags & EXPORT_SYMBOL_FLAGS_LLDB_ALREADY_PRESENT) != 0;
```


================
Comment at: lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp:1965
+  // Terminal node -- end of a branch, possibly add this to
+  // the symbol table or resoler table.
   const uint64_t terminalSize = data.GetULEB128(&offset);
----------------
s/resoler/resolver/


================
Comment at: lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp:2503-2516
     ConstString text_segment_name("__TEXT");
     SectionSP text_segment_sp =
         GetSectionList()->FindSectionByName(text_segment_name);
     if (text_segment_sp) {
       const lldb::addr_t text_segment_file_addr =
           text_segment_sp->GetFileAddress();
       if (text_segment_file_addr != LLDB_INVALID_ADDRESS) {
----------------
Maybe we should run this before ParseTrieEntries and pass text_segment_file_addr in as an argument and add this to the offset as we parse the TrieEntry objects?


================
Comment at: lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp:4490
 
+  // Look through the current symbol table, if any symbols match
+  // an entry in the external_sym_trie_entries vector, mark it
----------------
Might be easier to create a std::set<lldb::addr_t> at the top of the ObjectFileMachO::ParseSymtab() function. Anyone who adds a symbol, would add the file address to that set. Then we wouldn't have to sort these entries, we would just iterator through them. We must do something like this for LC_FUNCTION_STARTS already right?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76758/new/

https://reviews.llvm.org/D76758





More information about the lldb-commits mailing list