[PATCH] D131309: [ELF] Add ability to get a symbol by name from the hash table section
    James Henderson via Phabricator via llvm-commits 
    llvm-commits at lists.llvm.org
       
    Mon Aug 15 01:25:21 PDT 2022
    
    
  
jhenderson added inline comments.
================
Comment at: llvm/include/llvm/Object/ELF.h:505-516
+  if (Sec.sh_type != ELF::SHT_HASH)
+    return createError("invalid sh_type for hash table, expected SHT_HASH");
+  Expected<Elf_Shdr_Range> SectionsOrError = sections();
+  if (!SectionsOrError)
+    return SectionsOrError.takeError();
+
+  auto SymTabOrErr = object::getSection<ELFT>(*SectionsOrError, Sec.sh_link);
----------------
All these checks still need test cases.
================
Comment at: llvm/include/llvm/Object/ELF.h:524-526
+  if (Sec.sh_offset + Sec.sh_size >= Buf.size())
+    return createError("section " + getSecIndexForError(*this, Sec) +
+                       "has invalid sh_offset: " + Twine(Sec.sh_offset));
----------------
This is certainly part of what is needed, but it doesn't address the issue of if the sh_size is too small for the hash table to be valid (e.g. it is too small for the nchain or nbucket fields, or the size derived from this information is different to sh_size).
================
Comment at: llvm/include/llvm/Object/ELF.h:534-535
+  const uint32_t NBucket = HashTab->nbucket;
+  const auto Bucket = HashTab->buckets();
+  const auto Chain = HashTab->chains();
+
----------------
I would very much prefer not to use `auto` here. It's not obvious what the type of Bucket and Chain are.
================
Comment at: llvm/include/llvm/Object/ELF.h:530
+  for (uint32_t I = Bucket[Hash % NumBuckets]; I != 0; I = Chain[I]) {
+    if (StrTab.drop_front(SymTab[I].st_name).data() == Name)
+      return &SymTab[I];
----------------
jhenderson wrote:
> `I` needs validating to ensure it doesn't fall outside the range of `SymTab`.
Not sure I see where this check is made, even though it's marked as done?
================
Comment at: llvm/unittests/Object/ELFObjectFileTest.cpp:823-825
+    Flags:           [ SHF_ALLOC ]
+    Address:         0x100
+    Link:            .dynsym
----------------
`SHF_ALLOC` flags and `Link: .dynsym` are implicit for SHT_HASH sections in YAML, so you can omit those two fields for simplicity.
================
Comment at: llvm/unittests/Object/ELFObjectFileTest.cpp:826
+    Link:            .dynsym
+    AddressAlign:    0x8
+    Bucket:          [ 1 ]
----------------
I think the `AddressAlign` field is unnecessary? (Also for the other sections)
================
Comment at: llvm/unittests/Object/ELFObjectFileTest.cpp:831-833
+    Flags:           [ SHF_ALLOC ]
+    Address:         0x200
+    Link:            .dynstr
----------------
Similar to above, `SHF_ALLOC` and `Link: .dynstr` are implicit, so don't need specifying.
================
Comment at: llvm/unittests/Object/ELFObjectFileTest.cpp:842-845
+    Flags:           [ SHF_WRITE, SHF_ALLOC ]
+    Address:         0x2000
+    AddressAlign:    0x4
+    Content:         DEADBEEFDEADBEEF
----------------
I think it's probably irrelevant to the test what the contents/address of the section is: I don't believe anything validates that the symbols appear in the claimed section. You could even just use SHN_ABS dynamic symbols, if you wanted to avoid the question entirely. Removing these details would improve the signal to noise ratio in the setup data.
================
Comment at: llvm/unittests/Object/ELFObjectFileTest.cpp:864
+
+  auto HashOrErr = Elf.getSection(/* SHT_HASH */ 1);
+  ASSERT_THAT_EXPECTED(HashOrErr, Succeeded());
----------------
Nit: the preferred style for these comments is as inline - clang-format recognises this specific syntax and formats it slightly differently to other inline comments. Applies below too.
================
Comment at: llvm/unittests/Object/ELFObjectFileTest.cpp:867
+
+  auto StrTabOrErr = Elf.getSection(/* SHT_STRTAB */ 3);
+  ASSERT_THAT_EXPECTED(StrTabOrErr, Succeeded());
----------------
There's a lot of unhelpful `auto` in this test, which I don't think really conforms to LLVM's style guide.
================
Comment at: llvm/unittests/Object/ELFObjectFileTest.cpp:871
+  ASSERT_THAT_EXPECTED(ContentsOrErr, Succeeded());
+  auto StrTab = StringRef(ContentsOrErr->begin(), ContentsOrErr->size());
+
----------------
Seems simpler?
================
Comment at: llvm/unittests/Object/ELFObjectFileTest.cpp:940
+  // Ensure that an invalid string table index is properly handled.
+  typename ELF64LE::Sym *Sym = const_cast<typename ELF64LE::Sym *>(*SymOrErr);
+  Sym->st_name = 9999;
----------------
Rather than const_cast here, you should be able to use an `StName` field in the YAML. See https://github.com/llvm/llvm-project/blob/main/llvm/test/tools/yaml2obj/ELF/symbol-name.yaml for examples.
================
Comment at: llvm/unittests/Object/ELFObjectFileTest.cpp:943
+  auto BadIndexErr = Elf.getSymbolFromHashTable(**HashOrErr, "x");
+  ASSERT_THAT_EXPECTED(BadIndexErr, Failed());
+
----------------
Use `FailedWithMessage` to check the error message contents. Same elsewhere.
================
Comment at: llvm/unittests/Object/ELFObjectFileTest.cpp:945
+
+  // Ensure that an invalid symbol table size in properly handled.
+  typename ELF64LE::Shdr *SymTab =
----------------
================
Comment at: llvm/unittests/Object/ELFObjectFileTest.cpp:946-947
+  // Ensure that an invalid symbol table size in properly handled.
+  typename ELF64LE::Shdr *SymTab =
+      const_cast<typename ELF64LE::Shdr *>(*SymTabOrErr);
+  SymTab->sh_size = 9999;
----------------
You can use the .symtab's `ShOffset` field in the YAML to force this without needing the const_cast.
================
Comment at: llvm/unittests/Object/ELFObjectFileTest.cpp:952
+
+  // Ensure that an invalid hash table size in properly handled.
+  typename ELF64LE::Shdr *HashTab =
----------------
================
Comment at: llvm/unittests/Object/ELFObjectFileTest.cpp:955
+      const_cast<typename ELF64LE::Shdr *>(*HashOrErr);
+  HashTab->sh_size = 9999;
+  auto BadHashErr = Elf.getSymbolFromHashTable(**HashOrErr, "x");
----------------
You can use the .hash's `ShOffset` field in the YAML to force this without needing the const_cast.
================
Comment at: llvm/unittests/Object/ELFObjectFileTest.cpp:878
+    ASSERT_THAT_EXPECTED(FoundSymNameOrErr, Succeeded());
+    ASSERT_TRUE(*FoundSymNameOrErr == SymbolName);
+  }
----------------
yaxunl wrote:
> I am wondering whether it is useful to add a test for getting symbol values here.
I'm in two minds about it. However, if preferring to test just the symbol names, the other symbol properties can be omitted to simplify the YAML.
Repository:
  rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131309/new/
https://reviews.llvm.org/D131309
    
    
More information about the llvm-commits
mailing list