[PATCH] D131309: [ELF] Add ability to get a symbol by name from the hash table section

James Henderson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Aug 12 01:36:50 PDT 2022


jhenderson added a comment.



In D131309#3715793 <https://reviews.llvm.org/D131309#3715793>, @jhuber6 wrote:

> In D131309#3715186 <https://reviews.llvm.org/D131309#3715186>, @jhenderson wrote:
>
>> I think you might be reinventing the wheel here. llvm-readobj already has code for printing the symbols derived from the hash table - see `printHashTableSymbols`. Can you reuse that code? Obviously, this might require refactoring some of the llvm-readobj logic, so that it is in libObject rather than the tool specifically.
>
> The usage is somewhat different, as far as I can tell the `llvm-readobj` implementation is only concerned with printing all the symbols in the hash table. This function is intended to be used to look up a specific symbol if we already know the name.

Oops, yes, I see. Not sure how I missed that.

Could you clarify the use-case a bit more, please? The hash table is primarily for dynamic linking purposes, and consequently section headers aren't guaranteed to exist when doing this kind of lookup, with the dynamic tags like DT_HASH being used instead.



================
Comment at: llvm/include/llvm/Object/ELF.h:518-519
+
+  const uint32_t *HashTab =
+      reinterpret_cast<const uint32_t *>(base() + Sec.sh_offset);
+  const Elf_Sym *SymTab =
----------------
You need to make sure that this value is within the ELF, and that there's enough size for the hash table. You should also validate that the sh_size is large enough to fit a valid hash table in, so that you don't try to look for e.g. the nbuckets field if the size is 0 or the Bucket or Chain arrays if the size is too small.


================
Comment at: llvm/include/llvm/Object/ELF.h:520-521
+      reinterpret_cast<const uint32_t *>(base() + Sec.sh_offset);
+  const Elf_Sym *SymTab =
+      reinterpret_cast<const Elf_Sym *>(base() + (*SymTabOrErr)->sh_offset);
+  StringRef StrTab = *StrTabOrErr;
----------------
Similarly, you should validate that the symbol table fits within the ELF.


================
Comment at: llvm/include/llvm/Object/ELF.h:529
+
+  for (uint32_t I = Bucket[Hash % NumBuckets]; I != 0; I = Chain[I]) {
+    if (StrTab.drop_front(SymTab[I].st_name).data() == Name)
----------------
I'd use `STN_UNDEF` rather than 0, since that's what the spec uses.


================
Comment at: llvm/include/llvm/Object/ELF.h:530
+  for (uint32_t I = Bucket[Hash % NumBuckets]; I != 0; I = Chain[I]) {
+    if (StrTab.drop_front(SymTab[I].st_name).data() == Name)
+      return &SymTab[I];
----------------
`I` needs validating to ensure it doesn't fall outside the range of `SymTab`.


================
Comment at: llvm/unittests/Object/ELFObjectFileTest.cpp:812
+// using its associated name.
+TEST(ELFObjectFileTest, SymbolFromHashTableTest) {
+  SmallString<0> Storage;
----------------
Your testing needs to be expanded to cover the various failure cases that you check for in your function.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131309/new/

https://reviews.llvm.org/D131309



More information about the llvm-commits mailing list