[PATCH] D47414: [llvm-objcopy] Fix null symbol handling

James Henderson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 29 03:16:06 PDT 2018


jhenderson added a comment.

I've been thinking about the structure a lot here. It seems to me that the right thing to do IS to have a null symbol that `getSymbolByIndex()` returns. If a user has explicitly asked for the symbol at index 0, who are we to say they can't have it? As far as I can see, there is nothing in the ELF standard that prevents us having one.

Indeed, this is a wider problem - our internal implementation of the `SymbolTableSection` needs to not expose that we don't have a real null symbol. This includes things like how the size is calculated, how to iterate over things, whether or not it is empty etc. If there is a need to explicitly distinguish the null symbol from other symbols on a per-symbol basis, we should have something to identify it as such (e.g. Index == 0).

I know this is different from how the section header table works, and I'm beginning to wonder whether we made the wrong choice there too.

This then brings up the question, how do we prevent the symbol from being stripped? Unlike @alexshap, I'd prefer to keep the empty() check in the symbol table. Perhaps the definition of empty() should be "no symbols other than the null symbol"? In other words `Symbols.size() == 1`. The `removeSymbols` predicate would explicitly not remove the null symbol, or better yet, `SymbolTableSection::removeSymbols` (and probably also `updateSymbols`) start at the second symbol, rather than the first. What do you guys think?



================
Comment at: test/tools/llvm-objcopy/strip-all-and-keep-symbol.test:61
 #CHECK-NEXT:  Symbol {
+#CHECK-NEXT:    Name:
+#CHECK-NEXT:    Value: 0x0
----------------
Could you add something here (or in a different test) to show that we haven't given the null symbol a real name, please?


================
Comment at: tools/llvm-objcopy/Object.cpp:121
+  // We set the first byte of the string table to zero so that we ensure the
+  // zero symbol points to a '\0' character.
+  Buf[0] = 0;
----------------
zero symbol -> null symbol

It's worth pointing out that the standard requires the first byte to be a null character, so maybe this should be added as a string in the StringTableSection `initialize` method or constructor instead? That will also ensure that the size is correct, which could affect the behaviour of `assignOffsets` etc.


================
Comment at: tools/llvm-objcopy/Object.cpp:170
 void SymbolTableSection::assignIndices() {
-  uint32_t Index = 0;
+  uint32_t Index = 1;
   for (auto &Sym : Symbols)
----------------
Comment here please, to explain why we start from 1.


================
Comment at: tools/llvm-objcopy/Object.cpp:1142
+  if (Obj.SymbolTable)
+    Obj.SymbolTable->Size += Obj.SymbolTable->EntrySize;
+
----------------
It feels to me like we should actually have a virtual .size() method on the section base class (which will be overriden to return `Symbols.size() + 1` in this case), which wraps this sort of thing, hiding the details from assignOffsets (or its caller). Alternatively, can't this be done in the symbol table constructor/`initialize` method?


Repository:
  rL LLVM

https://reviews.llvm.org/D47414





More information about the llvm-commits mailing list