[Lldb-commits] [PATCH] D83881: [lldb/COFF] Remove strtab zeroing hack

Martin Storsjö via Phabricator via lldb-commits lldb-commits at lists.llvm.org
Wed Jul 15 11:43:09 PDT 2020


mstorsjo added a comment.

In D83881#2153687 <https://reviews.llvm.org/D83881#2153687>, @amccarth wrote:

> Yes, getting rid of this hack looks like a good idea.  If it was actually necessary, there should have been a test on it, and the comments should have been clearer.


Well in general, I wouldn't agree with that logic - especially if the test coverage for PECOFF in lldb has been weak to begin with. However I do agree with the conclusion that we can try to get rid of it.

So with COFF symbol tables, you can either have a <= 8 chars symbol name embedded in the struct, or have an offset into the string table (where the first 4 bytes of the string table contains the length of the string table). Now the existing code seems to imply that one potentially could have a symbol with an all-zeros symbol name field (offset), which according to this code should be interpreted as an offset into the string table (but without the current hack would end up interpreting the binary size field as text).

I haven't heard of such symbol table entries, and `COFFObjectFile::getSymbolName`, https://github.com/llvm/llvm-project/blob/master/llvm/lib/Object/COFFObjectFile.cpp#L994-L997, seems to just blindly call `COFFObjectFile::getString()`, https://github.com/llvm/llvm-project/blob/master/llvm/lib/Object/COFFObjectFile.cpp#L980-L986, which doesn't seem to have a special case for that. So if there are object files out there with this pattern, COFFObjectFile wouldn't be able to handle them correctly either.

So with that in mind, I agree that it probably is ok to remove this hack.



================
Comment at: lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp:642
         DataExtractor symtab_data =
             ReadImageData(m_coff_header.symoff, symbol_data_size + 4);
         lldb::offset_t offset = symbol_data_size;
----------------
amccarth wrote:
> The `+4` at the end of this expression is from the same patch.  I wonder if it was an attempt to make space for the four bytes of zeros at offset 0 that you're eliminating?
> 
> I suggest removing the `+4` and trying the tests again unless it's obvious to you why it's still necessary.  The comment above seems like it might be trying to explain it, but that comment came later.
No, the `+4` here was present before 0076e71592a6a (if viewing that commit, view it with a bit more than the default git context size and you'll find "// Include the 4 bytes string table size at the end of the symbols" existing already before that.

The +4 here can't be eliminated; without it, the `const uint32_t strtab_size = symtab_data.GetU32(&offset)` two lines below would be out of range. So this first reads the symbol table and the 4 byte size of the string table, and if the string table turns out to be nonzero in size, it also loads a separate DataExtractor with the string table contents, with the two DataExtractors overlapping for that size field. It doesn't have anything to do with overwriting the start, just with the code layout in general.


================
Comment at: lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp:672
                 break;
               symbol_name.assign(symbol_name_cstr, sizeof(symbol.name));
             }
----------------
Kind of unrelated question: Does this treat the assigned symbol as always 8 bytes long, or does it scan the provided 8 bytes for potential trailing terminators? Object/COFFObjectFile has got a bit of extra logic to distinguish between those cases: https://github.com/llvm/llvm-project/blob/master/llvm/lib/Object/COFFObjectFile.cpp#L999-L1004


================
Comment at: lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp:689
               i += symbol.naux;
               offset += symbol_size;
             }
----------------
Unrelated to the patch at hand: I believe this should be `offset += symbol.naux * symbol_size;`. I could try to make a patch to exercise that case at some later time...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83881/new/

https://reviews.llvm.org/D83881





More information about the lldb-commits mailing list