[llvm-commits] [lld] Support reading archive libraries
Shankar Easwaran
shankare at codeaurora.org
Fri Nov 2 15:22:43 PDT 2012
Hi Michael,
Thanks for the review.
> Needs to follow llvm naming standard, and Archive is redundant.
> http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly
>
> K_GNU,
> K_BSD,
> K_COFF
Will change it accordingly.
> + child_iterator begin_archiveidx() const;
> This is never defined.
Will remove this.
>> +
>> symbol_iterator begin_symbols() const;
>> symbol_iterator end_symbols() const;
>>
>> @@ -136,6 +148,7 @@
>> private:
>> child_iterator SymbolTable;
>> child_iterator StringTable;
>> + Kind _kind;
> Follow llvm naming conventions.
Sure.
>
>
> + cl::opt<bool> ArchiveIndex("archive-index",
> + cl::desc("Print the archive index"));
> My version of nm names this --print-armap.
Will change this.
+ if (ArchiveIndex) {
+ outs() << "archive index" << "\n";
> My version of nm prints this with a capital A.
>
>> + for (object::Archive::symbol_iterator i = a->begin_symbols(), e = a->end_symbols(); i != e; ++i) {
> 80 col.
Will change to capital A, and align to 80 columns.
> + + // If the archive is a GNU archive, then the string table would
> have + // the name and /, if the filename exists in the string table.
> it would only + // exist in the string table if the filename is a long
> filename
> I don't understand what this comment is trying to say. Are you refering to all
> names in a GNU archive ending with /?
Yes.
>> + if (Parent->kind() == kindGNUArchive) {
>> + StringRef::size_type end = StringRef(addr).find('/');
>> + Result = StringRef(addr, end);
>> + }
>> + else {
> No newline before else.
Sure.
>
>> + Result = addr;
>> + }
>> return object_error::success;
>> } else if (name.startswith("#1/")) {
>> APInt name_size;
>> @@ -187,15 +197,36 @@
>> child_iterator i = begin_children(false);
>> child_iterator e = end_children();
>>
>> - if (i != e) ++i; // Nobody cares about the first member.
>> - if (i != e) {
> There should be a comment here explaining that there are different formats and
> how we detect the difference.
Ok.
>> + StringRef name;
>> + if ((ec = i->getName(name)))
>> + return;
>> + if (name == "/") {
>> + // If the archive format is a GNU archive
>> + // the first member is important, it would be the symbol table
> This is wrong. The first member in COFF is also /.
The format which Archive was handling, was already handling COFF I
believe. I have added changes mainly
for reading GNU archives.
>
>> + _kind = kindGNUArchive;
>> SymbolTable = i;
>> - ++i;
>> - }
>> - if (i != e) {
>> StringTable = i;
> Why is StringTable set here? The string table and SymbolTable aren't the same
> in GNU ar.
It is, there is a seperate string table, only if there are long filenames.
>
>> + if (i != e) ++i;
>> + if ((ec = i->getName(name)))
>> + return;
>> + if (name == "//")
>> + StringTable = i;
> As far as I can tell, GNU ar does not require / and // to be the first and
> second members. It would be great if we could find some form of spec for this.
I cant find the spec for the GNU ar, but as far as I tested, they
appeared consecutively.
>> + } else if (name == "__.SYMDEF") {
>> + _kind = kindBSDArchive;
>> + SymbolTable = i;
>> + StringTable = i;
> StringTable should just be set to e for BSD.
Ok.
>
>> + } else {
>> + _kind = kindCOFFArchive;
>> +
>> + if (i != e) ++i; // Nobody cares about the first member.
>> + if (i != e) {
>> + SymbolTable = i;
>> + ++i;
>> + }
>> + if (i != e) {
>> + StringTable = i;
>> + }
>> }
>> -
>> ec = object_error::success;
>> }
>>
>> @@ -222,17 +253,37 @@
>>
>> error_code Archive::Symbol::getMember(child_iterator &Result) const {
>> const char *buf = Parent->SymbolTable->getBuffer()->getBufferStart();
>> - uint32_t member_count = *reinterpret_cast<const support::ulittle32_t*>(buf);
>> const char *offsets = buf + 4;
>> - buf += 4 + (member_count * 4); // Skip offsets.
>> - const char *indicies = buf + 4;
>>
>> - uint16_t offsetindex =
>> - *(reinterpret_cast<const support::ulittle16_t*>(indicies)
>> - + SymbolIndex);
>> + uint32_t member_count = 0;
>> + uint16_t offsetindex = 0;
>> + uint32_t offset = 0;
> These should follow the llvm naming convention. I realize quite a bit of the
> local variables in this file don't, but added stuff should.
>
> Also, these should be declared where they are used.
>
>> - uint32_t offset = *(reinterpret_cast<const support::ulittle32_t*>(offsets)
>> + // If the archive type is a GNU archive
>> + // the fields would be
>> + // member count
>> + // symbol offset
>> + // symbol offset
>> + // ...
>> + // symbol name\0
>> + // symbol name\0
>> + // ...
> This comment doesn't really help with understanding the code.
its describing the format, maybe I will add more text to that.
>
>> + if (Parent->kind() == kindGNUArchive) {
>> + member_count = *reinterpret_cast<const support::ubig32_t*>(buf);
> member_count is never used.
>
>> + offsetindex = SymbolIndex;
> offset index is only used once.
Will remove member_count and replace offset index with SymbolIndex.
>
>> + offset = *(reinterpret_cast<const support::ubig32_t*>(offsets)
>> + + offsetindex);
>> + }
>> + else {
> No newline before else.
Ok.
>
>> + member_count = *reinterpret_cast<const support::ulittle32_t*>(buf);
>> + buf += 4 + (member_count * 4); // Skip offsets.
> Use sizeof(uint32_t) instead to make it clear what 4 means.
Ok.
>> + const char *indicies = buf + 4;
>> + offsetindex =
>> + *(reinterpret_cast<const support::ulittle16_t*>(indicies)
>> + + SymbolIndex);
>> + offset = *(reinterpret_cast<const support::ulittle32_t*>(offsets)
>> + (offsetindex - 1));
> You need to verify that these pointers are within the bounds of the
> symbol table.
Ok.
>
>> + }
> What about BSD?
BSD is not complete, do you want me to go ahead and remove the part of
the functionality that is there for BSD ?
>
>> const char *Loc = Parent->getData().begin() + offset;
>> size_t Size = sizeof(ArchiveMemberHeader) +
>> @@ -253,10 +304,21 @@
>>
>> Archive::symbol_iterator Archive::begin_symbols() const {
>> const char *buf = SymbolTable->getBuffer()->getBufferStart();
>> - uint32_t member_count = *reinterpret_cast<const support::ulittle32_t*>(buf);
>> - buf += 4 + (member_count * 4); // Skip offsets.
>> - uint32_t symbol_count = *reinterpret_cast<const support::ulittle32_t*>(buf);
>> - buf += 4 + (symbol_count * 2); // Skip indices.
>> + uint32_t member_count = 0;
>> + uint32_t symbol_count = 0;
> Naming convention and declare when used.
Ok.
>
>> + // If the archive type is a GNU archive
>> + // symbol count is equal to member count
>> + if (kind() == kindGNUArchive) {
>> + member_count = *reinterpret_cast<const support::ubig32_t*>(buf);
>> + symbol_count = *reinterpret_cast<const support::ubig32_t*>(buf);
> Don't read the same thing twice. And member_count isn't used.
Ok.
>
>> + buf += 4 + (symbol_count * 4);
>> + }
>> + else {
> No newline before else.
Ok.
>
> You don't need to duplicate all the test infrastructure for llvm-nm. Add your
> tests to test/Object.
Sure, will remove the test framework for nm.
Thanks
Shankar Easwaran
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation
More information about the llvm-commits
mailing list