[PATCH] D85774: [XCOFF][AIX] Enable tooling support for 64 bit symbol table parsing

James Henderson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Aug 14 00:47:05 PDT 2020


jhenderson added a subscriber: grimar.
jhenderson added a comment.

In D85774#2216424 <https://reviews.llvm.org/D85774#2216424>, @jasonliu wrote:

>   I just wonder whether we can implement two separate structure XCOFFSymbolEntry32 and XCOFFSymbolEntry64 without so much union be used on currently implement.
>
> Agree that current implementation have many union, and it's hard for people to parse what exactly is inside for the structure.
> But separating them into two structures, namely, XCOFFSymbolEntry32 and XCOFFSymbolEntry64, would mean a lot more `if (Obj->is64Bit())` check across all tooling, which sacrifice a lot in the usability department. A lot more logic would look duplicated.
> One potential solution I thought about is for every data member, we introduce a getter to retrieve the data, and mark the data members private.  So that most of the time, user of the structure do not need to look inside of the structure to figure out how to retrieve certain data. But the downside is we are going to introduce a lot of getters for that, and not sure if it would be worth the effort.

It seems to me like this should be using inheritance here. You have a base class that has the common members, and provides pure virtual declarations of the various getters, with the sub-classes defining them to do the right thing. Yes, it would introduce a number of getters, but I feel like it would make everything a bit cleaner from a usability standpoint. In most cases, you then don't need any `is64Bit` queries, because the getters hide that from you.

On a testing note, there are several places in the new code which detect some kind of error. You need testing for these code paths too.



================
Comment at: llvm/lib/Object/XCOFFObjectFile.cpp:419
+      is64Bit() ? getNumberOfSymbolTableEntries64()
+                : getLogicalNumberOfSymbolTableEntries32();
+  SymDRI.p =
----------------
jasonliu wrote:
> DiggerLin wrote:
> > can we add new member function as getNumberOfSymbolTableEntries()
> > {
> >   return is64Bit() is64Bit() ? getNumberOfSymbolTableEntries64()
> >                 : getLogicalNumberOfSymbolTableEntries32();
> > }
> > 
> > the function can also use in 
> > XCOFFObjectFile::create()
> > and
> > getSymbolNameByIndex()
> About all the comments mentioning if we could combining the 32bit and 64 bit version into 1 function.
> I don't think it's good idea because people would ignore the fact that they are returning different types underneath. 
>From my experience working with tools that had to support 32-bit and 64-bit ELF, you don't worry about the underlying type in most cases and always use the larger type. The same probably applies here. Of course, it becomes a bit moot if you add a common getter interface as suggested out-of-line, because those getters will have to return the larger of the two return types anyway.

Is there a strong reason to not use the larger type everywhere?


================
Comment at: llvm/lib/Object/XCOFFObjectFile.cpp:744
+  CurOffset = Obj->is64Bit() ? Obj->getSymbolTableOffset64()
+                             : Obj->getSymbolTableOffset32();
+  uint64_t SymbolTableSize = static_cast<uint64_t>(sizeof(XCOFFSymbolEntry)) *
----------------
jasonliu wrote:
> DiggerLin wrote:
> > can we a member function as  
> > uint64_t XCOFFObjectFile::getSymbolTableOffset() const {
> >   return is64Bit() ? fileHeader64()->SymbolTableOffset
> >                    : fileHeader32()->SymbolTableOffset;
> > }
> > add
> > CurOffset = getSymbolTableOffset() here;
> > 
> > 
> getSymbolTableOffset64 and getSymbolTableOffset32 returns different types. 
> Combine them to return the same type is likely to introduce error when caller use them. 
Related to my comments elsewhere - it looks to me like most consumers will need to handle both 32 and 64-bit versions, so they'll always have to do this dance. Thus your concern about how the caller uses them is misplaced - the caller is more likely to do the wrong thing i.e. call the wrong version than have problems with the return types.


================
Comment at: llvm/lib/Object/XCOFFObjectFile.cpp:780
+  if (!hasCsectAuxEnt())
+    return errorCodeToError(object_error::parse_failed);
 
----------------
Better than `errorCodeToError(/*some error code*/)` is to use `createStringError()` or `createFileError()` to provide more context to the failure (how did the parsing fail? where? etc).


================
Comment at: llvm/lib/Object/XCOFFObjectFile.cpp:798
+    uintptr_t AuxAddr64 = 0;
+    for (int Index = 1; Index <= getNumberOfAuxEntries(); ++Index) {
+      AuxAddr64 =
----------------
I don't think you want to use `int` here. There's always going to be a positive number of entries, and there are no subtractions etc inolving `Index` here. Better would be an unsigned type of some form (presumably the return type of `getNumberOfAuxEntries()`).


================
Comment at: llvm/lib/Object/XCOFFObjectFile.cpp:807
+    if (AuxAddr64 == AuxAddr)
+      return errorCodeToError(object_error::parse_failed);
+
----------------
Same comment as before - use `createStringError()` or `createFileError()`.


================
Comment at: llvm/test/tools/llvm-objdump/XCOFF/disassemble-symbol-description64.test:13-14
+
+## xcoff-section-headers64.o Compiled with IBM XL C/C++ for AIX, V16.1.0
+## compiler command: xlc -q64 -qtls -o xcoff-section-headers64.o -c test.c
+
----------------
I'm not going to stop you checking in a pre-compiled object, as I'm not an XCOFF maintainer, but as you are continuing to add more functionality here, I strongly advise you to write a yaml2obj XCOFF port, to avoid pre-canned binaries. You'll find pre-built binaries extremely inconvenient to work with as you maintain things going forward. Not only that, but they are harmful to the git repository size, especially if you have to occasionally rebuild them.

Using yaml2obj may also be about the only way you can test most parse failure paths.

If yaml2obj isn't viable, at least consider llvm-mc or similar, if possible.


================
Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:393
 
-    // The symbol's last auxiliary entry is a CSECT Auxiliary Entry.
-    printCsectAuxEnt32(XCOFFSymRef.getXCOFFCsectAuxEnt32());
+    printCsectAuxEnt(unwrapOrError(Obj.getFileName(), XCOFFSymRef.getXCOFFCsectAuxEnt()));
     break;
----------------
@grimar has gone to a lot of effort to get rid of `unwrapOrError` from the ELF dumping code. I'd prefer it if we could avoid using it here too. It is generally better in dumping tools to report a warning and abort dumping the current section than to emit an error and terminate the program, since it gives the user more of the information they've asked for.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85774/new/

https://reviews.llvm.org/D85774



More information about the llvm-commits mailing list