[PATCH] D32609: Update llvm-readobj -coff-resources to display tree structure.

Zachary Turner via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 3 10:04:46 PDT 2017


zturner added inline comments.


================
Comment at: llvm/include/llvm/Object/COFF.h:641-644
+  union {
+    support::ulittle32_t DataEntryOffset;
+    support::ulittle32_t SubdirOffset;
+  } Offset;
----------------
ecbeckmann wrote:
> ruiu wrote:
> > ecbeckmann wrote:
> > > ruiu wrote:
> > > > You are not using this class, are you? If you plan to use it later, please introduce this later.
> > > It is used in COFFDumper when we display each entry in a table.
> > So I do not see why you need this. Even if you need this, do you really need this many fields? They are actually pointing to the same memory location that is at offset 0 of a struct, so this is equivalent to
> > 
> >   union coff_resource_dir_entry {
> >     support::ulittle32_t NameOffset;
> >     support::ulittle32_t ID;
> >     support::ulittle32_t DataEntryOffset
> >     support::ulittle32_t SubdirOffset;
> >   };
> > 
> > It seems to me that you don't need to add this struct. You can just handle resource directory entry as a ulittle32_t, can't you?
> Hmm I'm not quite sure what you mean.  A resource directory entry is composed of two separate 32-bit fields, 64 bits in total.  The first field can be either an offset to a string name or an ID number.  The second field is either an offset to the resource data or the offset to the next directory level down the tree.  We must access both of these fields to read an entry, we might be able to use a ulittle32_t to read the first field and then move up 4 bytes to read the second field, but it seems much clearer to be able to access either of them by name.
I also agree that it's not an equivalent struct.  For @ruiu's proposed definition, `sizeof(coff_resource_dir_entry) == 4`, but for the one in the patch, `sizeof(coff_resource_dir_entry) == 8`.


================
Comment at: llvm/include/llvm/Object/COFF.h:627
 
+enum ResourceID {
+  RID_Cursor = 1,
----------------
I think this enum should go in `llvm/Support/COFF.h`.  It's confusing that there are two files named `COFF.h`, but this file is for object file layout stuff, and the other file is more semantic information and enum / constant definitions.


================
Comment at: llvm/include/llvm/Object/COFF.h:655
+    support::ulittle32_t ID;
+    StringRef getIDTypeName() const {
+      switch (ID) {
----------------
It's a little jarring to see methods in unions.  I think part of the reason is that when you're looking at a union, it's nice for it to be as compact as possible so that you can get a picture of the entire layout of the union in one screen.

Can you move all the methods out of the union and into the top level struct?

With that said, we don't normally put constant <-> string translation functions in low level headers like this.  Take a look at `COFFDumper.cpp`, you will see a lot of things like this:

```
static const EnumEntry<COFF::DLLCharacteristics> PEDLLCharacteristics[] = {
  LLVM_READOBJ_ENUM_ENT(COFF, IMAGE_DLL_CHARACTERISTICS_HIGH_ENTROPY_VA      ),
  LLVM_READOBJ_ENUM_ENT(COFF, IMAGE_DLL_CHARACTERISTICS_DYNAMIC_BASE         ),
  LLVM_READOBJ_ENUM_ENT(COFF, IMAGE_DLL_CHARACTERISTICS_FORCE_INTEGRITY      ),
  LLVM_READOBJ_ENUM_ENT(COFF, IMAGE_DLL_CHARACTERISTICS_NX_COMPAT            ),
  LLVM_READOBJ_ENUM_ENT(COFF, IMAGE_DLL_CHARACTERISTICS_NO_ISOLATION         ),
  LLVM_READOBJ_ENUM_ENT(COFF, IMAGE_DLL_CHARACTERISTICS_NO_SEH               ),
  LLVM_READOBJ_ENUM_ENT(COFF, IMAGE_DLL_CHARACTERISTICS_NO_BIND              ),
  LLVM_READOBJ_ENUM_ENT(COFF, IMAGE_DLL_CHARACTERISTICS_APPCONTAINER         ),
  LLVM_READOBJ_ENUM_ENT(COFF, IMAGE_DLL_CHARACTERISTICS_WDM_DRIVER           ),
  LLVM_READOBJ_ENUM_ENT(COFF, IMAGE_DLL_CHARACTERISTICS_GUARD_CF             ),
  LLVM_READOBJ_ENUM_ENT(COFF, IMAGE_DLL_CHARACTERISTICS_TERMINAL_SERVER_AWARE),
};
```

I think we should do something similar for the `ResourceID` enum.

Also, unless you're using a strongly typed enum (e.g. `enum class`), you shouldn't explicitly scope it by saying `ResourceID::`.  So you have to pick one of the following two patterns:

```
enum class ResourceID {
  Cursor,   // Names are not RID_ prefixed because you have to scope it with ResourceID::
  Bitmap,
  ...
};

enum ResourceID {
  RID_Cursor,   // Names are RID_ prefixed because you refer to them unqualified.
  RID_Bitmap,
  ...
};
```


================
Comment at: llvm/include/llvm/Object/COFF.h:703
+    }
+    uint32_t getNameOffset() const { return (~(1 << 31)) & NameOffset; }
+  } Identifier;
----------------
IIUC you are are taking the lower 31 bits of `NameOffset`?  Can you write this as:

```
return maskTrailingOnes<uint32_t>(31) & NameOffset;
```

This function is defined in `MathExtras.h`


================
Comment at: llvm/include/llvm/Object/COFF.h:710
+    bool isSubDir() const { return SubdirOffset >> 31; }
+    uint32_t value() const { return (~(1 << 31)) & SubdirOffset; }
+
----------------
Same here.


================
Comment at: llvm/include/llvm/Object/COFF.h:723-725
+  const coff_resource_dir_entry *entryAt(uint32_t Index) const {
+    return reinterpret_cast<const coff_resource_dir_entry *>(this + 1) + Index;
+  }
----------------
I don't think we should put this logic here.  Although in practice the two structures appear immediately after each other, in theory they should be independent of each other and the parsing code should be responsible for knowing this.


================
Comment at: llvm/include/llvm/Object/COFF.h:1146
+  ResourceSectionRef() = default;
+  ResourceSectionRef(const StringRef Ref) : Ref(Ref) {}
+
----------------
No need for the argument to be `const`.  Also can you mark this constructor `explicit`?


================
Comment at: llvm/include/llvm/Object/COFF.h:1155
+private:
+  const StringRef Ref;
+
----------------
Instead of storing a `StringRef`, store a `BinaryByteStream`.  This way you don't have to reconstruct it every time you want to read from it.  And it's just as good as a `StringRef`


================
Comment at: llvm/lib/Object/COFFObjectFile.cpp:1597-1602
+ErrorOr<const coff_resource_dir_table *>
+ResourceSectionRef::getTableAtOffset(uint32_t Offset) {
+  auto Table = reinterpret_cast<const coff_resource_dir_table *>(
+      reinterpret_cast<const uint8_t *>(Ref.data()) + Offset);
+  return Table;
+}
----------------
Is this function ever called?


================
Comment at: llvm/lib/Object/COFFObjectFile.cpp:1614
+  StringRef RawDirString;
+  Reader.setOffset(Offset + 2);
+  // Strings are stored as 2-byte aligned unicode characters but readFixedString
----------------
What's the +2 for?


================
Comment at: llvm/tools/llvm-readobj/COFFDumper.cpp:1549-1550
+
+void COFFDumper::printResourceDirectoryTable(
+    ResourceSectionRef &RSF, const coff_resource_dir_table *Table,
+    StringRef Level) {
----------------
Should `RSF` be passed by value?  It's immutable in principle anyway since all it contains is a `StringRef`, and it's thin as well.  Usually when classes are suffixed with `Ref`, LLVM convention is that we should pass them by value since they're "as good as references"


https://reviews.llvm.org/D32609





More information about the llvm-commits mailing list