[PATCH] D111889: [AIX] Support of Big archive (read)

James Henderson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 19 00:56:26 PDT 2021


jhenderson added a comment.

I'm quite overburdened with too many reviews at the moment, but I've given this a look through to get you started.

In general, this looks reasonable. I'm somewhat surprised you only have to modify llvm-ar though. I'd expect several other tools needing to be modified to cope with either archive type now.



================
Comment at: llvm/include/llvm/Object/Archive.h:35
 
+constexpr int ArchiveMaigcLen = 8;
+
----------------
I'd like to get rid of this constant, since it's not guaranteed for all conceptual archive types. Also there's a typo in it: `ArchiveMagicLen`.

See my below comments for how I'd get rid of it.


================
Comment at: llvm/include/llvm/Object/Archive.h:53
 
+  /// Raw access and helper getters
+  virtual StringRef getRawAccessMode() const = 0;
----------------
I don't think this sort of function category comments are useful, so delete it. It's clear from the function names that they are raw getters.


================
Comment at: llvm/include/llvm/Object/Archive.h:59
+
+  // Non-Raw getters
   Expected<sys::fs::perms> getAccessMode() const;
----------------
Ditto. Unhelpful comment (also inconsistent styling).


================
Comment at: llvm/include/llvm/Object/Archive.h:66
 
-  // This returns the size of the private struct ArMemHdrType
-  uint64_t getSizeOf() const { return sizeof(ArMemHdrType); }
+  // Returns the size of the private struct ArMemHdrType
+  virtual uint64_t getSizeOf() const = 0;
----------------
I think better might be "Returns the size in bytes of the format-defined header." or something similar. There's no particular reason why one concrete member header type needs to have a private struct, so this is leaking implementation details. Indeed, a future concrete version may not have headers at all, in the real file, so this might return 0 for that version.


================
Comment at: llvm/include/llvm/Object/Archive.h:129
+  Expected<uint64_t> getNextOffset() const;
+  static bool classof(AbstractArchiveMemberHeader const *v);
+
----------------
Alternative names also welcome, but not a single lower-case letter that has no relevance to the thing.


================
Comment at: llvm/include/llvm/Object/Archive.h:142-145
+    union {
+      char Name[2]; // Start of member name
+      char Terminator[2];
+    };
----------------
Why do you need a union here? Could you not simply either use `Name` direectly in the raw header, or call it `NameOrTerminator`? Is this even really a part of the on-disk archive member header?


================
Comment at: llvm/include/llvm/Object/Archive.h:151
 
 class Archive : public Binary {
   virtual void anchor();
----------------
Rather than be a potential concrete type, I think it would be cleaner if `Archive` were an abstract type, with concrete implementations for Big and traditional (better name required) archives. With Archive being a concrete type, there's a risk of slicing when someone thinks they've got a traditional archive, but actually have a BigArchive.

This then allows some somewhat better (in my opinion) locations for certain things.


================
Comment at: llvm/include/llvm/Object/Archive.h:176
+    }
+    Child(Child &&C) {
+      Parent = std::move(C.Parent);
----------------
Add blank lines either side of multi-line functions. Applies here and with the function above.


================
Comment at: llvm/include/llvm/Object/Archive.h:347
 
-  // Cast methods.
+  /// Cast methods.
   static bool classof(Binary const *v) { return v->isArchive(); }
----------------
I'd vote for deleting this comment rather than modifying it. It's not useful.


================
Comment at: llvm/include/llvm/Object/Archive.h:358
   uint32_t getNumberOfSymbols() const;
+  virtual uint64_t getFirstChildeOffset() const { return ArchiveMaigcLen; }
 
----------------
Related to my other comment, I think it would make sense for `ArchiveMagicLen` to be inside each concrete class, and returned via a virtual getter.


================
Comment at: llvm/include/llvm/Object/Archive.h:382
+  struct BigArFixLenHdrType {
+    char Magic[ArchiveMaigcLen]; ///< Big archive magic string.
+    char MemOffset[20];          ///< Offset to member table.
----------------
This length should be derived from the length of the magic string, rather than some additional constant (unless that constant is of course derived from the magic string's length).


================
Comment at: llvm/include/llvm/Object/Archive.h:392
+
+  const BigArFixLenHdrType *ArFixLenHdr = nullptr;
+  uint64_t FirstArOffset = 0;
----------------
Is an `ArFixLenHdr` optional? If not, don't bother setting it to nullptr, and just set it during the constructor.


================
Comment at: llvm/include/llvm/Object/Archive.h:393-394
+  const BigArFixLenHdrType *ArFixLenHdr = nullptr;
+  uint64_t FirstArOffset = 0;
+  uint64_t LastArOffset = 0;
+
----------------
The names of the two members don't make sense to me - they seem liks "First Archive offset" which is clearly not what they mean. They need better names. I'm thinking they should actually be `FirstMemberOffset` and `LastMemberOffset` or something similar.


================
Comment at: llvm/include/llvm/Object/Archive.h:399
+  uint64_t getFirstChildeOffset() const override { return FirstArOffset; }
+  uint64_t getLastArOffset() const { return LastArOffset; }
+};
----------------
Same comment as above - rename the function to match. Perhaps `getLastChildOffset` would make the most sense, since it mirrors the other's name.


================
Comment at: llvm/lib/Object/Archive.cpp:1195
+  if (RawStringRef.rtrim(' ').getAsInteger(10, FirstArOffset)) {
+    report_fatal_error("Malformed AIX Big Archive: First Archive Offset \"" +
+                       RawStringRef + "\" is not a number.");
----------------
1) Ifs that contain only a single statement shouldn't have braces.
2) Why is this a report_fatal_error call rather than setting the input `Err` variable?
3) Please follow the LLVM standards for error messages (https://llvm.org/docs/CodingStandards.html#error-and-warning-messages).
4) Don't capitalize "First Archive Offset". Use either the actual field name as per the spec, or just make it all lower-case.
5) Don't use "Archive" here in the "First Archive Offset". Use "member" or "child" instead. The archive doesn't consist of other archives (normally!). I think the spec has used a misleading name, so let's not follow it unless we're using the term as literally stated in the spec (`fl_fstmoff`).


================
Comment at: llvm/lib/Object/Archive.cpp:1204
+  if (RawStringRef.getAsInteger(10, LastArOffset)) {
+    report_fatal_error("Malformed AIX Big Archive: Last Archive Offset \"" +
+                       RawStringRef + "\" is not a number.");
----------------
Same comments as above.


================
Comment at: llvm/lib/Object/ArchiveWriter.cpp:140-141
     return true;
+  case object::Archive::K_AIXBIG:
+    report_fatal_error("Not handled yet");
   case object::Archive::K_COFF:
----------------
I don't think there's any point in this addition yet, (unless it silences a warning): there's already an equivalent `llvm_unreachable` statement below.


================
Comment at: llvm/lib/Object/ArchiveWriter.cpp:208-209
     return true;
+  case object::Archive::K_AIXBIG:
+    llvm_unreachable("Not handled yet");
   }
----------------
Ditto.


================
Comment at: llvm/test/Object/archive-big-extract.test:1
+## Test extracting an archive created by AIX ar (Big Archive).
+RUN: llvm-ar p %p/Inputs/bigFormat.a evenlen | FileCheck %s --check-prefix=AIXBIG --implicit-check-not={{.}}
----------------
This isn't "extracting" it - it is "printing" it. The "x" option is extraction of a member (you may wish to have a test case for that option too).


================
Comment at: llvm/test/Object/archive-big-read.test:3-5
+AIXBIG:      rw-r--r-- 0/0     19 Jan  1 00:00 1970 evenlen
+AIXBIG-NEXT: rw-r--r-- 0/0      7 Jan  1 00:00 1970 oddlen
+AIXBIG-NEXT: rw-r--r-- 0/0    258 Jan  1 00:00 1970 empty.o
----------------
I'm guessing the numbers in the column before the date are the sizes, although I don't know. If they are the size fields, the sizes don't match what the names imply they should (i.e. even/odd/zero).


================
Comment at: llvm/tools/llvm-ar/llvm-ar.cpp:1008
+        object::Archive::create(Buf.get()->getMemBufferRef());
+    if (!ArchiveOrError) {
+      failIfError(ArchiveOrError.takeError(),
----------------
No braces for single-line ifs.


================
Comment at: llvm/tools/llvm-ar/llvm-ar.cpp:1013
+
+    auto Archive = std::move(ArchiveOrError.get());
+
----------------
Don't use auto, where the type isn't obvious from the immediate context (i.e. this line).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111889/new/

https://reviews.llvm.org/D111889



More information about the llvm-commits mailing list