[PATCH] D111889: [AIX] Support of Big archive (read)
James Henderson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Dec 10 02:26:38 PST 2021
jhenderson added a comment.
I've spent far more time than I've got reviewing this patch today. I'll have to leave reviewing it further for a while yet.
================
Comment at: llvm/include/llvm/Object/Archive.h:43
-class ArchiveMemberHeader {
+class AbstractArchiveMemberHeader {
+protected:
----------------
Whilst looking at the later points, it occurred to me that we could solve some of the duplication, by doing something like the following:
# Change `AbstractArchiveMemberHeader` to take a template parameter, namely the underlying ArMemHdrType used in the subclasses.
# Have the concrete classes pass in their private member type as the template parameter.
# Create a new base class, that AbstractArchiveMemberHeader<T> inherits from.
# Push the virtual interface into that class.
# Move the functions that are duplicated in the two subclasses into the templated class. The subclasses should then only contain the sets of functions that actually have to be different, as opposed to just needing to be different due to neeeding to have slightly different template classes.
What do you think? Going with this approach, I'd suggest names like `ArMemberHeaderInterface`, and `AbstractArMemberHeader`, but other name ideas are welcome.
================
Comment at: llvm/include/llvm/Object/Archive.h:63-65
+ /// Returns the size in bytes of the format-defined header of the concrete
+ /// archive type.
+ virtual uint64_t getSizeOf() const = 0;
----------------
I'm not sure why you've moved this function: it's not really anything to do wtih a member function's properties. Put it back where it was in the original source code.
================
Comment at: llvm/include/llvm/Object/Archive.h:134-137
+ uint64_t getSizeOf() const override { return sizeof(ArMemHdrType); }
+ Expected<const char *> getNextChildLoc() const override;
+ Expected<uint64_t> getNextOffset() const;
+ virtual Expected<bool> isThin() const override { return false; }
----------------
These functions aren't properties of the member header (i.e. they don't correspond to fields in that header). I'd keep them separate from the other batch, with a blank line. Same goes for the regular archive version.
================
Comment at: llvm/include/llvm/Object/Archive.h:180
+ Child(const Child &C)
+ : Parent(C.Parent), Data(C.Data), StartOfFile(C.StartOfFile) {
+ if (C.Header)
----------------
Please remember to clang-format your changes.
================
Comment at: llvm/lib/Object/Archive.cpp:51
+void createMemberHeaderParseError(
+ const AbstractArchiveMemberHeader *ArcMemHeader, const char *RawHeaderPtr,
+ uint64_t Size, Error *Err) {
----------------
"Ar" is a more common abbreviation for "Archive" than "Arc"
================
Comment at: llvm/lib/Object/Archive.cpp:111
+
+ ArMemHdr = reinterpret_cast<const ArMemHdrType *>(RawHeaderPtr);
+
----------------
Why is this not done in the initializer list, like the regular archive header class?
================
Comment at: llvm/lib/Object/Archive.cpp:257
+Expected<uint64_t>
+GetArchiveMemberDecField(Twine FieldName, const StringRef RawField,
+ const Archive *Parent,
----------------
Please run clang-tidy on your code changes, as I'm finding a number of mistakes like this that should have been caught before this patch was put up for review.
================
Comment at: llvm/lib/Object/Archive.cpp:275
+Expected<uint64_t>
+GetArchiveMemberOctField(Twine FieldName, const StringRef RawField,
+ const Archive *Parent,
----------------
================
Comment at: llvm/lib/Object/Archive.cpp:292
-Expected<sys::TimePoint<std::chrono::seconds>>
-ArchiveMemberHeader::getLastModified() const {
- unsigned Seconds;
- if (StringRef(ArMemHdr->LastModified, sizeof(ArMemHdr->LastModified))
- .rtrim(' ')
- .getAsInteger(10, Seconds)) {
- std::string Buf;
- raw_string_ostream OS(Buf);
- OS.write_escaped(
- StringRef(ArMemHdr->LastModified, sizeof(ArMemHdr->LastModified))
- .rtrim(" "));
- OS.flush();
- uint64_t Offset =
- reinterpret_cast<const char *>(ArMemHdr) - Parent->getData().data();
- return malformedError("characters in LastModified field in archive header "
- "are not all decimal numbers: '" +
- Buf + "' for the archive member header at offset " +
- Twine(Offset));
+#define getFieldRawString(T) StringRef(T, sizeof(T)).rtrim(" ")
+
----------------
Use a (potentially templated) function, not a macro. This macro does nothing beneficial that a function can't do.
================
Comment at: llvm/lib/Object/Archive.cpp:56
+ uint64_t Size, Error *Err) {
+ ErrorAsOutParameter ErrAsOutParam(Err);
+ if (Err) {
----------------
DiggerLin wrote:
> jhenderson wrote:
> > Is this safe and correct given that this has been done at the start of the calling code too?
> I can not get the comment , I am appreciate that if you can you explain more detail on it.
I'm not too familiar with how `ErrorAsOutParameter` works, but the calling site for this function also has an `ErrorAsOutParameter`. As a result, the `Err` has been put inside more than one of these objects, which seems suspicious to me. It may not be safe (e.g. it might assert).
That being said, why not just have this function return an Error always, and go back to checking whether the function returned an `Error::success()` at the call sites? The name of this function implies an `Error` is created always.
================
Comment at: llvm/lib/Object/Archive.cpp:161
+Expected<StringRef> BigArchiveMemberHeader::getRawName() const {
+ StringRef::size_type NameSize = strtol(ArMemHdr->NameLen, NULL, 10);
+ return StringRef(ArMemHdr->Name, NameSize);
----------------
jhenderson wrote:
> Also consider adding comments to name the nullptr and 10 values, e.g. `strtol(ArMemHdr->NameLen, /*endptr=*/nullptr, /*base=*/10);`
>
> The regular archive kind has various safety checks to make sure the number read makes sense. For example, it checks to make sure there's actually a number in the field. We also need to show that the name's start and end are within the archive buffer, otherwise we'll get crashes/reading past the end of the file etc.
> The regular archive kind has various safety checks to make sure the number read makes sense. For example, it checks to make sure there's actually a number in the field. We also need to show that the name's start and end are within the archive buffer, otherwise we'll get crashes/reading past the end of the file etc.
This hasn't been addressed. Why not?
================
Comment at: llvm/lib/Object/Archive.cpp:342-352
+uint64_t ArchiveMemberHeader::getOffset() const {
+ uint64_t Offset =
+ reinterpret_cast<const char *>(ArMemHdr) - Parent->getData().data();
+ return Offset;
+}
+
+uint64_t BigArchiveMemberHeader::getOffset() const {
----------------
DiggerLin wrote:
> jhenderson wrote:
> > These two functions are identical. Could they be pushed into the base class? You could pass in the pointer to get the offset from.
> it can not pushed into the base class and pass in the pointer to get offset from.
>
> When you call from the Child , for example
> Header->getOffset()
> Header is AbstractArchiveMemberHeader, it do not have ArMemHdr
Apologies, I think you misunderstood me, possibly I wasn't clear enoguh.
How about the following concrete points:
1) Add a virtual function to `AbstractArchiveMemberHeader` called `getData()` or similar.
2) Implement this concrete function in the two subclasses, to return `reinterpret_cast<const char *>(ArMemHdr)`.
3) Don't make `getOffset` a virtual function, and instead implement it solely in the base class as follows:
```
uint64_t AbstractArchiveMemberHeader::getOffset() const {
return getData() - Parent->getData().data();
}
```
================
Comment at: llvm/lib/Object/Archive.cpp:379-387
+StringRef ArchiveMemberHeader::getRawLastModified() const {
+ return StringRef(ArMemHdr->LastModified, sizeof(ArMemHdr->LastModified))
+ .rtrim(' ');
+}
+
+StringRef BigArchiveMemberHeader::getRawLastModified() const {
+ return StringRef(ArMemHdr->LastModified, sizeof(ArMemHdr->LastModified))
----------------
DiggerLin wrote:
> jhenderson wrote:
> > As above - pull into a common function. This looks basically identical to the getRawAccessMode function, so you should be able to do something to avoid duplicating the logic (maybe use a template for the LastModified's/AccessMode's type if needed.
> I have think over to use a common function before. but
> for the definition of ArMemHdr is different in ArchiveMemberHeader and BigArchiveMemberHeader , it is difficult to put into a common.
>
> a lot of duplicated code are caused by the same reason.
> If change the name of ArMemHdr in the BigArchiveMemberHeader to "BigArMemHdr " , it maybe be more easy to understand the reason.
I've got no issue with `BigArMemHdr` as a name, if you prefer it. I don't think it makes a great deal of difference to my points though.
================
Comment at: llvm/lib/Object/Archive.cpp:745-750
// Make sure Format is initialized before any call to
- // ArchiveMemberHeader::getName() is made. This could be a valid empty
- // archive which is the same in all formats. So claiming it to be gnu to is
- // fine if not totally correct before we look for a string table or table of
- // contents.
+ // ArchiveMemberHeader::getName() is made. Read Magic to deal with AIX Big
+ // Archive. Else, this could be a valid empty archive which is the same in all
+ // formats. So claiming it to be gnu to is fine if not totally correct before
+ // we look for a string table or table of contents.
Format = K_GNU;
----------------
DiggerLin wrote:
> jhenderson wrote:
> > Stare at this code a minute and see if you can spot the bug... (hint, what is the value of `Format` before and after if this is a BigArchive?)
>
>
> ```
> } else if (Buffer.startswith(BigMagic)) {
> Format = K_AIXBIG;
> IsThin = false;
> return;
> }
> ```
>
> for BigArchive, there is "return;" in the BigArchive, it will not come here.
Fair point, but then the addition to the comment is either wrong or in the wrong location, since it talks about Big Archives, but is referring to code that happens earlier.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D111889/new/
https://reviews.llvm.org/D111889
More information about the llvm-commits
mailing list