[PATCH] D100651: [AIX] Support of Big archive (read)

Guesnet via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 15 03:28:17 PDT 2021


EGuesnet added inline comments.


================
Comment at: llvm/lib/Object/Archive.cpp:1013
+    return child_end();
+  const char *Loc = Data.getBufferStart() + strlen(Magic) + sizeof(Archive::ArFixLenHdrType);
+  Child C(this, Loc, &Err);
----------------
Esme wrote:
> EGuesnet wrote:
> > Esme wrote:
> > > EGuesnet wrote:
> > > > Esme wrote:
> > > > > It's not correct to calculate the offset of the first archive member by `Data.getBufferStart() + strlen(Magic) + sizeof(Archive::ArFixLenHdrType);`, please use the value of  `ArFixLenHdr->FirstArOffset`.
> > > > > You can easily reproduce the bug if you test a archive file which contains a object file member like the comment I added before:
> > > > > ```
> > > > > $ xlc 1.c -o 1.o
> > > > > $ ar -v -q 1.a 1.o
> > > > > $ llvm-ar tv 1.a
> > > > > llvm-ar: error: unable to load '1.a': truncated or malformed archive (characters in size field in archive header are not all decimal numbers: '\000\000\000\000\000\000\000\000\000\0005892' for archive member header at offset 128)
> > > > > ``` 
> > > > > The correct offset for this case should be 138 instead of 128.
> > > > First, LLVM is no more a priority of our team, so I cannot spend time to this PR.
> > > > Second, I am not able to reproduce. Please give me content of 1.c, and content of 1.a.
> > > > ```
> > > > // small.c
> > > > int add_two (int a) {
> > > >         return a + 2;
> > > > }
> > > > 
> > > > $ xlc -v
> > > > C for AIX Compiler, Version 5
> > > > 
> > > > OK, really old, but we don't care for archive.
> > > > 
> > > > $ xlc small.c  -o small.o -c
> > > > $ ar -v -q small.a small.o
> > > > 
> > > > $ xxd small.a
> > > > 
> > > > 00000000: 3c62 6967 6166 3e0a 3130 3730 2020 2020  <bigaf>.1070
> > > > 00000010: 2020 2020 2020 2020 2020 2020 3132 3332              1232
> > > > 00000020: 2020 2020 2020 2020 2020 2020 2020 2020
> > > > 00000030: 3020 2020 2020 2020 2020 2020 2020 2020  0
> > > > 00000040: 2020 2020 3132 3820 2020 2020 2020 2020      128
> > > > 00000050: 2020 2020 2020 2020 3132 3820 2020 2020          128
> > > > 00000060: 2020 2020 2020 2020 2020 2020 3020 2020              0
> > > > 00000070: 2020 2020 2020 2020 2020 2020 2020 2020
> > > > // End of Fixed-Length Header: size is 128
> > > > 00000080: 3832 3020 2020 2020 2020 2020 2020 2020  820
> > > > 00000090: 2020 2020 3130 3730 2020 2020 2020 2020      1070
> > > > 000000a0: 2020 2020 2020 2020 3020 2020 2020 2020          0
> > > > 000000b0: 2020 2020 2020 2020 2020 2020 3136 3331              1631
> > > > 000000c0: 3630 3435 3039 2020 3020 2020 2020 2020  604509  0
> > > > 000000d0: 2020 2020 3020 2020 2020 2020 2020 2020      0
> > > > 000000e0: 3634 3420 2020 2020 2020 2020 3720 2020  644         7
> > > > [...]
> > > > 
> > > > $ llvm-ar --version
> > > > LLVM version 13.0.0
> > > > 
> > > > $ llvm-ar t small.a
> > > > small.o # OK
> > > > ```
> > > > 
> > > > Third, according documentation https://www.ibm.com/docs/en/aix/7.2?topic=formats-ar-file-format-big , Fixed-Length Header must have a size of 128. 138 is not valide.
> > > Thanks for your reply!
> > > 1. If you are no longer following up on this patch, I'd be happy to continue your work. How do you think?
> > > 
> > > 2. In my previous comment, a binary member is added to the archive, while in your case, there is an object member (compiled with `-c` option) in archive.
> > > You can also reproduce the issue if you add a dynamic lib to an archive.
> > > ``` 
> > > $ cat 1.c
> > > int main() {
> > > return 1;
> > > }
> > > $ xlc 1.c -qmkshrobj -o libt.so
> > > $ ar -v -q 1.a libt.so
> > > $ xxd 1.a 
> > > 00000000: 3c62 6967 6166 3e0a 3134 3532 2020 2020  <bigaf>.1452    
> > > 00000010: 2020 2020 2020 2020 2020 2020 3136 3134              1614
> > > 00000020: 2020 2020 2020 2020 2020 2020 2020 2020                  
> > > 00000030: 3020 2020 2020 2020 2020 2020 2020 2020  0               
> > > 00000040: 2020 2020 3133 3420 2020 2020 2020 2020      134         
> > > 00000050: 2020 2020 2020 2020 3133 3420 2020 2020          134     
> > > 00000060: 2020 2020 2020 2020 2020 2020 3020 2020              0   
> > > 00000070: 2020 2020 2020 2020 2020 2020 2020 2020                  
> > > 00000080: 0000 0000 0000 3131 3935 2020 2020 2020  ......1195    
> > > // There are some padding between the Fixed-Length Header and the First member.  
> > > 00000090: 2020 2020 2020 2020 2020 3134 3532 2020            1452  
> > > 000000a0: 2020 2020 2020 2020 2020 2020 2020 3020                0 
> > > 000000b0: 2020 2020 2020 2020 2020 2020 2020 2020
> > > ```
> > > 
> > > 3. It's correct that `Fixed-Length Header must have a size of 128. 138 is not valide.` As above, there may be some padding between the Fixed-Length Header and the first member. So I think the value of  `ArFixLenHdr->FirstArOffset` is the exact offset to the first member.
> > Using ArFixLenHdr->FirstArOffset does not work.
> > ```
> > truncated or malformed archive (characters in name length field in archive header are not all decimal numbers: '' for archive member header at offset 68)
> > ```
> > I do not know where this "68" is from.
> > 
> > In my opinion, this PR is complex enough. It might be accepted with as few changes as possible. So, without freelist, undocumented features... In a second time, you can create a new PR, to extend support of Big Archive, with new tests.
> OK, thanks!
> Do you get the FirstArOffset value via `StringRef(ArFixLenHdr->FirstArOffset,  sizeof(ArFixLenHdr->FirstArOffset)).rtrim(' ').getAsInteger(10, Size)` ?
My mistake, you are right. I take FirstArOffset  directly...
But in order to treat the padded case, you must add test, with binary file. LLVM community want to avoid adding binary to test. So, I think again a separate PR after this one is preferable.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100651/new/

https://reviews.llvm.org/D100651



More information about the llvm-commits mailing list