[PATCH] D48413: [llvm-objcopy] Add initial support for static libraries

Thu Jun 28 02:53:30 PDT 2018

jhenderson added inline comments.

================
Comment at: test/tools/llvm-objcopy/basic-archive-copy.test:8
+# RUN: llvm-objcopy %t %t2
+# RUN: llvm-ar p %t2.a >%t3
+# RUN: cmp %t2 %t3
----------------
Nit here and elsewhere ">%t3" -> "> %t3" for better readability.

================
Comment at: test/tools/llvm-objcopy/basic-archive-copy.test:12-13
+# RUN: llvm-readobj -sections %t2 | FileCheck %s
+# RUN: llvm-nm -print-armap %t.a | FileCheck --check-prefix=INDEX-TABLE %s
+# RUN: llvm-nm -print-armap %t2.a | FileCheck --check-prefix=INDEX-TABLE %s
+
----------------
Aside: print-armap is not in the llvm-nm documentation. From my understanding, it's a way to check the archive members and their symbols?

================
Comment at: test/tools/llvm-objcopy/basic-archive-copy.test:27-28
+
+# RUN: cp %t.a %t.broken.a
+# RUN: llvm-ar r %t.broken.a %s
+
----------------
I wouldn't call this broken, as the ar format does allow for non-object members. Maybe "non-object" would be better?

================
Comment at: test/tools/llvm-objcopy/basic-archive-copy.test:30
+
+# RUN: not llvm-objcopy %t.broken.a %t2.broken.a 2>&1 | FileCheck %s --check-prefix=BAD-FORMAT
+
----------------
Does gnu objcopy produce an error in this case? If not, what does it do?

We should also check the case that if llvm-objcopy reports an error, what is the state of the archive? I'd expect it to be unchanged, even if the final member is the one with the error, but I'm open to other behaviours.

================
Comment at: test/tools/llvm-objcopy/strip-all.test:27
+# RUN: llvm-strip %t.a
+# RUN: llvm-ar -p %t.a %basename_t.tmp >%t6
+# RUN: cmp %t2 %t6
----------------
alexshap wrote:
> jhenderson wrote:
> > alexshap wrote:
> > > jhenderson wrote:
> > > > The "-" part of "-p" is unnecessary so can be removed (either that or you should add it to "crs" for consistency).
> > > > 
> > > > What is the purpose of "%basename_t.tmp"?
> > > > 
> > > > Does lit use it's own redirection mechanism? I ask because on Windows Powershell encodes files created via ">" as UTF-16, which obviously isn't suitable for binary files. I'd be more inclined to use the extract operation (x) to achieve the same thing, as it's less likely to cause a problem.
> > > %basename_t.tmp is expanded by lit into the basename of %t .
> > > so llvm-ar p %t.a %basename_t.tmp prints the content of a given file from the archive.
> > > -x will extract a file into the current folder so it will overwrite the existing file, I somehow didn't like this.
> > >  
> > Yeah, the overwriting isn't great, and I get why you didn't want to use -x in the first place. But does the redirection work for all users? I might have to test this myself, since I use Windows and Powershell as my primary development environment, and it's important that the binary file doesn't get encoded as it gets printed out.
> > 
> > So we need %basename_t because %t is an absolute path, right? I think this sounds more like a deficiency in the test framework - I don't like the need to know that %t ends up as <some path>.tmp for this test to work, as it should be an implementation detail that the test doesn't care about. That being said, it's probably outside the scope of this change to worry about that, if there is no clear alternative (but perhaps a new macro that expands to <the same as %t without the directory components> might be appropriate at some point).
> yeah
FTR: I checked the redirection stuff, and it seems to have worked fine.

================
Comment at: tools/llvm-objcopy/Object.h:130
+
+class Buffer {
+  StringRef Name;
----------------
I'm probably missing something, but what's wrong with the existing buffer classes in llvm? Would it make more sense to extend them in the general case, rather than a specific reimplementation here?

It would be nice to see some comments describing what the interface functions do.

================
Comment at: tools/llvm-objcopy/llvm-objcopy.cpp:447

-std::unique_ptr<Reader> CreateReader(StringRef InputFilename,
-                                     ElfType &OutputElfType) {
-  // Right now we can only read ELF files so there's only one reader;
-  auto Out = llvm::make_unique<ELFReader>(InputFilename);
-  // We need to set the default ElfType for output.
-  OutputElfType = Out->getElfType();
-  return std::move(Out);
+void ExecuteElfObjcopyAgainstArchive(const CopyConfig &Config,
+                                     const Archive &Ar) {
----------------
Against -> On

This function is a bit dense for my liking, and I'm actually having trouble reading it. Could you insert some blank lines between logical blocks within it and maybe add some comments explaining what is happening, please?

================
Comment at: tools/llvm-objcopy/llvm-objcopy.cpp:493
+  FileBuffer FB(Config.OutputFilename);
+  HandleArgs(Config, *Obj, Reader, Reader.getElfType());
+  std::unique_ptr<Writer> Writer =
----------------
I'd probably insert an extra blank line here to divide off the writing block. Maybe even a line before as well, since reading, manipulating and writing are three distinct phases in the program.

https://reviews.llvm.org/D48413