[PATCH] D50343: [llvm-objcopy] Add support for -I binary -B <arch>.

James Henderson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Aug 10 03:16:22 PDT 2018


jhenderson added inline comments.


================
Comment at: test/tools/llvm-objcopy/binary-input-aarch64.test:1
+# RUN: printf abcd > %t.x-txt
+# RUN: llvm-objcopy -I binary -B aarch64 %t.x-txt %t.o
----------------
Use echo, not printf. I'm not sure if printf is implemented in the lit test system etc, but echo is the more common approach anyway.


================
Comment at: test/tools/llvm-objcopy/binary-input-aarch64.test:3
+# RUN: llvm-objcopy -I binary -B aarch64 %t.x-txt %t.o
+# RUN: llvm-readobj -file-headers -sections -symbols %t.o | FileCheck %s
+
----------------
You dump the symbols here, but don't check any of them... you should probably do so!


================
Comment at: test/tools/llvm-objcopy/binary-input-aarch64.test:8
+# CHECK-NEXT: AddressSize: 64bit
+# CHECK-NEXT: LoadName:
+
----------------
Does this line add anything? If not, can we remove it?


================
Comment at: test/tools/llvm-objcopy/binary-input-arm.test:1
+# RUN: printf abcd > %t.x-txt
+# RUN: llvm-objcopy -I binary -B arm %t.x-txt %t.o
----------------
Could you fold all of the regular tests into a single test file? There isn't much difference between them. Just run llvm-objcopy N times in a single file with different -B options, and use check prefixes to match the different sections (e.g. COMMON, ARM, AARCH64... etc).


================
Comment at: test/tools/llvm-objcopy/binary-input-error.test:1
+# RUN: printf abcd > %t.txt
+
----------------
printf -> echo


================
Comment at: test/tools/llvm-objcopy/binary-input-reconstitute.test:1
+# RUN: printf abcd > %t.txt
+
----------------
I'm not sure I understand what "reconstitute" means here. Perhaps rename the test? I assume the aim is for binary input and binary output? So maybe a better name is "binary-input-and-output.test".


================
Comment at: test/tools/llvm-objcopy/binary-input-reconstitute.test:10
+
+# CHECK: abcd
----------------
Technically, this only shows that the contents include "abcd", not that they are explicitly and solely "abcd". So this would also match a file containing other junk.

A better way might be to just compare the output files against the original input files using diff. Also, a common technique in our tests is to copy the input file to a separate file before running llvm-objcopy and then diff the input against the copy to show that the input hasn't been modified.


================
Comment at: test/tools/llvm-objcopy/binary-input.test:2
+# RUN: printf abcd > %t.x-txt
+# RUN: llvm-objcopy -I binary -B i386:x86-64 %t.x-txt %t.o
+# RUN: llvm-readobj -file-headers -sections -symbols %t.o | FileCheck %s
----------------
I'm not sure how this test is different from the others (apart from the symbols). Is i386:x86-64 just a synonym of x86-64? If so, I think this test can be folded into the others, like mentioned above.


================
Comment at: tools/llvm-objcopy/Object.cpp:66
 template <class ELFT> void ELFWriter<ELFT>::writePhdr(const Segment &Seg) {
-  using Elf_Phdr = typename ELFT::Phdr;
-
----------------
Could you do these little unrelated tidy-ups in a separate NFC commit, please. No need for a review. Essentially, I'd like this review to just be of the things required for your change.


================
Comment at: tools/llvm-objcopy/Object.cpp:621
+template <class ELFT> void BinaryELFBuilder<ELFT>::addData() {
+  auto AR = ArrayRef<uint8_t>(
+      reinterpret_cast<const uint8_t *>(MemBuf->getBufferStart()),
----------------
AR -> Data. "AR" is meaningless.

It saddens me that different parts of LLVM can't agree on whether to use chars or uint8_t.


================
Comment at: tools/llvm-objcopy/Object.cpp:624
+      MemBuf->getBufferSize());
+  auto &DS = Obj->addSection<Section>(AR);
+  DS.Name = ".data";
----------------
Rather than abbreviate this to something that is unclear, just call it "DataSection"


================
Comment at: tools/llvm-objcopy/Object.cpp:633
+                  [](char c) { return !isalnum(c); }, '_');
+  Twine S = Twine("_binary_") + SanitizedFilename;
+
----------------
I think it's considered bad form to have explicit local variable Twines that aren't just function parameters, if I remember, based on comments in a previous review (I might be mistaken though, so am happy to be proven wrong).

Please also name it something more descriptive like "Prefix".


================
Comment at: tools/llvm-objcopy/Object.cpp:642
+                    0, 0);
+  SymTab->addSymbol(S + "_end", STB_GLOBAL, STT_NOTYPE, &DS, DS.Size,
+                    STV_DEFAULT, 0, 0);
----------------
Should the end symbol have a size? That seems weird that it does. I might expect the start symbol to, but not the end symbol. What does GNU objcopy do?


================
Comment at: tools/llvm-objcopy/Object.cpp:1249
 
+template <class ELFT> void ELFWriter<ELFT>::initEhdr() {
+  auto &ElfHdr = Obj.ElfHdrSegment;
----------------
Could you rename this initEhdrSegment, since it doesn't actually initialise the ELF header itself.


================
Comment at: tools/llvm-objcopy/Object.cpp:1261
 template <class ELFT> void ELFWriter<ELFT>::assignOffsets() {
+  initEhdr();
   // We need a temporary list of segments that has a special order to it
----------------
I'm not convinced that the ELF header segment should be initialised inside something called "assignOffsets", since it's doing a lot more than assigning it an offset (which actually is always 0, so could be initialised at the same time as the rest of the header). It should probably be called outside this function.


================
Comment at: tools/llvm-objcopy/Object.h:402
   uint32_t Index;
-  StringRef Name;
+  std::string Name;
   uint32_t NameIndex;
----------------
Side note: I think a recent change for --prefix-symbols makes this part of the diff identical. I recommend rebasing when you next update the diff.


================
Comment at: tools/llvm-objcopy/Object.h:457
+                 uint64_t Value, uint8_t Visibility, uint16_t Shndx,
+                 uint64_t Sz);
   void prepareForLayout();
----------------
Don't abbreviate variable names: Sz -> Size.


================
Comment at: tools/llvm-objcopy/llvm-objcopy.cpp:133
   StringRef InputFormat;
-  StringRef BinaryArch;
+  MachineInfo BinaryArch;
 
----------------
It looks to me like we are grouping things in CopyConfig by type, so this should probably be moved to before (or after) the StringRef block.


================
Comment at: tools/llvm-objcopy/llvm-objcopy.cpp:302
+
+static const StringMap<MachineInfo> ArchMap{
+    {"aarch64", MI_AARCH64}, {"arm", MI_ARM},
----------------
I feel like this might read better if everything is constructed inline:
```
static const StringMap<MachineInfo> ArchMap{
    // Name,     EM value,   64bit, LittleEndian
    {"aarch64", {EM_AARCH64, true,  true}},
    {"arm",     {EM_ARM,     false, true}},
    /* More entries here */
};
```

I'd order the entries either alphabetically by name or numerically by their EM value. You could also put "headers" as shown to avoid needing to duplicate the 64/Endianness comment in each part.




Repository:
  rL LLVM

https://reviews.llvm.org/D50343





More information about the llvm-commits mailing list