[PATCH] D50343: [llvm-objcopy] Add support for -I binary -B <arch>.

Jordan Rupprecht via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Aug 10 11:36:07 PDT 2018


rupprecht added inline comments.


================
Comment at: test/tools/llvm-objcopy/binary-input-aarch64.test:1
+# RUN: printf abcd > %t.x-txt
+# RUN: llvm-objcopy -I binary -B aarch64 %t.x-txt %t.o
----------------
jhenderson wrote:
> Use echo, not printf. I'm not sure if printf is implemented in the lit test system etc, but echo is the more common approach anyway.
Done -- I think printf was chosen to avoid the \n, but that can be done with "echo -n".

Looks like lit does support it, as it's used in test/tools/llvm-objcopy/add-gnu-debuglink.test, as well as some non-objcopy tests. I'll resist the temptation to fix those in this patch :)


================
Comment at: test/tools/llvm-objcopy/binary-input-aarch64.test:3
+# RUN: llvm-objcopy -I binary -B aarch64 %t.x-txt %t.o
+# RUN: llvm-readobj -file-headers -sections -symbols %t.o | FileCheck %s
+
----------------
jhenderson wrote:
> You dump the symbols here, but don't check any of them... you should probably do so!
Actually, the point of these arch-specific tests is just to check the file header. I removed -sections and -symbols from this test; those are checked by binary-input.test, which shouldn't depend much on the specific arch.


================
Comment at: test/tools/llvm-objcopy/binary-input-reconstitute.test:1
+# RUN: printf abcd > %t.txt
+
----------------
jhenderson wrote:
> I'm not sure I understand what "reconstitute" means here. Perhaps rename the test? I assume the aim is for binary input and binary output? So maybe a better name is "binary-input-and-output.test".
Yes, essentially it's an integration test that some payload doesn't change when it's converted back and forth (i.e. payload -> object file w/ payload shoved into .data -> payload). I'm not sure why I thought "reconstitute" would be a good word for that, so I'll take your name suggestion.


================
Comment at: test/tools/llvm-objcopy/binary-input.test:2
+# RUN: printf abcd > %t.x-txt
+# RUN: llvm-objcopy -I binary -B i386:x86-64 %t.x-txt %t.o
+# RUN: llvm-readobj -file-headers -sections -symbols %t.o | FileCheck %s
----------------
jhenderson wrote:
> I'm not sure how this test is different from the others (apart from the symbols). Is i386:x86-64 just a synonym of x86-64? If so, I think this test can be folded into the others, like mentioned above.
This test is focused on the generic stuff, namely section/symbols. I dropped the arch-specific stuff since that's covered elsewhere.


================
Comment at: tools/llvm-objcopy/Object.cpp:633
+                  [](char c) { return !isalnum(c); }, '_');
+  Twine S = Twine("_binary_") + SanitizedFilename;
+
----------------
jhenderson wrote:
> I think it's considered bad form to have explicit local variable Twines that aren't just function parameters, if I remember, based on comments in a previous review (I might be mistaken though, so am happy to be proven wrong).
> 
> Please also name it something more descriptive like "Prefix".
I originally had string, but changed to Twine based on Jake's suggestion here: https://reviews.llvm.org/D50343?id=159343#inline-442972

I'm trying to understand why it would be a bad idea to write this. Looking at http://llvm.org/docs/ProgrammersManual.html#dss-twine, the discouraged pattern is:

```
void foo(const Twine &T);
...
StringRef X = ...
unsigned i = ...
const Twine &Tmp = X + "." + Twine(i);
foo(Tmp);
```
Which is bad because Tmp is a ref, whereas the Twine here is a regular stack variable that persists past all the calls to addSymbol (which calls .str() on the input twine to save/own the symbol).

I renamed to "Prefix", but kept this as a Twine here, since I think this is safe. Otherwise, I think I'd have to do:
```
SymTab->addSymbol(Twine("_binary_") + SanitizedFilename + "_start", ...);
SymTab->addSymbol(Twine("_binary_") + SanitizedFilename + "_end", ...);
SymTab->addSymbol(Twine("_binary_") + SanitizedFilename + "_size", ...);
```


================
Comment at: tools/llvm-objcopy/Object.cpp:642
+                    0, 0);
+  SymTab->addSymbol(S + "_end", STB_GLOBAL, STT_NOTYPE, &DS, DS.Size,
+                    STV_DEFAULT, 0, 0);
----------------
jhenderson wrote:
> Should the end symbol have a size? That seems weird that it does. I might expect the start symbol to, but not the end symbol. What does GNU objcopy do?
Nope, this is Value, not Size, although it's confusing, especially because addSymbol has so many parameters. I commented the param names here to make this clear.
The sizes/values here match gnu objcopy.


================
Comment at: tools/llvm-objcopy/Object.h:457
+                 uint64_t Value, uint8_t Visibility, uint16_t Shndx,
+                 uint64_t Sz);
   void prepareForLayout();
----------------
jhenderson wrote:
> Don't abbreviate variable names: Sz -> Size.
Done, although this is existing code -- it's only showing up as a diff because clang-format put it on a new line after I changed StringRef->Twine...


================
Comment at: tools/llvm-objcopy/llvm-objcopy.cpp:133
   StringRef InputFormat;
-  StringRef BinaryArch;
+  MachineInfo BinaryArch;
 
----------------
jhenderson wrote:
> It looks to me like we are grouping things in CopyConfig by type, so this should probably be moved to before (or after) the StringRef block.
Done -- I added some comments to try to logically explain how this config is laid out, but I'm not very familiar with all of them... I'm open to suggestions here.


Repository:
  rL LLVM

https://reviews.llvm.org/D50343





More information about the llvm-commits mailing list