[llvm-dev] llvm-objcopy proposal

Sean Silva via llvm-dev llvm-dev at lists.llvm.org
Tue Jun 6 22:32:28 PDT 2017


On Tue, Jun 6, 2017 at 4:03 PM, Jake Ehrlich <jakehehrlich at google.com>
wrote:

> Fantastic! Thanks for all of the input! I'll be considering all of it
> going forward. The plan right now is just to worry about ELF executables
> and nothing else. I'm very sympathetic to the "llvm-objtool" change. If
> everyone is cool with it I'll change the name in the next CL to
> "llvm-objtool".
>
> To start out I implemented a very basic ELF64LE specific bit of code. I'm
> currently looking for reviewers on it. The phabricator link is here:
> https://reviews.llvm.org/D33964. I'd like to find people willing to
> review this as I work on this going forward as well. I haven't bothered
> worrying about it but I imagine that this will template fairly easily to
> support ELF32LE, ELF32BE, and ELF64BE.
>

Yep. If you haven't found it, take a look at our "ELFT" infrastructure
which should allow easily templating this. A really simple example is the
ELF part of yaml2obj (tools/yaml2obj/yaml2elf.cpp). LLD is another example
that uses ELFT to work across all 4 combinations.
ELFT is so easy to use that going forward you probably won't find yourself
needing to write an initial version for a specific {endian,is64bit}
combination.
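
To make the pattern concrete, here is a minimal standalone sketch of the
idea. It is not LLVM's actual ELFT definition: ElfKind, Endian, and readAddr
are made-up names for illustration only, and the real infrastructure is
considerably richer. It only shows how a single template parameter can carry
both byte order and word size, so one function body serves all 4
combinations:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for the ELFT idea; these are NOT LLVM's real types.
enum class Endian { Little, Big };

template <Endian E, bool Is64> struct ElfKind {
  static constexpr Endian Order = E;
  static constexpr bool Is64Bits = Is64;
  // Address-sized integer: 64 bits for ELF64, 32 bits for ELF32.
  using Addr =
      typename std::conditional<Is64, std::uint64_t, std::uint32_t>::type;
};

using ELF32LE = ElfKind<Endian::Little, false>;
using ELF32BE = ElfKind<Endian::Big, false>;
using ELF64LE = ElfKind<Endian::Little, true>;
using ELF64BE = ElfKind<Endian::Big, true>;

// Decode an address-sized field from raw file bytes. The byte order comes
// from ELFT, never from the host, so the result is the same on any machine.
template <class ELFT> typename ELFT::Addr readAddr(const unsigned char *P) {
  assert(P != nullptr);
  constexpr std::size_t N = sizeof(typename ELFT::Addr);
  typename ELFT::Addr V = 0;
  for (std::size_t I = 0; I < N; ++I) {
    // Visit bytes from most significant to least significant.
    std::size_t Idx = (ELFT::Order == Endian::Little) ? N - 1 - I : I;
    V = (V << 8) | P[Idx];
  }
  return V;
}
```

A caller would write something like readAddr<ELF64LE>(Buf) and get a
uint64_t back, while readAddr<ELF32BE>(Buf) yields a uint32_t decoded in
big-endian order.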

Also, one thing to keep in mind is that types like llvm::ELF::Elf64_Word
will have the host endianness, which may cause output differences across
different host platforms if they sneak into the output buffer. We do have
some big-endian bots, making sure that tool output is deterministic across
host endianness is a goal of LLVM tools, and such differences are
considered bugs. So you may find yourself wanting to use ELFT even in the
initial patch. By using ELFT everywhere, you make sure that things are
guaranteed correct. It's then fairly easy to remove it as needed.
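
As a rough illustration of how host endianness can leak into output, the
sketch below contrasts copying a value's in-memory bytes with emitting the
bytes in an explicitly chosen order. The function names are hypothetical and
this uses only the standard library, not LLVM's endian-aware types; it is
meant to show the failure mode, not how the patch should be written:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Non-deterministic across hosts: copies the host's in-memory layout, so a
// big-endian and a little-endian machine would append different bytes.
inline void appendHostOrder(std::vector<unsigned char> &Out, std::uint32_t V) {
  unsigned char Buf[sizeof V];
  std::memcpy(Buf, &V, sizeof V);
  Out.insert(Out.end(), Buf, Buf + sizeof V);
}

// Deterministic: the byte order is a property of the output file, chosen
// explicitly, so every host appends the same bytes. Taking the value as
// uint64_t lets one function serve both 32-bit and 64-bit fields.
inline void appendInt(std::vector<unsigned char> &Out, std::uint64_t V,
                      unsigned Width, bool LittleEndian) {
  assert(Width == 4 || Width == 8);
  for (unsigned I = 0; I < Width; ++I) {
    unsigned Shift = 8 * (LittleEndian ? I : Width - 1 - I);
    Out.push_back(static_cast<unsigned char>((V >> Shift) & 0xFF));
  }
}
```

Only appendInt gives the same output buffer regardless of which machine
runs the tool.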

That is exactly what happened in LLD/ELF. We started with everything ELFT
so there was no chance of bugs, then later on once the project was
stabilizing we detemplated it in many places to make code simpler when there
wasn't any risk of getting it wrong (for example, in many places you can
just use uint64_t instead of a type that is 32 or 64 bits depending on
ELFT). Also, at the point where we were removing the ELFT templating, we
already had tons of test coverage. AFAIK, thanks to ELFT and that
methodology, LLD/ELF has had zero (really, *zero*; I can't think of a
single one) bugs due to endianness/64bit-ness mixups, despite being a tool
that natively supports all 4 combinations simultaneously and operates on
endian/64bit dependent values read from object files on almost every single
line of code. It's very impressive, and big thanks to Michael for all the
packed_endian_specific_integral / ELFT infrastructure (now if only he would
get packed_endian_specific_integral into the C++ standard :P).

-- Sean Silva


>
>
> Would anyone be willing to let me set them as a reviewer going forward for
> future CLs?
>
> On Sun, Jun 4, 2017 at 6:07 PM Sean Silva via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> On Fri, Jun 2, 2017 at 3:52 PM, James Y Knight via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>>
>>>
>>> On Fri, Jun 2, 2017 at 2:34 PM, Ed Maste via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> One additional use case for you: converting from a binary to an ELF
>>>> object file
>>>> ```
>>>> objcopy -I binary -O elf64-x86-64 foo.bin foo.o
>>>> ```
>>>> This is sometimes used for embedding binary files for use by drivers
>>>> and such.
>>>>
>>>
>>> Yea, unfortunately the command-line you actually end up needing is more
>>> like:
>>>   objcopy -I binary -Bi386:x86-64 -Oelf64-x86-64 --rename-section .data=.rodata,alloc,load,readonly,data,contents --add-section .note.GNU-stack=/dev/null
>>>
>>> Having to manually invoke objcopy and know what to specify for the -B
>>> and -O options, and to know you need the .note.GNU-stack section, and how
>>> to move it into rodata...it's really all quite terrible. Nobody should have
>>> to do that. :(
>>>
>>> There's also the "-b binary" flag to GNU ld (both bfd and gold). But,
>>> you typically need to do a dedicated "link" for that. You do:
>>>   ld -r -b binary picture.jpg -o foo.o
>>> How does ld know what output format to use here? It's gotta just choose
>>> the default, which is kinda poor...or the user needs to know how to spell
>>> an "emulation" and output format...
>>>
>>
>> One way to hack around this might be to pass in one of the other object
>> files in your project, and have the output .o file replace it. Still pretty
>> hacky and brittle (and hard to integrate into a build system I would think).
>>
>>
>>>
>>> You could imagine trying to use -Wl to put it with the compile command,
>>> but what do you use to switch back to the normal object format?
>>>   gcc main.c -Wl,-b -Wl,binary -Wl,picture.jpg -Wl,-b -Wl,<<something to undo binary mode?>>
>>>
>>> So, anyways, while this is _possible_ with objcopy, it'd sure be nice if
>>> you never needed to use it for that...
>>>
>>
>> The other approaches I've seen or can imagine are:
>>
>> - Assembler `.incbin` directive (could use it from an inline asm).
>> - Use a "bin2h" type program which takes a binary and spits out a C file
>> with a giant uint8_t[] literal in it, then include that in one of your
>> normal .c files. In theory a C++11 raw string literal could bypass most of
>> the parsing overhead of a big array literal, but the people that care about
>> including a binary in their program probably don't care about that.
>>
>> -- Sean Silva
>>
>>
>>>
>>> (BTW, Apple ld actually has an option "-sectcreate SEGNAME SECTNAME
>>> INPUT_FILE", and the clang driver will pass it through to the linker.)
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>
>