[PATCH] D122069: [Object] Add binary format for bundling offloading metadata

Joseph Huber via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Mar 29 08:06:33 PDT 2022


jhuber6 added a comment.

In D122069#3413960 <https://reviews.llvm.org/D122069#3413960>, @JonChesterfield wrote:

> Added some reviewers. I'd much prefer this used an existing binary format, DIY is prone to errors and maintenance hassles down the road. Don't care as much about which format as about it being one with an existing, tested implementation and ideally existing inspection tools.

I'm not married to the idea, and worst case scenario we can replace it with something else in the future. I'd like to get something like this working so I can finally make the new driver the default, so I'll just outline the problem and some of the potential solutions.

The issue is that we need to store some metadata along with a binary image so we know how to handle it at a later date. Currently we shove this in the name of the ELF section and parse it there, but that's not idea because more metadata will be needed in the future and it prevents us from doing things like relocatable linking or other merging (e.g. we want to store both an sm_70 and sm_35 image in the same file). So we want a binary format that can store some strings, other data. I'll just go over a brief overview of the options:

YAML / JSON
-----------

+ Ubiquitous simple text format for encoding object data
+ In-tree implementation
x Requires encoding to store the device image
x Will need binary padding and size calculation to make sure these merge properly in a section

Protocol Buffers
----------------

+ Well-tested
+ Implicit appending, no additional code required to handle merged sections.
x Out-of-tree, requires external dependencies to build and maintain in the future. No other use in Clang / LLVM

ELF
---

+ Ubiquitous tooling. Object extraction and copying for free
+ Simple key-value storage 
x Difficult to calculate size, will need to figure out the size of the buffer and write it in later so we can read multiple appended sections.
x Difficult to create. The Elf object writer is completely tied to the MC backend. YAML2ELF would require base64 or similar again

MSGPACK
-------

+ Exists in-tree in some form and well tested
+ Supports key-value storage
x Doesn't know its size, will need to add padding and a size field

Custom format
-------------

+ Relatively simple implementation that solves this specific problem
x No existing tooling support, more error prone

I decided to go with the custom format because it was the simplest to get working for a proof of concept to solve the problem I was immediately facing. I think ELF would be the next best if someones could suggest a way to write the data and get the size. MSGPACK seems to be @JonChesterfield's preferred method because it has a lot of use at AMD, it would work as long as we can figure out its size and get alignment. Let me know what suggestions you have because I really want to move forward with this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122069/new/

https://reviews.llvm.org/D122069



More information about the cfe-commits mailing list