[PATCH] D126904: [llvm-objdump] Add support for dumping embedded offloading data

Joseph Huber via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 3 13:04:18 PDT 2022


jhuber6 added inline comments.


================
Comment at: llvm/tools/llvm-objdump/OffloadDump.cpp:38
+
+static Error extractAllBinaries(const OffloadBinary *OB) {
+  uint64_t Offset = 0;
----------------
tra wrote:
> jhuber6 wrote:
> > tra wrote:
> > > `extract` is not quite what we're doing here.  `printAllBinaries` ? But then we already have `printOffloadBinary` below, so it's a bit confusing.
> > > 
> > > IIUIC the hierarchy looks roughly like this:
> > > -  Object file
> > >    - N offload sections
> > >       - M offload binaries
> > > 
> > > So, this function prints content of one section (i.e. a collection of offload binaries), while the function below prints contents of the object file (i.e. a collection of offload sections) and the names currently do not reflect that. 
> > > 
> > > 
> > How about this?
> > ```
> > printOffloadBinary()
> > printOffloadBinaries()
> > dumpOffloadSections()
> > dumpOffloadBinaries()
> > ```
> What's the distinction between dump and print here? It would do for the time being, I guess.
> 
> If we were to implement iterators over sections and offload binaries within, we would not need distinct names for them and then this whole code would look like this:
> 
> ```
> printOffloadBinaries(const ObjectFile *O) {
>   llvm::for_each(O->sections(), [](auto Section) {
>     llvm::for_each(Section->binaries(), printBinary);
>   })
> }
> ```
> That would be expressive enough without having to split multiple levels of iteration into different functions, along with the associated hassle of having to come up with adequate names for them. :-)
> 
> Error handling will likely throw a monkey wrench into this neat ideal scenario, so I'm not sure if it's worth it.
> I think general purpose iterators over sections and offload binaries would come handy when we get to extend the functionality.
> It does not need to be done in this patch.
> 
> 
I just used dump since it's more in-line with the vocabulary of the rest of the functions in `llvm-objdump`. Also in the future if we want to print out these sections we'd just have a flag somewhere that if enabled dumps the contents of the image rather than just the metadata. So that's the difference between just dump and print in my mind.

As you mentioned we can get errors here so we'd need to have some kind of iterator over `Expected`values, But I'm not sure if we would extend the section class for this since this binary format is just some blob that just so happens to be contained in a section I'd say.


================
Comment at: llvm/tools/llvm-objdump/OffloadDump.cpp:78
+
+/// Print the contents of a single offload binary file \p OB.
+void llvm::printOffloadBinary(const OffloadBinary *OB) {
----------------
tra wrote:
> jhuber6 wrote:
> > tra wrote:
> > > I don't think the 'single' part of this assertion is true. AFAICT, `extractAllBinaries` will happily print all subsequent binaries if it finds them in the buffer. I think this should call `printBinary` instead.
> > Yeah, I meant it more like to print on the single file that was already extracted or something. But it can definitely contain multiple. The reason I chose this method is because I wanted something that worked even if these sections were concatenated through a relocatable link or something. So whenever we parse one of these we just check the sizes to make sure there's not another one concatenated to it. I can make the comment less confusing.
> I think the root of the problem here is that we're treating `OffloadBinary` as both the pointer to the binary itself and as a pointer to collection of such binaries.
> 
> I think it's not a good API -- extractAllBinaries gets to look under the hood of the implmentation -- check if containing buffer has extra space beyond the OffloadBinary it's been passed. What if the user places something else in the memory buffer right behind the OffloadBinary object user passed to printOffloadBinary ? They would be within their rights to do so as the function would be expected to care about the content of the `*OB` only.
> 
> I think we should be a bit more pedantic about such things. If we expect to operate on a collection, the API should reflect that. E.g. use SmallVector<OffloadBinary*>. 
> I think implementing `ObjectFile::offload_sections()` and `OffloadSection::offload_binaries()` would help both here and above. Or, possibly, just `ObjectFile::offload_binaries()``if we don't need to care about how binaries are stored in the object file and just wanr offload binaries themselves.
> 
So the problem is we don't know how many of these are in here until we parse it. This requires getting the `size` field within the `OffloadBinary`. So even if we abstracted it to this iterator, it would still need some parsing like this behind the scenes. I could have made the binary format contain many within a single binary image, but like I said I wanted this to be stable under arbitrary concatenation by the linker. I'm not sure if we could have a different API considering the parsing requirements.

This can definitely be problematic, depending on usage. I'm assuming if a user initialized an object on a memory buffer containing a bunch of junk it would probably be fine and just stop once the file is fully parsed. We could probably just ignore a parsing error, basically just stop tryingto read things if we don't catch the magic bytes or there's not enough space left over, but that's probably not ideal.

It's definitely a little obtuse, but I'm not sure if there's a good way to make it work better considering how we parse them.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126904/new/

https://reviews.llvm.org/D126904



More information about the llvm-commits mailing list