[PATCH] D49016: [llvm-objdump] Add dynamic section printing to private-headers option

Mon Jul 16 03:53:09 PDT 2018

jhenderson added a comment.

In https://reviews.llvm.org/D49016#1163205, @paulsemel wrote:

> Little follow-up. I'm pretty sure it would require changes to `yaml2obj` to get this test written in yaml format.
>  Why not creating my own binary for the moment, and wait for someone (me if I have time ?) to implement this feature ?

I think this would be a reasonable way forward.

I'm still concerned by the amount of duplication between llvm-readobj's handling of this and llvm-objdump's. My impression is that only the printing and error handling sections should be distinct. The methods for looking up values etc should be in libObject.

I'm out of time now, so haven't looked at everything yet, but hopefully this will help you continue moving.

================
Comment at: include/llvm/Object/ELF.h:459
+
+  for (auto &Phdr : *ProgramHeadersOrError) {
+    if (Phdr.p_type == ELF::PT_DYNAMIC) {
----------------
Too much auto. Please use the explicit type here.

================
Comment at: include/llvm/Object/ELF.h:474
+
+    for (auto &Sec : *SectionsOrError) {
+      if (Sec.sh_type == ELF::SHT_DYNAMIC) {
----------------
Similar to above, please use the explicit type here.

================
Comment at: include/llvm/Object/ELF.h:476
+      if (Sec.sh_type == ELF::SHT_DYNAMIC) {
+        Dyn = reinterpret_cast<const Elf_Dyn *>(base() + Sec.sh_offset);
+        DynSecSize = Sec.sh_size;
----------------
You can use getSectionContentsAsArray here, I think.

================
Comment at: include/llvm/Object/ELF.h:488
+
+  if (numEntry == 0)
+    return createError("invalid empty dynamic section");
----------------
What about if `DynSecSize % sizeof(Elf_Dyn)` is non-zero? That should probably be an error too.

================
Comment at: include/llvm/Object/ELF.h:491
+
+  const Elf_Dyn *End = Dyn + (numEntry - 1);
+  if (End->d_tag != ELF::DT_NULL)
----------------
I would call this "Last" or similar, since End implies one past the end in iterator terminology. That being said, it might just be easier to do the makeArrayRef before this and use the methods on the ArrayRef instead to query the last element (specifically `back`).

================
Comment at: include/llvm/Object/ELF.h:478
+
+  return makeArrayRef(Dyn, numEntry);
+}
----------------
paulsemel wrote:
> jhenderson wrote:
> > I wonder if we should cache this result somehow, so that the majority of this function is only run once per file?
> It's up to you. For the moment, I've just mimicked the behavior of `program_headers` and `sections` :)
Okay, sure. I guess it's unlikely a program will need to do this multiple times, and it could be the thing to do the caching if its important. So no need to do it, I think.

================
Comment at: test/tools/llvm-objdump/private-headers-dynamic-section.test:1
+# RUN: llvm-objdump -p %p/Inputs/eh_frame.elf-mipsel | FileCheck %s
+
----------------
paulsemel wrote:
> jhenderson wrote:
> > paulsemel wrote:
> > > jhenderson wrote:
> > > > As with other tests, I'd rather not use a pre-canned binary here, especially one that isn't directly associated with this test. Please use yaml2obj.
> > > Ok.. I gave it a try. The problem is not about building the dynamic section (though it might lead to confusing tests), but more about all the needed parts we need to have to get a relevant test (Program headers, string tables etc..)
> > > What do you think ?
> > Does yaml2obj not allow you to pass in arbitrary values to the dynamic tags then?
> > 
> > Regardless, that complexity is there already, it's just hiding away in a pre-canned binary, which will make it harder to work out what to do with this test, if that binary is ever updated and/or this test ever starts to fail. I'd rather we be explicit, and not tie ourselves to that binary, if at all possible.
> Ok, I'll try to be more verbose on this. Here is what I would like to have in yaml2obj :
> 
> ```
> Dynamic:
>         DT_NEEDED            "libc.so"
>         DT_WHATEVER       0x12345678
> ```
> 
> And here is what we have :
> 
> ```
> Sections:
>   - Name:            .dynamic
>     Type:            SHT_DYNAMIC
>     Flags:           [ SHF_ALLOC, SHF_WRITE ]
>     Content:         0x0000001012002030000400
> ```
> And for each reference in the dynamic section, we need to trick to build a correct binary. (thus in our case, we need to build a correct program_headers, strtab etc..). Really, it'll just get a big mess and loss of time.
> 
> This feature might definitely be added in yaml2obj. I admit that I don't really feel like building that a complicated yaml file :)
Okay, those are all fair comments. I'm happy for a pre-canned binary, but I'd prefer it if it is dedicated to this feature.

================
Comment at: tools/llvm-objdump/ELFDump.cpp:28
+                                     uint64_t VirtBase) {
+  for (; Dyn->d_tag != ELF::DT_NULL; Dyn++) {
+    if (Dyn->d_tag == ELF::DT_STRTAB)
----------------
paulsemel wrote:
> jhenderson wrote:
> > paulsemel wrote:
> > > jhenderson wrote:
> > > > What happens if you have a malformed segment with no DT_NULL tag?
> > > > 
> > > > You should probably use the program header's p_filesz field as an additional check to avoid overrunning the end of the array. If we hit p_filesz and not DT_NULL, we should probably warn. Assuming you can get yaml2obj to create such a section, I'd also expect a test case for that.
> > > Ok.. I finally added a helper function in libObject, I think it is the right way to do it.
> > > Moreover, DT_NULL is needed by the ELF specification (it must end the Dynamic section), so let's issue an error.
> > Normally, I'd agree with you, but objdump's job is not to manipulate things, only to give information about things. That means that we should try to give as much information as possible, especially as in this case, a missing DT_NULL does not prevent us doing so. Emitting an error will presumably stop us doing any further dumps the user has requested. So I think a warning would be more appropriate.
> Is it even possible to issue warnings from libObject ? (as now this code has moved to the libobj)
I'd expect library code to emit "Error" or "Expected" returns, and then the consumer to decide what to do with them. You can see a similar thing I did earlier this year for the DWARF debug line parser, which now returns Expected/Error instances, and the consumers often convert these into warnings (see rL331971).

================
Comment at: tools/llvm-objdump/ELFDump.cpp:21

+#define ARRAY_SIZE(a) (sizeof(a) / sizeof(*(a)))
+
----------------
paulsemel wrote:
> jhenderson wrote:
> > I'd be surprised if this doesn't already exist in the LLVM support library somewhere...
> Tried to grep but was there. Do you know where this kind of macros are in LLVM ?
Well, I'd expect this to be an STL extension, since it's in C++17 (see https://en.cppreference.com/w/cpp/iterator/size), and I see that there is a size method in STLExtras.h. Try taking a look at that. I'm not sure it'll quite work, as I think it is implementing version 1 from the reference page only, but maybe you could try putting up a separate review for adding version 2 support?

================
Comment at: tools/llvm-objdump/ELFDump.cpp:51
+
+  return createError("DT_STRTAB entry not present");
+}
----------------
jhenderson wrote:
> Similar to my earlier comments, maybe it would be better to warn for this case, and fall back to just printing the hex value, like you suggested.
> 
> This probably goes for all errors that result in us being unable to load the dynamic string table. Also, where possible, we should test each of those errors.
Why not just propagate the Error back up the stack? You are already returning an Expected.

================
Comment at: tools/llvm-objdump/ELFDump.cpp:59
+  if (!ProgramHeaderOrError)
+    report_fatal_error(
+        errorToErrorCode(ProgramHeaderOrError.takeError()).message());
----------------
paulsemel wrote:
> jhenderson wrote:
> > This and others should be "report_error" available in llvm-objdump.h, not report_fatal_error. They also just take an Error, so you don't have to mess about with errorToErrorCode.
> Well, I can't access file name here, I just mimicked the behavior above, I think this is also the reason they did this no ?
> I'm probably missing something
I'm not sure which one this lined up with before, but it looks to me that you have two options in most cases, 1) return Error/Expected up the stack, so that the call site handles it, or 2) Pass the file name into the function.

Repository:
  rL LLVM

https://reviews.llvm.org/D49016