[PATCH] D69093: [llvm-objcopy] --add-symbol: fix crash if SHT_SYMTAB does not exist

Jordan Rupprecht via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 21 10:36:53 PDT 2019


rupprecht added inline comments.


================
Comment at: llvm/tools/llvm-objcopy/ELF/Object.cpp:1550-1552
+        // Prefer .strtab to .shstrtab.
+        if (Obj.SectionNames != &Sec)
+          break;
----------------
MaskRay wrote:
> grimar wrote:
> > grimar wrote:
> > > MaskRay wrote:
> > > > abrachet wrote:
> > > > > rupprecht wrote:
> > > > > > MaskRay wrote:
> > > > > > > rupprecht wrote:
> > > > > > > > Should `.dynstr` also be avoided? Is it better to just check that the name is literally `.strtab`?
> > > > > > > .dynstr is avoided by `!(Sec.Flags & SHF_ALLOC)`. Checking for `.strtab` by name works as well, but `Obj.SectionNames != &Sec` is more general. There are multiple ways to accomplish the same thing. I just wanted to make the code less magical.
> > > > > > From: http://www.sco.com/developers/gabi/latest/ch4.sheader.html#special_sections
> > > > > > > .strtab
> > > > > > > This section holds strings, most commonly the strings that represent the names associated with symbol table entries. If the file has a loadable segment that includes the symbol string table, the section's attributes will include the SHF_ALLOC bit; otherwise, that bit will be off.
> > > > > > 
> > > > > > So, `.strtab` can also be `SHF_ALLOC`.
> > > > > For reference, we used to have a file in test/tools/llvm-objcopy/ELF/Inputs/ called alloc-symtab which had an SHF_ALLOC .strtab. It was removed in D65278. 
> > > > To confirm, I think `... && !(Sec.Flags & SHF_ALLOC)` is correct here.
> > > > For reference, we used to have a file in test/tools/llvm-objcopy/ELF/Inputs/ called alloc-symtab which had an SHF_ALLOC .strtab. It was removed in D65278.
> > > 
> > > Thanks for the reference. I completely forgot that I saw an allocatable `.symtab` back then.
> > > 
> > > To clarify: in D65278 I've replaced a precompiled binary with a YAML description (with a SHF_ALLOC `.symtab`).
> > > The logic of tests shouldn't have been affected, because tests wanted to see the SHF_ALLOC symbol table (but not SHF_ALLOC string table)
> > > I believe.
> > > 
> > > If we need a test with a SHF_ALLOC `.strtab`, it should be possible to craft one with yaml2obj too, I think.
> > > (I haven't checked, but I think all that is needed is to define a string table section with a `Flags` property, which should override the default.)
> > > 
> > > So, .strtab can also be SHF_ALLOC.
> > 
> > > To confirm, I think ... && !(Sec.Flags & SHF_ALLOC) is correct here.
> > 
> > To summarize and clarify this a bit:
> > This part of code is called when there is no symbol table (`.symtab`).
> > Jordan mentioned that `.strtab` can be allocatable according to the specification.
> > 
> > So if I understand correctly, the question is: is it possible that
> > we have no symbol table, but have an allocatable `.strtab` in the object?
> > 
> > If it is, this code might either use `.shstrtab` instead of the existing `.strtab`,
> > or attempt to create a new string table (if there is no `.shstrtab` to reuse), it seems.
> > Then it will create a `.symtab` using the string table it found or created, instead of the existing allocatable `.strtab`.
> > That does not sound correct, perhaps?
> > 
> Both SHF_ALLOC .strtab and non-SHF_ALLOC .strtab are possible, though I don't think ld or objcopy could create SHF_ALLOC .strtab.
> 
> Generally objcopy should not alter SHF_ALLOC sections. The code skips both .dynstr (SHF_ALLOC) and SHF_ALLOC .strtab.  SHF_ALLOC .strtab is not tested, though.
Yes, exactly. Repro:

```
$ cat strtab.yaml
--- !ELF
FileHeader:
  Class:   ELFCLASS64
  Data:    ELFDATA2LSB
  Type:    ET_REL
  Machine: EM_X86_64
Sections:
  - Name: .strtab
    Type: SHT_STRTAB
    Flags: [ SHF_ALLOC ]
$ yaml2obj strtab.yaml > strtab.o
$ llvm-objcopy --add-symbol=abs1=1 strtab.o strtab2.o
$ llvm-readobj --sections strtab2.o
...
  Section {
    Index: 1
    Name: .strtab (11)
    Type: SHT_STRTAB (0x3)
    Flags [ (0x2)
      SHF_ALLOC (0x2)
    ]
  }
  Section {
    Index: 2
    Name: .shstrtab (1)
    Type: SHT_STRTAB (0x3)
    Flags [ (0x0)
    ]
  }
  Section {
    Index: 3
    Name: .symtab (19)
    Type: SHT_SYMTAB (0x2)
    Flags [ (0x0)
    ]
    Link: 2  # <-- .shstrtab, not .strtab
  }
...
```

Running the same example through a recent GNU objcopy, it looks like it creates a second `.strtab` without the `SHF_ALLOC` bit and links to that:

```
$ objcopy --add-symbol=abs1=1 strtab.o strtab2.o
$ llvm-readobj --sections strtab2.o
  Section {
    Index: 1
    Name: .strtab (9)
    Type: SHT_STRTAB (0x3)
    Flags [ (0x2)
      SHF_ALLOC (0x2)
    ]
  }
  Section {
    Index: 2
    Name: .symtab (1)
    Type: SHT_SYMTAB (0x2)
    Flags [ (0x0)
    ]
    Link: 3 # <-- non-SHF_ALLOC .strtab below, not the SHF_ALLOC .strtab above
  }
  Section {
    Index: 3
    Name: .strtab (9)
    Type: SHT_STRTAB (0x3)
    Flags [ (0x0)
    ]
  }
  Section {
    Index: 4
    Name: .shstrtab (17)
    Type: SHT_STRTAB (0x3)
    Flags [ (0x0)
    ]
  }
```

This behavior seems more reasonable to me than reusing `.shstrtab` for strings that aren't section names.
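
To make the selection rule concrete, here is a minimal, self-contained sketch. It is not the code in Object.cpp: `Section`, `Object`, and `findSymbolStringTable` are simplified, hypothetical stand-ins, and the rule it encodes is the GNU-like behavior described above (never pick an SHF_ALLOC string table and never pick the section-name table), rather than the "prefer .strtab, fall back to .shstrtab" logic in the current patch.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for llvm-objcopy's section model.
// The two constants carry the standard ELF values.
constexpr uint32_t SHT_STRTAB = 3;
constexpr uint64_t SHF_ALLOC = 0x2;

struct Section {
  std::string Name;
  uint32_t Type;
  uint64_t Flags;
};

struct Object {
  std::vector<Section> Sections;
  // Points at the section-name string table (usually .shstrtab).
  const Section *SectionNames = nullptr;
};

// Returns an existing string table that a newly created .symtab could link
// to, or nullptr if the caller should create a fresh non-SHF_ALLOC .strtab.
const Section *findSymbolStringTable(const Object &Obj) {
  for (const Section &Sec : Obj.Sections) {
    if (Sec.Type != SHT_STRTAB)
      continue;
    // Skips .dynstr as well as an SHF_ALLOC .strtab.
    if (Sec.Flags & SHF_ALLOC)
      continue;
    // Do not reuse the section-name table for symbol names.
    if (&Sec == Obj.SectionNames)
      continue;
    return &Sec;
  }
  return nullptr;
}
```

With the sections from the repro above (an SHF_ALLOC `.strtab` plus `.shstrtab`), this sketch returns nullptr, so the caller would add a new `.strtab` instead of linking `.symtab` to `.shstrtab`.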

> To confirm, I think ... && !(Sec.Flags & SHF_ALLOC) is correct here.
> 
BTW, I don't find it particularly constructive to simply state a position without any rationale.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69093/new/

https://reviews.llvm.org/D69093
