[PATCH] D69093: [llvm-objcopy] --add-symbol: fix crash if SHT_SYMTAB does not exist

Fangrui Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 21 11:15:10 PDT 2019


MaskRay marked an inline comment as done.
MaskRay added inline comments.


================
Comment at: llvm/tools/llvm-objcopy/ELF/Object.cpp:1550-1552
+        // Prefer .strtab to .shstrtab.
+        if (Obj.SectionNames != &Sec)
+          break;
----------------
rupprecht wrote:
> rupprecht wrote:
> > MaskRay wrote:
> > > grimar wrote:
> > > > grimar wrote:
> > > > > MaskRay wrote:
> > > > > > abrachet wrote:
> > > > > > > rupprecht wrote:
> > > > > > > > MaskRay wrote:
> > > > > > > > > rupprecht wrote:
> > > > > > > > > > Should `.dynstr` also be avoided? Is it better to just check that the name is literally `.strtab`?
> > > > > > > > > .dynstr is avoided by `!(Sec.Flags & SHF_ALLOC)`. `.strtab` works as well but `Obj.SectionNames != &Sec` is more general. There are multiple ways to accomplish the samething. I just wanted to make the code less magical.
> > > > > > > > From: http://www.sco.com/developers/gabi/latest/ch4.sheader.html#special_sections
> > > > > > > > > .strtab
> > > > > > > > > This section holds strings, most commonly the strings that represent the names associated with symbol table entries. If the file has a loadable segment that includes the symbol string table, the section's attributes will include the SHF_ALLOC bit; otherwise, that bit will be off.
> > > > > > > > 
> > > > > > > > So, `.strtab` can also be `SHF_ALLOC`.
> > > > > > > For reference, we used to have a file in test/tools/llvm-objcopy/ELF/Inputs/ called alloc-symtab which had an SHF_ALLOC .strtab. It was removed in D65278. 
> > > > > > To confirm, I think `... && !(Sec.Flags & SHF_ALLOC)` is correct here.
> > > > > > For reference, we used to have a file in test/tools/llvm-objcopy/ELF/Inputs/ called alloc-symtab which had an SHF_ALLOC .strtab. It was removed in D65278.
> > > > > 
> > > > > Thanks for the reference. I completely forgot that saw an allocatable `.symtab` that time.
> > > > > 
> > > > > To clarify: in D65278 I've replaced a precompiled binary with a YAML description (with a SHF_ALLOC `.symtab`).
> > > > > The logic of tests shouldn't have been affected, because tests wanted to see the SHF_ALLOC symbol table (but not SHF_ALLOC string table)
> > > > > I believe.
> > > > > 
> > > > > If we need a test with a SHF_ALLOC `.strtab`, it should be possible to craft one with a yaml2obj too I think.
> > > > > (I haven't check, but I think all that needed is to define a string table section and a `Flag` property which should override the default).
> > > > > 
> > > > > So, .strtab can also be SHF_ALLOC.
> > > > 
> > > > > To confirm, I think ... && !(Sec.Flags & SHF_ALLOC) is correct here.
> > > > 
> > > > To summarize and clarify this a bit:
> > > > This part of code is called when there is no symbol table (`.symtab`).
> > > > Jordan mentioned that `.strtab` can be allocatable in according to specification.
> > > > 
> > > > So if I understand correctly, the question is: is it possible that
> > > > we have no symbol table, but have an allocatable `.strtab` in the object?
> > > > 
> > > > If it is, this code might either use the `.shstrtab` instead of existent `.strtab`,
> > > > or perform an attemp to create a new string table (if there is no `.shstrtab` to reuse) it seems.
> > > > Then it will create a `.symtab` using the string table either found or created instead of the existent allocatable '.strtab'.
> > > > Does not sound correct perhaps?
> > > > 
> > > Both SHF_ALLOC .strtab and non-SHF_ALLOC .strtab are possible, though I don't think ld or objcopy could create SHF_ALLOC .strtab.
> > > 
> > > Generally objcopy should not alter SHF_ALLOC sections. The code skips both .dynstr (SHF_ALLOC) and SHF_ALLOC .strtab.  SHF_ALLOC .strtab is not tested, though.
> > Yes, exactly. Repro:
> > 
> > ```
> > $ cat strtab.yaml
> > --- !ELF
> > FileHeader:
> >   Class:   ELFCLASS64
> >   Data:    ELFDATA2LSB
> >   Type:    ET_REL
> >   Machine: EM_X86_64
> > Sections:
> >   - Name: .strtab
> >     Type: SHT_STRTAB
> >     Flags: [ SHF_ALLOC ]
> > $ yaml2obj strtab.yaml > strtab.o
> > $ llvm-objcopy --add-symbol=abs1=1 strtab.o strtab2.o
> > $ llvm-readobj --sections strtab2.o
> > ...
> >   Section {
> >     Index: 1
> >     Name: .strtab (11)
> >     Type: SHT_STRTAB (0x3)
> >     Flags [ (0x2)
> >       SHF_ALLOC (0x2)
> >     ]
> >   }
> >   Section {
> >     Index: 2
> >     Name: .shstrtab (1)
> >     Type: SHT_STRTAB (0x3)
> >     Flags [ (0x0)
> >     ]
> >   }
> >   Section {
> >     Index: 3
> >     Name: .symtab (19)
> >     Type: SHT_SYMTAB (0x2)
> >     Flags [ (0x0)
> >     ]
> >     Link: 2  # <-- .shstrtab, not .strtab
> >   }
> > ...
> > ```
> > 
> > Running the same example through a recent GNU objcopy, it looks like it creates a second `.strtab` without the `SHF_ALLOC` bit and links to that:
> > 
> > ```
> > $ llvm-objcopy --add-symbol=abs1=1 strtab.o strtab2.o
> > $ llvm-readobj --sections strtab2.o
> >   Section {
> >     Index: 1
> >     Name: .strtab (9)
> >     Type: SHT_STRTAB (0x3)
> >     Flags [ (0x2)
> >       SHF_ALLOC (0x2)
> >     ]
> >   }
> >   Section {
> >     Index: 2
> >     Name: .symtab (1)
> >     Type: SHT_SYMTAB (0x2)
> >     Flags [ (0x0)
> >     ]
> >     Link: 3 # <-- non-SHF_ALLOC .strtab below, not the SHF_ALLOC .strtab above
> >   }
> >   Section {
> >     Index: 3
> >     Name: .strtab (9)
> >     Type: SHT_STRTAB (0x3)
> >     Flags [ (0x0)
> >     ]
> >   }
> >   Section {
> >     Index: 4
> >     Name: .shstrtab (17)
> >     Type: SHT_STRTAB (0x3)
> >     Flags [ (0x0)
> >     ]
> >   }
> > ```
> > 
> > This behavior seems reasonable to me vs re-using the .shstrtab for strings that aren't section headers.
> > 
> > > To confirm, I think ... && !(Sec.Flags & SHF_ALLOC) is correct here.
> > > 
> > BTW, I don't find it particularly constructive to simply state a position without any rationale.
> > Running the same example through a recent GNU objcopy, it looks like it creates a second .strtab without the SHF_ALLOC bit and links to that:
> > 
> > $ llvm-objcopy --add-symbol=abs1=1 strtab.o strtab2.o
> > $ llvm-readobj --sections strtab2.o
> > 
> This second example is actually what happens w/ GNU objcopy, not llvm-objcopy.
> 
Reusing `.shstrtab` is what `SEC3` tests. It may look a bit strange. `e_shstrndx` may refer to `.strtab` (clang reuses the section for both section names and symbol names). If we want to behave like GNU objcopy, we will have to special case the string `.strtab`. In any case, I don't think the (1) .symtab does not exist (2) --add-symbol= causes .symtab to be created   edge case needs more treatment for GNU compatibility.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69093/new/

https://reviews.llvm.org/D69093





More information about the llvm-commits mailing list