[PATCH] D64200: [ELF] Allow placing non-string SHF_MERGE sections with different alignments into the same MergeSyntheticSection

Thu Jul 4 06:14:55 PDT 2019

MaskRay added a comment.

In D64200#1570259 <https://reviews.llvm.org/D64200#1570259>, @peter.smith wrote:

> In D64200#1570169 <https://reviews.llvm.org/D64200#1570169>, @MaskRay wrote:
>
> > > That would (1) waste space and (2) create unaligned sections when tail merge (-O2) is enabled. MOVAPS on such unaligned strings can raise SIGSEGV.
> >
> > I haven't figured out the root cause of (2) yet.
> >
> > (1) justifies a change, otherwise we will get something like `a.......b.......c.......`, which wastes space.
>
>
> I hadn't realised that the change would also affect SHF_STRINGS sections as well, IIRC sh_entsize means the size of the character so it can make sense for the alignment to be higher than the sh_entsize. This is bringing back memories. Many years ago we had a similar problem mixing objects from Arm's proprietary compiler and arm-none-eabi-gcc. The ARM compiler only used SHF_STRINGS with sh_align 1 with the code using byte by byte accesses to those strings. In arm-none-eabi-gcc the SHF_STRINGS has sh_align 4 and copied strings using words instead of bytes. Within the SHF_STRINGS section the strings were padded to a 4-byte boundary. On the ARM processors of the time didn't support unaligned accesses so when the strings were merged without taking into account alignment padding the resulting binary crashed.

D63432 <https://reviews.llvm.org/D63432> had the very issue. A MergeTailSection was created with alignment 1. In its ctor the StringTableBuilder was initialized with alignment 1. Then, the MergeTailSection's alignment was updated to 16 but StringTableBuilder::Alignment was not updated. StringTableBuilder actually can handle strings with alignment greater than 1 (though some strings may not be mergeable, e.g. "abc" and "bc" can be merged if sh_addralign is 1 but can't if sh_addralign is 2).

> In that particular case we permitted strings from an equal to or lower alignment to be merged, but not a higher to a lower alignment. Within that higher aligned string section we needed to maintain alignment padding to ensure each string started at an aligned boundary. This might be more difficult to do in LLD.

Agreed.

> I think for now excluding SHF_STRINGS from different alignment merging is the right thing to do. If we want to enable it we'll need to ensure each string starts on a correctly aligned boundary to ensure correctness.

I agree. `tail-merge-string-align2.s` demonstrates that different alignments on SHF_STRINGS can create too much padding.

@davezarzycki  Thank you for testing!

Repository:
  rLLD LLVM Linker

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64200/new/

https://reviews.llvm.org/D64200