[llvm-dev] Reducing code size of Position Independent Executables (PIE) by shrinking the size of dynamic relocations section

Mon Dec 11 10:41:08 PST 2017

On Sat, Dec 9, 2017 at 3:06 PM, Florian Weimer <fw at deneb.enyo.de> wrote:
> * Rahul Chaudhry via gnu-gabi:
>
>> The encoding used is a simple combination of delta-encoding and a
>> bitmap of offsets. The section consists of 64-bit entries: higher
>> 8-bits contain delta since last offset, and lower 56-bits contain a
>> bitmap for which words to apply the relocation to. This is best
>> described by showing the code for decoding the section:
>>
>> typedef struct
>> {
>>   Elf64_Xword  r_data;  /* jump and bitmap for relative relocations */
>> } Elf64_Relrz;
>>
>> #define ELF64_R_JUMP(val)    ((val) >> 56)
>> #define ELF64_R_BITS(val)    ((val) & 0xffffffffffffff)
>>
>> #ifdef DO_RELRZ
>>   {
>>     ElfW(Addr) offset = 0;
>>     for (; relative < end; ++relative)
>>       {
>>         ElfW(Addr) jump = ELFW(R_JUMP) (relative->r_data);
>>         ElfW(Addr) bits = ELFW(R_BITS) (relative->r_data);
>>         offset += jump * sizeof(ElfW(Addr));
>>         if (jump == 0)
>>           {
>>             ++relative;
>>             offset = relative->r_data;
>>           }
>>         ElfW(Addr) r_offset = offset;
>>         for (; bits != 0; bits >>= 1)
>>           {
>>             if ((bits&1) != 0)
>>               elf_machine_relrz_relative (l_addr, (void *) (l_addr + r_offset));
>>             r_offset += sizeof(ElfW(Addr));
>>           }
>>       }
>>   }
>> #endif
>
> That data-dependent “if ((bits&1) != 0)” branch looks a bit nasty.
>
> Have you investigated whether some sort of RLE-style encoding would be
> beneficial? If there are blocks of relative relocations, it might even
> be possible to use vector instructions to process them (although more
> than four relocations at a time are probably not achievable in a
> power-efficient manner on current x86-64).

Yes, we originally investigated an RLE-style encoding, but the key
insight that led us to the proposed encoding is the following: the
offsets that hold the relocations are close together but not
necessarily contiguous.  We experimented with an encoding strategy
where we would store the initial offset and the number of consecutive
words after it that contained dynamic relocations.  This gave us good
compression numbers, but the proposed scheme did considerably better.
I will let Rahul say more, as he ran quite a few experiments with
different strategies.
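
To make the comparison concrete, here is a rough sketch in C, not the
actual linker or loader code.  The function names (runlength_words,
relrz_encode, relrz_decode) are made up for illustration, the greedy
encoder is just one possible way to produce the format, and charging
two 64-bit words per run for the run-length scheme is an assumption
about how such a scheme might be laid out.  The decoder follows the
loop quoted above, except that it collects offsets instead of applying
relocations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { WORD = 8, BITMAP_BITS = 56, MAX_JUMP = 255 };

/* Assumed run-length layout: one (start offset, word count) pair per
   contiguous run.  Returns the number of 64-bit words needed. */
size_t runlength_words(const uint64_t *offs, size_t n)
{
    size_t words = 0;
    for (size_t i = 0; i < n; ) {
        size_t j = i + 1;
        while (j < n && offs[j] == offs[j - 1] + WORD)
            ++j;                      /* extend the contiguous run */
        words += 2;                   /* start offset + count */
        i = j;
    }
    return words;
}

/* Hypothetical greedy encoder for the jump/bitmap format.  offs must be
   strictly increasing multiples of WORD; out needs room for 2 * n
   entries.  Returns the number of 64-bit entries written. */
size_t relrz_encode(const uint64_t *offs, size_t n, uint64_t *out)
{
    size_t k = 0;
    uint64_t base = 0;                /* mirrors "offset" in the decoder */
    for (size_t i = 0; i < n; ) {
        uint64_t start = offs[i];
        assert(start >= base);
        uint64_t jump = (start - base) / WORD;
        uint64_t bits = 0;
        /* Fold every offset within 56 words of start into the bitmap. */
        while (i < n && offs[i] - start < (uint64_t)BITMAP_BITS * WORD) {
            assert(offs[i] % WORD == 0);
            bits |= (uint64_t)1 << ((offs[i] - start) / WORD);
            ++i;
        }
        if (jump >= 1 && jump <= MAX_JUMP) {
            out[k++] = (jump << 56) | bits;
        } else {
            out[k++] = bits;          /* jump == 0 marks the escape */
            out[k++] = start;         /* next entry is the full offset */
        }
        base = start;
    }
    return k;
}

/* Decoder following the quoted loop; stores each decoded offset in offs
   and returns how many were produced. */
size_t relrz_decode(const uint64_t *in, size_t k, uint64_t *offs)
{
    size_t n = 0;
    uint64_t offset = 0;
    for (size_t e = 0; e < k; ++e) {
        uint64_t jump = in[e] >> 56;
        uint64_t bits = in[e] & 0xffffffffffffffULL;
        offset += jump * WORD;
        if (jump == 0)
            offset = in[++e];
        uint64_t r_offset = offset;
        for (; bits != 0; bits >>= 1) {
            if (bits & 1)
                offs[n++] = r_offset;
            r_offset += WORD;
        }
    }
    return n;
}
```

For the six offsets 0x1000, 0x1008, 0x1018, 0x1020, 0x1030 and 0x1400,
this sketch needs 8 words under the assumed run-length layout (four
runs) but only 3 entries in the jump/bitmap format, because the
one-word gaps cost nothing inside a bitmap.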

Thanks
Sri