[llvm-dev] [RFC] Moving RELRO segment

Fri Aug 30 04:54:08 PDT 2019

> > Old: R RX RW(RELRO) RW
> > New: R(R+RELRO) RX RW;      R includes the traditional R part and the
> > RELRO part
> > Runtime (before relocation resolving): RW RX RW
> > Runtime (after relocation resolving): R RX RW
> >
> I actually see two ways of implementing this, and yes what you mentioned
> here is one of them:
>   1. Move RELRO to before RX, and merge it with R segment. This is what you
> said above.
>   2. Move RELRO to before RX, but keep it as a separate segment. This is
> what I implemented in my test.
> As I mentioned in my reply to Peter, option 1 would allow existing
> implementations to take advantage of this without any change. While I think
> this optimization is well worth it, if we go with option 1, the dynamic
> linkers won't have a choice to keep RO separate if they want to for
> whatever reason (e.g. less VM commit, finer granularity in VM maps, not
> wanting to have RO as writable even if for a short while.) So there's a
> trade-off to be made here (or an option to be added, even though we all
> want to avoid that if we can.)

Then you probably meant:

Old: R RX RW(RELRO) RW
New: R | RW(RELRO) RX RW
Runtime (before relocation resolving): R RW RX RW
Runtime (after relocation resolving): R R RX RW   ; the two R cannot be merged

| means a maxpagesize alignment. I am not sure whether you are going to add
it, because I still do not understand where the saving comes from.

If the alignment is added, the R and RW maps can get contiguous
(non-overlapping) p_offset ranges. However, the RW map is private dirty, so
it cannot be merged with adjacent maps, and it is not clear to me how this
saves kernel memory.

If the alignment is not added, the two maps will get overlapping p_offset
ranges.
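
To make the overlap concrete, here is a minimal sketch of how a loader
rounds each segment's file range out to page boundaries. The 4 KiB page
size, the offsets, and all helper names are my own illustrative assumptions,
not values from the prototype.

```cpp
#include <cstdint>

// Illustrative sketch only: page size, offsets, and names are assumptions.
constexpr uint64_t kPageSize = 0x1000;

constexpr uint64_t alignDown(uint64_t v, uint64_t a) { return v & ~(a - 1); }
constexpr uint64_t alignTo(uint64_t v, uint64_t a) { return (v + a - 1) & ~(a - 1); }

// File range [begin, end) that gets mapped for a segment, rounded out to
// page boundaries.
struct FileRange {
  uint64_t begin;
  uint64_t end;
};

constexpr FileRange mappedRange(uint64_t pOffset, uint64_t pFilesz) {
  return {alignDown(pOffset, kPageSize), alignTo(pOffset + pFilesz, kPageSize)};
}

constexpr bool overlaps(FileRange a, FileRange b) {
  return a.begin < b.end && b.begin < a.end;
}

// The R segment occupies file bytes [0x0, 0x1234).
constexpr FileRange r = mappedRange(0x0, 0x1234);
// Without '|': RELRO starts right after R, inside the same file page.
constexpr FileRange relroPacked = mappedRange(0x1234, 0x800);
// With '|': the RELRO start is pushed to the next page boundary.
constexpr FileRange relroAligned =
    mappedRange(alignTo(0x1234, kPageSize), 0x800);

static_assert(overlaps(r, relroPacked), "file page 0x1000 is mapped twice");
static_assert(!overlaps(r, relroAligned), "contiguous, not overlapping");
```

In the packed case both maps cover file page 0x1000; in the aligned case the
second map begins exactly where the first one ends.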

> My test showed an overall ~1MB decrease in kernel slab memory usage on
> vm_area_struct, with about 150 processes running. For this to work, I had
> to modify the dynamic linker:

Can you elaborate on how this decreases the kernel slab memory usage on
vm_area_struct?  References to source code are very welcome :) This is
contrary to my intuition because the second R is private dirty.  The number
of VMAs does not decrease.

>   1. The dynamic linker needs to make the read-only VMA briefly writable in
> order for it to have the same VM flags with the RELRO VMA so that they can
> be merged. Specifically VM_ACCOUNT is set when a VMA is made writable.

Same question. I hope you can give a bit more detail.
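
For anyone following along, my reading of the quoted step is roughly the
sketch below: flip an already-mapped read-only range to writable and back
with mprotect. The function name is hypothetical and this is not the
prototype's code. It only shows the permission round trip; whether the
kernel then merges the VMA with its RELRO neighbour is exactly what I am
asking about.

```cpp
#include <sys/mman.h>
#include <cstddef>

// Sketch only (hypothetical helper, not taken from any dynamic linker).
// Flips a page-aligned, already-mapped read-only range to read-write and
// back to read-only. Returns false if either mprotect call fails.
// VMA merging is NOT demonstrated or guaranteed by this code.
bool touchWritableThenReadOnly(void *addr, std::size_t len) {
  if (mprotect(addr, len, PROT_READ | PROT_WRITE) != 0)
    return false;
  return mprotect(addr, len, PROT_READ) == 0;
}
```

Comparing /proc/<pid>/maps before and after such a call would show whether
the number of maps actually changes.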

> > How to lay out the segments if --no-rosegment is specified?
> > Runtime (before relocation resolving): RX RW   ;      some people may be
> > concerned with writable stuff (relocated part) being made executable
> Indeed I think weakening in the security aspect may be a problem if we are
> to merge RELRO into RX. Keeping the old layout would be preferable IMHO.

This means the new layout conflicts with --no-rosegment.
In Driver.cpp, there should be a "... cannot be used together" error.
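
The shape of such a check could be something like the sketch below. The
struct, the field names, and in particular the --relro-before-text spelling
are hypothetical placeholders of mine; no such option exists today and this
is not lld code.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-ins: neither this struct nor the option spelling used
// in the message is taken from lld. They only illustrate the check.
struct LayoutOptions {
  bool relroBeforeText = false; // hypothetical switch for the new layout
  bool noRosegment = false;     // set when --no-rosegment is given
};

// Returns a diagnostic if the two options conflict, otherwise std::nullopt.
std::optional<std::string> checkLayoutOptions(const LayoutOptions &opts) {
  if (opts.relroBeforeText && opts.noRosegment)
    return "--relro-before-text and --no-rosegment cannot be used together";
  return std::nullopt;
}
```

If the new layout becomes the default rather than an option, the check would
instead need to pick the old layout whenever --no-rosegment is given.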

> > Another problem is that in the default -z relro -z lazy (-z now not
> > specified) layout, .got and .got.plt will be separated by potentially huge
> > code sections (e.g. .text). I'm still thinking what problems this layout
> > change may bring.
> >
> Not sure if this is the same issue as what you mentioned here, but I also
> see a comment in lld/ELF/Writer.cpp about how .rodata and .eh_frame should
> be as close to .text as possible due to fear of relocation overflow. If we
> go with option 2 above, the distance would have to be made larger. With
> option 1, we may still have some leeway in how to order sections within the
> merged RELRO segment.

For huge executables (>2G or 3G), it may cause relocation overflows
between .text and .rodata if other large sections like .dynsym and .dynstr
are placed in between.

I do not worry too much about overflows potentially caused by moving
PT_GNU_RELRO around.  PT_GNU_RELRO is usually less than 10% of the size of
the RX PT_LOAD.
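
For reference, the overflow condition I have in mind is the usual one for a
32-bit PC-relative relocation: the stored value S + A - P has to fit in a
signed 32-bit field. A minimal sketch, with a helper name and addresses that
are purely illustrative:

```cpp
#include <cstdint>
#include <limits>

// Illustrative helper (name is an assumption). A 32-bit PC-relative
// relocation stores S + A - P, which must be representable as int32_t.
constexpr bool fitsPcRel32(uint64_t symbol, int64_t addend, uint64_t place) {
  int64_t value =
      static_cast<int64_t>(symbol) + addend - static_cast<int64_t>(place);
  return value >= std::numeric_limits<int32_t>::min() &&
         value <= std::numeric_limits<int32_t>::max();
}

constexpr uint64_t GiB = 1ull << 30;

// Code that sits 1 GiB above the data it references is still in range.
static_assert(fitsPcRel32(4 * GiB, -4, 5 * GiB), "1 GiB backwards fits");
// With 3 GiB in between, the displacement no longer fits.
static_assert(!fitsPcRel32(4 * GiB, -4, 7 * GiB), "3 GiB is out of range");
```

Anything placed between the referencing code and its target adds its size
to that displacement.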

> This would be a somewhat tedious change (especially the part about having
> to update all the unit tests), but the benefit is pretty good, especially
> considering the kernel slab memory is not swappable/evictable. Please let
> me know your thoughts!

Definitely! I have prototyped this and found that ~260 tests will need their
addresses updated.