[llvm-dev] [RFC] Moving RELRO segment

Tue Sep 3 10:40:43 PDT 2019

On Fri, Aug 30, 2019 at 4:54 AM Fāng-ruì Sòng <maskray at google.com> wrote:

> > > Old: R RX RW(RELRO) RW
> > > New: R(R+RELRO) RX RW;      R includes the traditional R part and the
> > > RELRO part
> > > Runtime (before relocation resolving): RW RX RW
> > > Runtime (after relocation resolving): R RX RW
> > >
> > I actually see two ways of implementing this, and yes what you mentioned
> > here is one of them:
> >   1. Move RELRO to before RX, and merge it with R segment. This is what
> > you said above.
> >   2. Move RELRO to before RX, but keep it as a separate segment. This is
> > what I implemented in my test.
> > As I mentioned in my reply to Peter, option 1 would allow existing
> > implementations to take advantage of this without any change. While I
> > think this optimization is well worth it, if we go with option 1, the
> > dynamic linkers won't have a choice to keep RO separate if they want to
> > for whatever reason (e.g. less VM commit, finer granularity in VM maps,
> > not wanting to have RO as writable even if for a short while). So
> > there's a trade-off to be made here (or an option to be added, even
> > though we all want to avoid that if we can).
>
> Then you probably meant:
>
> Old: R RX RW(RELRO) RW
> New: R | RW(RELRO) RX RW
> Runtime (before relocation resolving): R RW RX RW
> Runtime (after relocation resolving): R R RX RW   ; the two R cannot be
> merged
>
> | means a maxpagesize alignment. I am not sure whether you are going to
> add it because I still do not understand where the saving comes from.
>

> If the alignment is added, the R and RW maps can get contiguous
> (non-overlapping) p_offset ranges. However, the RW map is private dirty
> and cannot be merged with adjacent maps, so I am not clear how it can save
> kernel memory.
>

My understanding (and my test result shows so) is that two VMAs can be
merged even when one of them contains dirty pages. As far as I can tell
from reading vma_merge() in mm/mmap.c in the Linux kernel, there's nothing
preventing the merging of consecutively mmapped regions in that case. That
said, we may not care about this case too much if we decide that this
change should be put behind a flag, because in that case, I think we can
just go with option 1.
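
To make the merging rule concrete, here is a toy Python model (not kernel
code) of the condition as I read it: two neighbouring VMAs can be folded
into one when they are address-contiguous, carry identical flags, and map
contiguous offsets of the same file. The names Vma, can_merge and
merge_adjacent, and all addresses, are made up for illustration; the real
vma_merge() checks more than this (for example anon_vma compatibility).

```python
from dataclasses import dataclass


@dataclass
class Vma:
    start: int         # first address of the mapping
    end: int           # one past the last address
    flags: frozenset   # e.g. frozenset("r"), frozenset("rx")
    file: str          # backing file name
    offset: int        # file offset that `start` maps


def can_merge(a, b):
    # Simplified rule: contiguous addresses, same flags, same file,
    # and file offsets that continue where the previous VMA ended.
    return (a.end == b.start and a.flags == b.flags and a.file == b.file
            and a.offset + (a.end - a.start) == b.offset)


def merge_adjacent(vmas):
    out = []
    for v in sorted(vmas, key=lambda v: v.start):
        if out and can_merge(out[-1], v):
            prev = out.pop()
            v = Vma(prev.start, v.end, prev.flags, prev.file, prev.offset)
        out.append(v)
    return out


R, RX, RW = frozenset("r"), frozenset("rx"), frozenset("rw")

# State after relocation resolving, i.e. RELRO has already been made read-only.
# Old layout: R RX RELRO RW (RELRO is not next to R, so nothing merges).
old_layout = [
    Vma(0x0000, 0x2000, R, "a.out", 0x0000),
    Vma(0x2000, 0x4000, RX, "a.out", 0x2000),
    Vma(0x4000, 0x5000, R, "a.out", 0x4000),   # RELRO
    Vma(0x5000, 0x6000, RW, "a.out", 0x5000),
]
# New layout: R RELRO RX RW (RELRO sits right after R).
new_layout = [
    Vma(0x0000, 0x2000, R, "a.out", 0x0000),
    Vma(0x2000, 0x3000, R, "a.out", 0x2000),   # RELRO
    Vma(0x3000, 0x5000, RX, "a.out", 0x3000),
    Vma(0x5000, 0x6000, RW, "a.out", 0x5000),
]

old_count = len(merge_adjacent(old_layout))
new_count = len(merge_adjacent(new_layout))
```

In this model the old layout keeps four VMAs while the new one collapses
to three, which is the per-DSO saving being discussed. The model ignores
VM_ACCOUNT, which matters in practice.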

>
> If the alignment is not added, the two maps will get overlapping p_offset
> ranges.
>
> > My test showed an overall ~1MB decrease in kernel slab memory usage on
> > vm_area_struct, with about 150 processes running. For this to work, I had
> > to modify the dynamic linker:
>
> Can you elaborate on how this decreases the kernel slab memory usage on
> vm_area_struct?  References to source code are very welcome :) This is
> contrary to my intuition because the second R is private dirty.  The
> number of VMAs does not decrease.
>
In mm/mprotect.c, merging is done in mprotect_fixup(), which calls
vma_merge() to do the actual work. In the same function you can also see
that the VM_ACCOUNT flag is set when a VMA is made writable, which is why I
had to modify the dynamic linker to make the R segment temporarily writable
so that it is mergeable with RELRO (they need to have the same flags to be
merged). Again, IMO all these somewhat indirect manipulations of VMAs were
because I was hoping to give the dynamic linker an option to choose whether
to take advantage of this or not. If, for any reason, we put this behind a
build-time flag, there's no reason to jump through these hoops instead of
just going with option 1.
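
The flag mismatch can be sketched with another toy Python model (again not
kernel code). It assumes only what is described above: making a VMA
writable sets an accounting flag that stays set after the VMA is made
read-only again, and two VMAs merge only if all their flags match. The
names Vma, mprotect, mergeable and load are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Vma:
    name: str
    prot: str               # "r", "rw" or "rx"
    account: bool = False   # stands in for VM_ACCOUNT


def mprotect(vma, prot):
    # Sticky flag: once a VMA has been writable it keeps `account`.
    return replace(vma, prot=prot, account=vma.account or "w" in prot)


def mergeable(a, b):
    # Both the protection and the accounting flag must agree.
    return a.prot == b.prot and a.account == b.account


def load(make_ro_briefly_writable):
    ro = Vma("R", "r")                        # mapped read-only
    relro = Vma("RELRO", "rw", account=True)  # mapped writable
    if make_ro_briefly_writable:
        ro = mprotect(ro, "rw")               # the extra dynamic linker step
    # ... relocations would be applied here ...
    ro = mprotect(ro, "r")
    relro = mprotect(relro, "r")              # RELRO becomes read-only
    return mergeable(ro, relro)


unmodified_linker_merges = load(False)
modified_linker_merges = load(True)
```

Without the extra step the two read-only VMAs differ in the accounting
flag and stay separate; with it they match and can be merged.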

>
> >   1. The dynamic linker needs to make the read-only VMA briefly writable
> > in order for it to have the same VM flags as the RELRO VMA so that they
> > can be merged. Specifically VM_ACCOUNT is set when a VMA is made
> > writable.
>
> Same question. I hope you can give a bit more details.
>
> > > How to lay out the segments if --no-rosegment is specified?
> > > Runtime (before relocation resolving): RX RW   ;      some people may
> > > be concerned with writable stuff (relocated part) being made executable
> > Indeed I think weakening in the security aspect may be a problem if we
> > are to merge RELRO into RX. Keeping the old layout would be
> > preferable IMHO.
>
> This means the new layout conflicts with --no-rosegment.
> In Driver.cpp, there should be a "... cannot be used together" error.
>
> > > Another problem is that in the default -z relro -z lazy (-z now not
> > > specified) layout, .got and .got.plt will be separated by potentially
> > > huge code sections (e.g. .text). I'm still thinking what problems this
> > > layout change may bring.
> > >
> > Not sure if this is the same issue as what you mentioned here, but I also
> > see a comment in lld/ELF/Writer.cpp about how .rodata and .eh_frame
> > should be as close to .text as possible due to fear of relocation
> > overflow. If we go with option 2 above, the distance would have to be
> > made larger. With option 1, we may still have some leeway in how to
> > order sections within the merged RELRO segment.
>
> For huge executables (>2G or 3G), it may cause relocation overflows
> between .text and .rodata if other large sections like .dynsym and .dynstr
> are placed in between.
>
> I do not worry too much about overflows potentially caused by moving
> PT_GNU_RELRO around.  PT_GNU_RELRO is usually less than 10% of the size of
> the RX PT_LOAD.
>
That's good to know!
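
For reference, the overflow concern comes down to simple arithmetic: a
32-bit PC-relative relocation (R_X86_64_PC32 on x86-64, for instance)
stores S + A - P, which must fit in a signed 32-bit field. A minimal
Python sketch, with a hypothetical helper name and made-up addresses:

```python
INT32_MIN, INT32_MAX = -(1 << 31), (1 << 31) - 1
GiB = 1 << 30


def fits_pc32(symbol_addr, addend, place_addr):
    # Value stored by a 32-bit PC-relative relocation: S + A - P.
    value = symbol_addr + addend - place_addr
    return INT32_MIN <= value <= INT32_MAX


text_place = 0x1000  # address of the instruction field being relocated

# Target 1 GiB away from the reference: within range.
near_ok = fits_pc32(text_place + 1 * GiB, -4, text_place)
# Target 3 GiB away (e.g. pushed out by large sections in between): overflow.
far_ok = fits_pc32(text_place + 3 * GiB, -4, text_place)
```

Anything placed between .text and the data it references eats into that
2 GiB budget, so a segment that is a small fraction of the RX PT_LOAD
costs little.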

>
> > This would be a somewhat tedious change (especially the part about having
> > to update all the unit tests), but the benefit is pretty good, especially
> > considering the kernel slab memory is not swappable/evictable. Please let
> > me know your thoughts!
>
> Definitely! I have prototyped this and found that ~260 tests will need
> their addresses changed.
>