[llvm-dev] [RFC] Moving RELRO segment
David Chisnall via llvm-dev
llvm-dev at lists.llvm.org
Fri Aug 30 05:27:51 PDT 2019
On 28/08/2019 18:58, Vic (Chun-Ju) Yang via llvm-dev wrote:
> This is an RFC for moving RELRO segment. Currently, lld orders ELF
> sections in the following order: R, RX, RWX, RW, and RW contains RELRO.
> At run time, after RELRO is write-protected, we'd have VMAs in the order
> of: R, RX, RWX, R (RELRO), RW. I'd like to propose that we move RELRO to
> be immediately after the read-only sections, so that the order of VMAs
> become: R, R (RELRO), RX, RWX, RW, and the dynamic linker would have the
> option to merge the two read-only VMAs to reduce bookkeeping costs.
I am not convinced by this change. With current hardware, to make any
mapping more efficient, you need both the virtual to physical
translation and the permissions to be the same.
Anything that is writeable at any point will be a CoW mapping that, when
written, will be replaced by a different page. Anything that is not
ever writeable will be the same physical pages. This means that the old
order is (S for shared, P for private):
S S P P
The new order is:
S P S P P
This means that the translation for the shared part is *definitely* not
contiguous. Modern architectures currently (though not necessarily
indefinitely) conflate protection and translation and so both versions
require the same number of page table and TLB entries.
This, however, is true only when you think about single-level
translation. When you consider nested paging in a VM, things get more
complex because the translation is a two-stage lookup and the protection
is based on the intersection of the permissions at each level.
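As a rough illustration of that intersection rule, here is a minimal Python
sketch. It models permissions as sets of letters; the function name
effective_perms and this representation are assumptions made for the example,
not anything a real hypervisor exposes.

```python
# Hypothetical model: a mapping's permissions are a set of letters,
# e.g. "RW" -> {"R", "W"}. Real hardware encodes these as PTE bits.
def effective_perms(guest_perms, host_perms):
    """Access is allowed only if both translation stages allow it."""
    return set(guest_perms) & set(host_perms)

# Guest maps a page RW, but the hypervisor's second-level entry is R only
# (for example, while the page is shared copy-on-write between guests).
print(sorted(effective_perms("RW", "R")))    # ['R']
print(sorted(effective_perms("RX", "RWX")))  # ['R', 'X']
```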
The hypervisor will typically try to use superpages for the second-level
translation and so both of the shared pages have a high probability of
hitting in the same PTE for the second-level translation. The same is
true for the RW and RELRO segments, because they will be allocated at
the same time and any OS that does transparent superpage promotion (I
think Linux does now? FreeBSD has for almost a decade) will therefore
try to allocate contiguous physical memory for the mappings if possible.
I would expect your scheme to translate to more memory traffic from
page-table walks in any virtualised environment and I don't see (given
that you have increased address space fragmentation) where you are
seeing a saving. With RELRO as part of RW, the kernel is free to split
and recombine adjacent VM objects; with the new layout, it is not able to
combine adjacent objects because they are backed by different storage.
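The S/P layouts discussed above can be sketched in a few lines of Python.
This is only a toy model: it assumes that anything writeable at any point
(including RELRO) is privately backed, that everything else is shared, and
that only neighbours with the same backing can be treated as one contiguous
run. The names backing and runs are made up for the example.

```python
from itertools import groupby

# Assumption: a segment is private (P) if it is ever writeable, which
# includes RELRO before it is write-protected; otherwise it is shared (S).
def backing(segment):
    return "P" if segment == "RELRO" or "W" in segment else "S"

def runs(layout):
    """Collapse adjacent segments with the same backing into one run."""
    return [kind for kind, _ in groupby(backing(s) for s in layout)]

old_layout = ["R", "RX", "RWX", "RW"]           # RELRO is part of RW
new_layout = ["R", "RELRO", "RX", "RWX", "RW"]  # proposed order

print([backing(s) for s in old_layout])  # ['S', 'S', 'P', 'P']
print([backing(s) for s in new_layout])  # ['S', 'P', 'S', 'P', 'P']
print(runs(old_layout))                  # ['S', 'P']
print(runs(new_layout))                  # ['S', 'P', 'S', 'P']
```

Under these assumptions the old order gives two runs of like-backed memory,
while the proposed order gives four, which is the fragmentation referred to
above.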