[PATCH] D129107: [BOLT][HUGIFY] adds huge pages support of PIE/no-PIE binaries

Alexey Moksyakov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 28 05:34:47 PDT 2022


yavtuk added a comment.

In D129107#3889977 <https://reviews.llvm.org/D129107#3889977>, @rafauler wrote:

> Thanks for working on this! Let's sync one last time on our understanding of the implementation of hugifyForOldKernel. Sorry if I'm being repetitive, but it is important now to be on the same page regarding what is happening during runtime in both the PIC and no-PIC cases. See if you agree with me with respect to the AlignedFrom/AlignedTo/AlignedSize usages in the suggestions, and please point out any issues in my understanding.
>
> If we look at hugifyForOldKernel(), the code suggested here currently copies only a part of the page that is determined by From, To.  We now know that "From", because of ASLR ignoring our alignment requirements, may not be aligned. Now suppose it lands in the middle of the page and that "To" (the end of hot code section) lands in the middle of the next page.
>
> 2MB huge page virtual memory map:
>
> page1 - 0x400000:
> hot start: 0x500000
> page2 - 0x600000:
> hot end:  0x700000
> page3 - 0x800000:
>
> In this case, according to lines 146-149, you will align "hot start" and "hot end"  to 0x400000 and 0x800000, respectively, and ask the kernel to unmap these pages. So you will be unmapping 4MB of code.  However, the code will be memcpy-ing 2MB of code from 0x500000 to 0x700000, and then copying it back after the kernel successfully mmaps the requested region into two huge pages.
>
> Now, because you inserted extra padding in RewriteInstance.cpp:103 and 308, the fact that you are leaving 1MB before hot start not copied, and 1MB after hot end as well, is not really a problem.
>
> However, in the non-PIC code, we are not inserting any extra padding. After hot end, at address 0x700000, we will have a large amount of code (coming from cold code of hot functions, those that were split). We will also have a bunch of extra code including the hugify runtime library itself, in some cases.
>
> If we memcpy from 0x400000 to 0x800000 instead of the original 0x500000 to 0x700000, we will be erring on the safe side by always copying any memory contents that are essentially erased after you ask the kernel to unmap them. That's why, when using these mmap calls, we typically copy all page contents instead of just a subset of the (hot) bytes. It's also safe to reference these memory addresses (from 0x700000 to 0x800000) without the risk of segfaulting because BOLT will always pad the last code section in no-PIE. The padding won't be correct for PIE because the ASLR loader will misalign the start, but luckily we are inserting one extra page at the end in these cases, so the addresses from 0x700000 to 0x800000 will be filled with zeroes and won't segfault.
>
> What you did in the last iteration was to expand hot_end towards one extra 4KB page, but that is not enough as in line 115 we are asking the kernel to unmap whole 2MB regions of text.
>
> Does that sound reasonable? Anything I'm missing?

Yes, you are right, thank you for clarifying. Part of the problem was with copying to the temporary area: I changed the size at line 97 and removed the redundant function argument. I also added a simple test for both types of binaries.
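
For reference, the alignment arithmetic from the example above can be sketched as follows. This is only an illustration assuming 2MB huge pages; alignDown, alignUp, coverWithHugePages, AlignedRange and HugePageBytes are hypothetical names chosen for this sketch, not the identifiers used in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 2MB huge pages, as in the memory map from the example.
constexpr std::uint64_t HugePageBytes = 2ull * 1024 * 1024;

// Round Addr down to the nearest multiple of Align (Align is a power of two).
constexpr std::uint64_t alignDown(std::uint64_t Addr, std::uint64_t Align) {
  return Addr & ~(Align - 1);
}

// Round Addr up to the nearest multiple of Align (Align is a power of two).
constexpr std::uint64_t alignUp(std::uint64_t Addr, std::uint64_t Align) {
  return (Addr + Align - 1) & ~(Align - 1);
}

// Hypothetical holder for the range that must be saved and restored.
struct AlignedRange {
  std::uint64_t From; // first byte of the first huge page touched
  std::uint64_t To;   // one past the last byte of the last huge page touched
  std::uint64_t Size; // number of bytes to copy out and back
};

// Widen [From, To) so that it covers whole huge pages. Everything in the
// widened range is lost when the pages are unmapped, so all of it has to
// be copied, not only the hot bytes.
constexpr AlignedRange coverWithHugePages(std::uint64_t From,
                                          std::uint64_t To) {
  return {alignDown(From, HugePageBytes), alignUp(To, HugePageBytes),
          alignUp(To, HugePageBytes) - alignDown(From, HugePageBytes)};
}

// The example from the discussion: hot code in [0x500000, 0x700000).
static_assert(coverWithHugePages(0x500000, 0x700000).From == 0x400000, "");
static_assert(coverWithHugePages(0x500000, 0x700000).To == 0x800000, "");
static_assert(coverWithHugePages(0x500000, 0x700000).Size == 0x400000, "");
```

With these numbers the hot range is 2MB, but the range that gets unmapped (and therefore has to be copied) is 4MB.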


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129107/new/

https://reviews.llvm.org/D129107


