[PATCH] D137620: [BOLT] Don't align .text to pageAlign

Mon Nov 21 09:52:32 PST 2022

treapster added a comment.

In D137620#3934837 <https://reviews.llvm.org/D137620#3934837>, @maksfb wrote:

> In D137620#3934664 <https://reviews.llvm.org/D137620#3934664>, @treapster wrote:
>
>>> To me, it appears there are two outstanding issues that we need to solve. One is that we use large pages by default. Without hugify/hottext it doesn't make sense unless dynamic loader supports large pages.
>>
>> My understanding is that newer kernels/loaders support large pages and can properly map them without the use of -hugify, so it makes sense to use large pages by default. I wouldn't call it an issue, if you don't like it you can always use -no-huge-pages.
>
> Could you point to the kernel patch/version that enables it? It will be good to test the functionality.

I researched a bit and can say the following things:

- I assumed the problem with hugify is just that older ld-linux or kernels ignored 0x200000 alignment of the segment, but turned out it is just part of the problem. It seems like it's caused by the loader, and with glibc 2.28(4.18.0 EulerOS) aligment is ignored, while on newer Ubuntu 5.10.0-1013-oem with glibc 2.31 addresses are assigned properly.
- I was wrong about out-of-box HugePage support, because aside from big alignment HugePages must be <https://docs.kernel.org/admin-guide/mm/transhuge.html> anonymous. Which means we need to use -hugify to insert code which will remap code section as anonymous and call madvise to ensure huge page. It also means that current implementation of -hugify is useless on newer kernels because if addresses are properly aligned, it does not create anonymous mapping and calls madvise right away, which does not return huge page(it can be checked by `cat /proc/PID/smaps | grep HugePage`, there will be all zeroes).

>> In D137620#3934504 <https://reviews.llvm.org/D137620#3934504>, @maksfb wrote:
>>
>>> Just using 64-byte alignment and relying on the containing segment being aligned to the page size will not always work. One example of the latter is "-use-old-text".
>>
>> Why it will not work with -use-old-text? The whole point of this option is to put new text in place of old, and by trying to align it to 2 MB you make it unachievable unless the binary was prelinked with huge padding.
>
> What exactly is "unachievable"? For one of our apps, we use ICF to save ~20MB of code and the new code fits into the original space with 2MB alignment.

Unachievable was wrong word, but on binaries i've seen hot + cold is usually a bit bigger than original .text even without alignment, so adding padding reduces chances to fit even more. 20MB saved by ICF must be a huge binary and relatively rare case. Anyway, it's not the matter here.

>> I think this patch just makes defaults more reasonable, otherwise you can always align however you want.
>
> From the performance perspective, the default alignment of 64 bytes for .text does not sound reasonable to me . It should be page-aligned.

Aligning to regular page align seems reasonable in case of use-old-text(if we accept that for most mid-sized binaries it's not possible), even though i'm not sure it significantly affects performance without huge pages, but if we create new segment, 64 bytes seems fine, isn't it?

>> And i also don't understand why MaxAlignBytes might be better
>
> It has an advantage to explicitly express the intention of not using more than MaxAlignBytes for alignment. The current patch has the assumption of having "small" program header before .text in a page-aligned segment which makes it fragile.

But we care more about alignment value than padding size. I think we're better off removing the assumption by explicitly handling use-old-text, instead of introducing new option.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137620/new/

https://reviews.llvm.org/D137620