[PATCH] D129107: [BOLT][HUGIFY] adds huge pages support of PIE/no-PIE binaries

Rafael Auler via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 8 17:09:14 PDT 2022


rafauler added inline comments.


================
Comment at: bolt/lib/Rewrite/RewriteInstance.cpp:494-496
+  // Hugify: Additional huge page from left side
+  if (opts::Hugify)
+    NextAvailableAddress += BC->PageAlign;
----------------
yavtuk wrote:
> rafauler wrote:
> > Why is that needed?
> It's needed due to HUGEPAGE allocation policy and also due to the bug for old kernels where dynamic loader doesn't take into account p_align field.
> Dynamic loader allocates and maps the segments sequentially with 4KB addresses alignment. If we want to get HUGEPAGE from OS we have to have the address for page with 2MB alignment. For that, I add padding from left and right sides in order to exclude overlapping between segments.
>From the left side, this is already aligned via BinaryContext::PageAlign. This is not just setting p_align, but actually setting the start address to be aligned at 2MB boundary.  So this line here is inserting an extra empty 2MB page, but I'm not sure I get the reason why.




================
Comment at: bolt/lib/Rewrite/RewriteInstance.cpp:3626-3628
+        // Hugify: Additional huge page from right side
+        if (opts::Hugify)
+          Address = alignTo(Address, Section->getAlignment());
----------------
rafauler wrote:
> Same
For the right side, this alignment is accomplished by lines RewriteInstance.cpp:3635 (using this diff's lines), where we  pad the end of the code section until it is aligned at 2MB. I understand that other code sections might be allocated to the huge page if we don't have these lines added by this diff, but I'm not sure why is that a problem. If you have space left in a huge page, why wouldn't you put code there?


================
Comment at: bolt/runtime/common.h:82
 extern "C" {
-void *memcpy(void *Dest, const void *Src, size_t Len) {
+void __attribute__((noinline)) *
+    memcpy(void *Dest, const void *Src, size_t Len) {
----------------
yavtuk wrote:
> rafauler wrote:
> > Why is that needed?
> good question :-) the user-func-reoder test fails and it was hard to reproduce the cause locally
> since it's related to compiler
> 
> with this attribute we have the following assembly for memcpy:
> 
> .Loop:
>     ...
>     movzbl (%rsi,%rdi,1),%ecx
>     mov    %cl,(%rax,%rdi,1)
>     add    $0x1,%rdi
>     cmp    %rdi,%r9
>     jne    a004a0 <_fini+0x2c4>
>     ...
> 
> 
>     mov    %r14,%rdi
>     mov    %r15,%rsi
>     mov    %rbx,%rdx
>     callq  .Loop
> 
> copying is performed by byte with verification
> 
> without this attribute I see the following:
> 
> .Loop:
>     ... 
>     movzbl 0x0(%r13,%rax,1),%edx
>     mov    %dl,(%rbx,%rax,1)
>     movzbl 0x1(%r13,%rax,1),%edx
>     mov    %dl,0x1(%rbx,%rax,1)
>     movzbl 0x2(%r13,%rax,1),%edx
>     mov    %dl,0x2(%rbx,%rax,1)
>     movzbl 0x3(%r13,%rax,1),%edx
>     mov    %dl,0x3(%rbx,%rax,1)
>     movzbl 0x4(%r13,%rax,1),%edx
>     mov    %dl,0x4(%rbx,%rax,1)
>     movzbl 0x5(%r13,%rax,1),%edx
>     mov    %dl,0x5(%rbx,%rax,1)
>     movzbl 0x6(%r13,%rax,1),%edx
>     mov    %dl,0x6(%rbx,%rax,1)
>     movzbl 0x7(%r13,%rax,1),%edx
>     mov    %dl,0x7(%rbx,%rax,1)
>     add    $0x8,%rax
>     cmp    %rax,%rcx
>     jne    a007f0 <_fini+0x614>
> 
> copying is performed with unrolling and test fails due to overlapping dst and src addresses for size which is not aligned to 8 bytes
> 
>  
> 
Ok, I debugged user-func-reoder and noticed that this patch is doing something else. Instead of copying the entire contents of the huge page, it is copying only hot functions. I go back to my original point, why do we need to avoid overlapping segments? If you roll back to previous behavior (copying all contents of the page, including cold text segments), you won't need to insert one extra alignment at the end of each code section.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129107/new/

https://reviews.llvm.org/D129107



More information about the llvm-commits mailing list