[PATCH] D128382: [LLD] Two tweaks to symbol ordering scheme

Fangrui Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 7 21:23:51 PDT 2022


MaskRay added a comment.

In D128382#3623746 <https://reviews.llvm.org/D128382#3623746>, @yozhu wrote:

> In D128382#3620661 <https://reviews.llvm.org/D128382#3620661>, @MaskRay wrote:
>
>> In D128382#3611295 <https://reviews.llvm.org/D128382#3611295>, @yozhu wrote:
>>
>>> In D128382#3611207 <https://reviews.llvm.org/D128382#3611207>, @MaskRay wrote:
>>>
>>>> In D128382#3611193 <https://reviews.llvm.org/D128382#3611193>, @yozhu wrote:
>>>>
>>>>> In D128382#3610276 <https://reviews.llvm.org/D128382#3610276>, @MaskRay wrote:
>>>>>
>>>>>> In D128382#3609648 <https://reviews.llvm.org/D128382#3609648>, @lanza wrote:
>>>>>>
>>>>>>> One interesting note Shoaib and I just discovered is that ~16.5% of the libraries in one of our major apps have `.text` sections starting on the first page of the dso. Placing hot sections at the front of the `.text` section would give us a slice of a "free page" that, with the current algorithm, would just be cold code that's less likely to be used. I imagine this isn't a substantial difference but it's probably slightly in favor of placing hot code first here.
>>>>>>
>>>>>> Thanks. This is fine, even if not that large. The patch should use `osec->flags & SHF_EXECINSTR` instead of `executableInputSections` for simplification. Then the complexity/benefit trade-off will look fine.
>>>>>
>>>>> I didn't check output section's attribute but instead check on each input section because:
>>>>>
>>>>> - It could be that the output section contains .text (RX) contributions as well as .rdata (R) contributions, like read-only data is merged into text section.
>>>>
>>>> This implies a linker script placing rodata into an executable output section.
>>>> This is a corner case and the operation here does not matter. Just be simple.
>>>
>>> Even if it is a rare or corner case, it is still a valid case.  Counting `executableInputSections` doesn't add many cycles.  Why not just cover all the possible cases?
>>
>> For input rodata in an output executable section, since the for loop doesn't precisely track whether an input section is executable or not, it is as if all rodata are executable as well.
>> Your `executableInputSections` rule doesn't cover more cases.
>
> Input section's attribute can be precisely tracked since the linker has no reason to change input section's attribute once it is read from input OBJ.  For example, if we have source code like
>
>   __attribute__((section(".text.rox")))
>   const int x = 5;
>   
>   __attribute__((section(".text.row")))
>   const int w = 5;
>   
>   int foo() { return x; }
>   int bar() { return w; }
>
> and we compile it with `-fdata-sections -ffunction-sections` and then link the OBJ using a linker script like
>
>   SECTIONS
>   {
>     ttt 0x1000 :
>     {
>       tt.o (.text.*)
>     }
>   }
>
> Then in the `for` loop we will see two input sections are executable and the other two are not.

I have mentioned that this is a corner case we don't necessarily try to optimize. It takes code.
Note: even if some linker scripts combine rodata and code sections in one output section, such an output section is typically far smaller than `target->getThunkSectionSpacing()`
I don't think it is useful to special case `executableInputSections > 1`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128382/new/

https://reviews.llvm.org/D128382



More information about the llvm-commits mailing list