[lld] r244691 - COFF: Align sections to 512-byte boundaries on disk.
Sean Silva via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 13 00:10:17 PDT 2015
On Wed, Aug 12, 2015 at 10:59 PM, Rui Ueyama <ruiu at google.com> wrote:
> What I don't understand is why the offset from the beginning of the
> file must be a multiple of the page size in order to avoid a full copy. Windows
> requires all sections to be aligned to at least 4K in memory and 512 bytes in
> the file, and I don't see any problem there.
>
> Let's say we have two sections, A and B, whose sizes are 1024B and 4096B,
> respectively. We also assume that A's offset from the beginning of the file is
> 4096, and B's is 5120. The loader can map offsets 4096 to 8192 of the file to
> some page, and 5120 to 9216 to another page. Why can't it do that?
>
From the kernel's perspective of mapping memory (on x86), memory is divided
into aligned 4K pieces. 5120 % 4096 == 1024, so in order to map it at an
address that is 4K aligned, it must do a full memmove to shift all
the data by 1024 bytes so that it is 4K aligned. This image may help
you understand how a 32-bit x86 CPU interprets a virtual memory address:
https://upload.wikimedia.org/wikipedia/commons/8/8e/X86_Paging_4K.svg
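
A minimal sketch of that arithmetic, using the two sections from the example
above. The helper names (pageOffset, canMapWithoutCopy, checkExample) and the
virtual addresses 0x1000 and 0x2000 are made up for illustration. This only
shows the congruence condition; it says nothing about what the Windows loader
actually does.

```cpp
#include <cassert>
#include <cstdint>

// 4K pages, as on x86 (same value as PageSize in COFF/Writer.cpp).
constexpr uint64_t kPageSize = 4096;

// Position of a byte within its 4K page: the low 12 bits.
constexpr uint64_t pageOffset(uint64_t Pos) { return Pos & (kPageSize - 1); }

// Hypothetical helper: true if file bytes starting at FileOff could be made
// visible at virtual address VA just by pointing page table entries at the
// file's pages, i.e. both positions are congruent modulo the page size.
constexpr bool canMapWithoutCopy(uint64_t FileOff, uint64_t VA) {
  return pageOffset(FileOff) == pageOffset(VA);
}

// Sections A and B from the example; the virtual addresses are assumed.
inline void checkExample() {
  // A: file offset 4096 is page aligned, so it lines up with address 0x1000.
  assert(canMapWithoutCopy(4096, 0x1000));
  // B: file offset 5120 sits 1024 bytes into a page...
  assert(pageOffset(5120) == 1024);
  // ...so it cannot show up at a page-aligned address without moving bytes.
  assert(!canMapWithoutCopy(5120, 0x2000));
}
```

The ELF rule on p_align quoted further down is the same condition, just
stated on the file format side.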
IIRC the resources I learned from are:
http://duartes.org/gustavo/blog/post/how-the-kernel-manages-your-memory/
http://duartes.org/gustavo/blog/post/the-thing-king/
(That site has many other very, *very* good posts. A list can be seen
at: http://duartes.org/gustavo/blog/category/internals/)
I think you will find that understanding virtual memory (and TLB) will
greatly help you optimize LLD, since many operations in LLD have very high
pressure on the virtual memory system.
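
As a rough illustration of the diagram linked earlier in this message, here
is how a 32-bit x86 virtual address splits up under classic (non-PAE) 4K
paging: 10 bits of page directory index, 10 bits of page table index, and 12
bits of offset within the page. The struct and function names are
hypothetical. The point is that the hardware only translates the upper 20
bits; the low 12 bits pass through unchanged, which is why data cannot be
shifted by less than a page using page tables alone.

```cpp
#include <cassert>
#include <cstdint>

// The three fields of a 32-bit virtual address under 4K paging (non-PAE).
struct VirtAddrParts {
  uint32_t DirIndex;   // bits 31..22: index into the page directory
  uint32_t TableIndex; // bits 21..12: index into the page table
  uint32_t Offset;     // bits 11..0: byte offset within the 4K page
};

// Hypothetical helper that splits an address into those fields.
constexpr VirtAddrParts splitVirtAddr(uint32_t VA) {
  return {VA >> 22, (VA >> 12) & 0x3FF, VA & 0xFFF};
}

// Example address chosen arbitrarily for illustration.
inline void checkSplit() {
  constexpr VirtAddrParts P = splitVirtAddr(0x00401400);
  assert(P.DirIndex == 1);
  assert(P.TableIndex == 1);
  assert(P.Offset == 0x400); // 1024 bytes into the page, untouched by paging
}
```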
-- Sean Silva
>
> On Thu, Aug 13, 2015 at 2:44 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>>
>>
>> On Wed, Aug 12, 2015 at 7:17 PM, Rui Ueyama <ruiu at google.com> wrote:
>>
>>> I didn't test whether this change has a negative impact on memory
>>> usage, but my guess is that it's very unlikely, because this layout is the same
>>> as what the MSVC linker creates. If this were inefficient, virtually all Windows
>>> executables would suffer from it, which is unlikely. My understanding
>>> is that the kernel maps each section separately to a memory address, so
>>> the file offset of each section can be chosen independently of other sections.
>>>
>>
>> The mapping is done at the granularity of aligned 4K pages minimum (this
>> is just how the x86 hardware page table mechanism works). A piece of the
>> file cannot be moved by an amount that is not a multiple of 4K without a
>> full copy.
>>
>> The only way this could (in the usual case) not have a large overhead is
>> for the kernel to do a crazy hack like have special paging semantics for
>> files that are executables. This means that when LLD finishes working on a
>> memory mapped file, if a section is not 4K aligned at least, then the
>> kernel then has to do a copy to make the file conform to the actual memory
>> layout it needs to have in the paging subsystem.
>>
>> Or does Windows always make full copies of sections? In other words,
>> do processes not share e.g. read-only text?
>>
>> In ELF, the offset in the file and the offset in memory are required to
>> be congruent modulo the alignment (see the documentation of p_align in
>> http://www.sco.com/developers/gabi/latest/ch5.pheader.html), precisely
>> to avoid the need to do crazy hacks like this when loading the program.
>>
>> You can see that Linux will reject the binary:
>> http://lxr.free-electrons.com/source/fs/binfmt_elf.c#L664
>> (load_elf_binary)
>> -> http://lxr.free-electrons.com/source/fs/binfmt_elf.c#L336 (elf_map)
>> -> http://lxr.free-electrons.com/source/mm/util.c#L306 (vm_mmap)
>> Notice:
>>     if (unlikely(offset & ~PAGE_MASK))
>>         return -EINVAL;
>>
>> FreeBSD is more lenient, but you can see that the kernel does not like
>> the situation when this is violated:
>> http://src.illumos.org/source/xref/freebsd-head/sys/kern/imgact_elf.c#593
>> (__elfN(load_file))
>> -->
>> http://src.illumos.org/source/xref/freebsd-head/sys/kern/imgact_elf.c#467
>> (__elfN(load_section))
>> -->
>> http://src.illumos.org/source/xref/freebsd-head/sys/kern/imgact_elf.c#398
>> (__elfN(map_insert))
>> /*
>>  * The mapping is not page aligned. This means we have
>>  * to copy the data. Sigh.
>>  */
>>
>> -- Sean Silva
>>
>>
>>>
>>> On Wed, Aug 12, 2015 at 5:31 PM, Sean Silva <chisophugis at gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Tue, Aug 11, 2015 at 4:09 PM, Rui Ueyama via llvm-commits <
>>>> llvm-commits at lists.llvm.org> wrote:
>>>>
>>>>> Author: ruiu
>>>>> Date: Tue Aug 11 18:09:00 2015
>>>>> New Revision: 244691
>>>>>
>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=244691&view=rev
>>>>> Log:
>>>>> COFF: Align sections to 512-byte boundaries on disk.
>>>>>
>>>>> Sections must start at page boundaries in memory, but they
>>>>> can be aligned to sector boundaries (512-bytes) on disk.
>>>>> We aligned them to 4096-byte boundaries even on disk, so we
>>>>> wasted disk space a bit.
>>>>>
>>>>
>>>> This will likely force the kernel to copy or otherwise do unnecessary
>>>> work when loading. Are you sure that isn't happening? The kernel ideally
>>>> wants to just create a couple of page table entries. But if it needs to move
>>>> things around at <4K granularity to align them with their
>>>> load address (as I think this patch causes), then it will
>>>> need to do copies.
>>>>
>>>> This can likely be checked by looking for an increase in real memory
>>>> usage for the system when the new binaries are loaded (vs. the old
>>>> page-aligned ones), since the kernel will have a copy sitting in page cache
>>>> and a copy for alignment mapped into the process address space;
>>>> alternatively, you can check for the slowdown from the kernel copies when
>>>> faulting the memory into the process's address space (or (less likely) it
>>>> may do the copies eagerly which should be easy to measure too).
>>>>
>>>> -- Sean Silva
>>>>
>>>>
>>>>>
>>>>> Modified:
>>>>> lld/trunk/COFF/Writer.cpp
>>>>> lld/trunk/test/COFF/baserel.test
>>>>> lld/trunk/test/COFF/hello32.test
>>>>>
>>>>> Modified: lld/trunk/COFF/Writer.cpp
>>>>> URL:
>>>>> http://llvm.org/viewvc/llvm-project/lld/trunk/COFF/Writer.cpp?rev=244691&r1=244690&r2=244691&view=diff
>>>>>
>>>>> ==============================================================================
>>>>> --- lld/trunk/COFF/Writer.cpp (original)
>>>>> +++ lld/trunk/COFF/Writer.cpp Tue Aug 11 18:09:00 2015
>>>>> @@ -37,8 +37,7 @@ using namespace lld;
>>>>> using namespace lld::coff;
>>>>>
>>>>> static const int PageSize = 4096;
>>>>> -static const int FileAlignment = 512;
>>>>> -static const int SectionAlignment = 4096;
>>>>> +static const int SectorSize = 512;
>>>>> static const int DOSStubSize = 64;
>>>>> static const int NumberfOfDataDirectory = 16;
>>>>>
>>>>> @@ -174,7 +173,7 @@ void OutputSection::addChunk(Chunk *C) {
>>>>> Off += C->getSize();
>>>>> Header.VirtualSize = Off;
>>>>> if (C->hasData())
>>>>> - Header.SizeOfRawData = RoundUpToAlignment(Off, FileAlignment);
>>>>> + Header.SizeOfRawData = RoundUpToAlignment(Off, SectorSize);
>>>>> }
>>>>>
>>>>> void OutputSection::addPermissions(uint32_t C) {
>>>>> @@ -507,15 +506,14 @@ void Writer::createSymbolAndStringTable(
>>>>> // We position the symbol table to be adjacent to the end of the
>>>>> last section.
>>>>> uint64_t FileOff =
>>>>> LastSection->getFileOff() +
>>>>> - RoundUpToAlignment(LastSection->getRawSize(), FileAlignment);
>>>>> + RoundUpToAlignment(LastSection->getRawSize(), SectorSize);
>>>>> if (!OutputSymtab.empty()) {
>>>>> PointerToSymbolTable = FileOff;
>>>>> FileOff += OutputSymtab.size() * sizeof(coff_symbol16);
>>>>> }
>>>>> if (!Strtab.empty())
>>>>> FileOff += Strtab.size() + 4;
>>>>> - FileSize = SizeOfHeaders +
>>>>> - RoundUpToAlignment(FileOff - SizeOfHeaders,
>>>>> FileAlignment);
>>>>> + FileSize = RoundUpToAlignment(FileOff, SectorSize);
>>>>> }
>>>>>
>>>>> // Visits all sections to assign incremental, non-overlapping RVAs and
>>>>> @@ -526,9 +524,9 @@ void Writer::assignAddresses() {
>>>>> sizeof(coff_section) * OutputSections.size();
>>>>> SizeOfHeaders +=
>>>>> Config->is64() ? sizeof(pe32plus_header) : sizeof(pe32_header);
>>>>> - SizeOfHeaders = RoundUpToAlignment(SizeOfHeaders, PageSize);
>>>>> + SizeOfHeaders = RoundUpToAlignment(SizeOfHeaders, SectorSize);
>>>>> uint64_t RVA = 0x1000; // The first page is kept unmapped.
>>>>> - uint64_t FileOff = SizeOfHeaders;
>>>>> + FileSize = SizeOfHeaders;
>>>>> // Move DISCARDABLE (or non-memory-mapped) sections to the end of
>>>>> file because
>>>>> // the loader cannot handle holes.
>>>>> std::stable_partition(
>>>>> @@ -539,13 +537,11 @@ void Writer::assignAddresses() {
>>>>> if (Sec->getName() == ".reloc")
>>>>> addBaserels(Sec);
>>>>> Sec->setRVA(RVA);
>>>>> - Sec->setFileOffset(FileOff);
>>>>> + Sec->setFileOffset(FileSize);
>>>>> RVA += RoundUpToAlignment(Sec->getVirtualSize(), PageSize);
>>>>> - FileOff += RoundUpToAlignment(Sec->getRawSize(), FileAlignment);
>>>>> + FileSize += RoundUpToAlignment(Sec->getRawSize(), SectorSize);
>>>>> }
>>>>> SizeOfImage = SizeOfHeaders + RoundUpToAlignment(RVA - 0x1000,
>>>>> PageSize);
>>>>> - FileSize = SizeOfHeaders +
>>>>> - RoundUpToAlignment(FileOff - SizeOfHeaders,
>>>>> FileAlignment);
>>>>> }
>>>>>
>>>>> template <typename PEHeaderTy> void Writer::writeHeader() {
>>>>> @@ -584,8 +580,8 @@ template <typename PEHeaderTy> void Writ
>>>>> Buf += sizeof(*PE);
>>>>> PE->Magic = Config->is64() ? PE32Header::PE32_PLUS :
>>>>> PE32Header::PE32;
>>>>> PE->ImageBase = Config->ImageBase;
>>>>> - PE->SectionAlignment = SectionAlignment;
>>>>> - PE->FileAlignment = FileAlignment;
>>>>> + PE->SectionAlignment = PageSize;
>>>>> + PE->FileAlignment = SectorSize;
>>>>> PE->MajorImageVersion = Config->MajorImageVersion;
>>>>> PE->MinorImageVersion = Config->MinorImageVersion;
>>>>> PE->MajorOperatingSystemVersion = Config->MajorOSVersion;
>>>>>
>>>>> Modified: lld/trunk/test/COFF/baserel.test
>>>>> URL:
>>>>> http://llvm.org/viewvc/llvm-project/lld/trunk/test/COFF/baserel.test?rev=244691&r1=244690&r2=244691&view=diff
>>>>>
>>>>> ==============================================================================
>>>>> --- lld/trunk/test/COFF/baserel.test (original)
>>>>> +++ lld/trunk/test/COFF/baserel.test Tue Aug 11 18:09:00 2015
>>>>> @@ -61,7 +61,7 @@
>>>>> # BASEREL-HEADER-NEXT: VirtualSize: 0x20
>>>>> # BASEREL-HEADER-NEXT: VirtualAddress: 0x5000
>>>>> # BASEREL-HEADER-NEXT: RawDataSize: 512
>>>>> -# BASEREL-HEADER-NEXT: PointerToRawData: 0x1800
>>>>> +# BASEREL-HEADER-NEXT: PointerToRawData: 0xC00
>>>>> # BASEREL-HEADER-NEXT: PointerToRelocations: 0x0
>>>>> # BASEREL-HEADER-NEXT: PointerToLineNumbers: 0x0
>>>>> # BASEREL-HEADER-NEXT: RelocationCount: 0
>>>>>
>>>>> Modified: lld/trunk/test/COFF/hello32.test
>>>>> URL:
>>>>> http://llvm.org/viewvc/llvm-project/lld/trunk/test/COFF/hello32.test?rev=244691&r1=244690&r2=244691&view=diff
>>>>>
>>>>> ==============================================================================
>>>>> --- lld/trunk/test/COFF/hello32.test (original)
>>>>> +++ lld/trunk/test/COFF/hello32.test Tue Aug 11 18:09:00 2015
>>>>> @@ -38,8 +38,8 @@ HEADER-NEXT: MajorImageVersion: 0
>>>>> HEADER-NEXT: MinorImageVersion: 0
>>>>> HEADER-NEXT: MajorSubsystemVersion: 6
>>>>> HEADER-NEXT: MinorSubsystemVersion: 0
>>>>> -HEADER-NEXT: SizeOfImage: 20480
>>>>> -HEADER-NEXT: SizeOfHeaders: 4096
>>>>> +HEADER-NEXT: SizeOfImage: 16896
>>>>> +HEADER-NEXT: SizeOfHeaders: 512
>>>>> HEADER-NEXT: Subsystem: IMAGE_SUBSYSTEM_WINDOWS_CUI (0x3)
>>>>> HEADER-NEXT: Characteristics [ (0x8140)
>>>>> HEADER-NEXT: IMAGE_DLL_CHARACTERISTICS_DYNAMIC_BASE (0x40)
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>>
>>>>
>>>>
>>>
>>
>