[lld] [llvm] [LLD][COFF] Prefetch inputs early-on to improve link times (PR #169224)
Alexandre Ganea via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 28 07:29:50 PST 2025
aganea wrote:
> I wonder what this does in more low-RAM situations though. Will it increase swapping, or is the OS smart enough to hold back a little on the prefetching then?
[The docs](https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-prefetchvirtualmemory) state that it only brings pages from storage into physical memory: "_it is treated as a strong hint by the system and is subject to usual physical memory constraints where it can completely or partially fail under low-memory conditions_". The pages aren't even mapped into the working set; that only happens later, when the process actually accesses them, which triggers a soft page fault (as opposed to a hard fault, where the page is not in physical memory and has to be brought back from storage).
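For context, here is a minimal sketch of how such a prefetch hint can be issued on Windows. This is not the PR's actual code, just an illustration of the API described in the docs linked above:

```cpp
// Sketch only (not the PR's code): ask Windows to prefetch a memory-mapped
// input file into physical memory. PrefetchVirtualMemory is only a hint; it
// may partially or completely fail under memory pressure, and failure is
// harmless since this is purely an optimization.
#include <windows.h>

bool prefetchMappedRegion(void *Base, size_t Size) {
  WIN32_MEMORY_RANGE_ENTRY Range;
  Range.VirtualAddress = Base;
  Range.NumberOfBytes = Size;
  // Flags must be 0. The return value can be ignored by callers.
  return PrefetchVirtualMemory(GetCurrentProcess(), /*NumberOfEntries=*/1,
                               &Range, /*Flags=*/0) != FALSE;
}
```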
> We've had reports (https://crbug.com/428641952) that lld is using too much RAM when linking PDBs, and this sounds like it could potentially make that worse.
This is quite an interesting and challenging problem. However, I don't think this PR would make it worse; as stated above, the prefetch hint will simply be ignored under memory pressure. I compared before/after this PR in low-memory conditions and, surprisingly, this PR still improves the link times:
```
Summary
with_pr\lld-link.exe @Game.exe.rsp ran
1.10 ± 0.09 times faster than before\lld-link.exe @Game.exe.rsp
```
Perhaps I can ask on that ticket if anyone on the Edge team could test this PR.
> (Related to that bug, I've been wondering if we could do more of the opposite, and call `mapped_file_region::dontNeed()` when we're done writing out an object file's code and debug symbols to the final outputs.)
Yeah, I saw your comment on the ticket. This is quite challenging since in many places in LLD/COFF we're not really copying data out of the mmapped input buffers, but rather taking references into them. Until the PDB is fully written to the mmapped output buffer, it is hard to know what can be unloaded. This would require a dependency assessment of the code to see whether such a thing is feasible, and at first glance it wouldn't be simple. However, we could do that for bitcode modules, once ThinLTO is done with a specific input file. I'm not sure whether that's the case Edge is hitting.
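A hedged sketch of what the bitcode case could look like, assuming we keep the `mapped_file_region` backing each input around until the ThinLTO backend is done with it (names here are illustrative, not actual LLD code):

```cpp
// Sketch only: once ThinLTO has finished compiling a given bitcode input,
// hint to the OS that its mapped pages are no longer needed.
#include "llvm/Support/FileSystem.h"

void releaseBitcodeInput(llvm::sys::fs::mapped_file_region &Region) {
  // Only safe if no live references point into this buffer anymore; for
  // bitcode inputs that should hold once the backend has emitted the
  // object for this module.
  Region.dontNeed();
}
```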
Another avenue for solving the issue raised in the ticket would be to write a memory-tracking system in LLVM and dump the allocation list into a .json file at specific points during LLD execution, similar to `--time-trace` but for memory.
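To make the idea concrete, here is a hypothetical sketch of such a tracker. Nothing like this exists in LLVM today; the phase names and the source of the memory numbers are assumptions:

```cpp
// Hypothetical sketch of a memory-trace dump, similar in spirit to
// --time-trace but recording memory usage per link phase.
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/JSON.h"
#include "llvm/Support/raw_ostream.h"
#include <string>
#include <vector>

struct MemSnapshot {
  std::string Phase; // e.g. "parse inputs", "write PDB" (illustrative)
  uint64_t RSSBytes; // resident set size sampled at that point
};

static std::vector<MemSnapshot> Snapshots;

void recordMemSnapshot(llvm::StringRef Phase, uint64_t RSSBytes) {
  Snapshots.push_back({Phase.str(), RSSBytes});
}

// Emit the recorded snapshots as a JSON array, one object per phase.
void dumpMemTrace(llvm::raw_ostream &OS) {
  llvm::json::Array Events;
  for (const MemSnapshot &S : Snapshots)
    Events.push_back(llvm::json::Object{{"phase", S.Phase},
                                        {"rssBytes", S.RSSBytes}});
  OS << llvm::json::Value(std::move(Events)) << "\n";
}
```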
https://github.com/llvm/llvm-project/pull/169224