[cfe-dev] [llvm-dev] RFC: Replacing the default CRT allocator on Windows

Zachary Turner via cfe-dev cfe-dev at lists.llvm.org
Tue Jul 7 10:25:23 PDT 2020


Note that ASAN support is present on Windows now.  Does the Debug CRT
provide any features that are not better served by ASAN?
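
For anyone unfamiliar with what is being compared, here is a minimal sketch of opting in to the Debug CRT heap checks. The helper name `enableDebugHeapChecks` is hypothetical and not part of LLVM. The `_Crt*` calls only exist in MSVC debug builds, so the sketch compiles to a no-op everywhere else.

```cpp
#if defined(_MSC_VER) && defined(_DEBUG)
#include <crtdbg.h>
#endif

// Hypothetical helper (not an LLVM API): turns on Debug CRT heap tracking
// and the leak report at exit when building with MSVC in debug mode.
// Returns true if the checks were enabled, false if they are unavailable.
bool enableDebugHeapChecks() {
#if defined(_MSC_VER) && defined(_DEBUG)
  // Read the current flags, then add allocation tracking and leak checking.
  int flags = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);
  flags |= _CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF;
  _CrtSetDbgFlag(flags);
  return true;
#else
  // Release builds and non-MSVC toolchains have no Debug CRT heap.
  return false;
#endif
}
```

A tool would call this once at the top of `main`. Whether a replacement allocator can keep these checks working is exactly the open question in this thread.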

On Tue, Jul 7, 2020 at 9:44 AM Chris Tetreault via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> For release builds, I think this is fine. However for debug builds, the
> Windows allocator provides a lot of built-in functionality for debugging
> memory issues that I would be very sad to lose. Therefore, I would request
> that:
>
>
>
>    1. This be added as a configuration option to select either the new
>    allocator or the Windows allocator
>    2. The Windows allocator be used by default in debug builds
>
>
>
> Ideally, since you’re doing this work, you’d implement it in such a way
> that it’s fairly easy for anybody to use whatever allocator they want when
> building LLVM (on any platform, not just Windows), and it’s not just
> hardcoded to the system allocator vs whatever allocator ends up getting
> added. However, as long as I can use the Windows debug allocator, I’m happy.
>
>
>
> Thanks,
>
>    Christopher Tetreault
>
>
>
> *From:* cfe-dev <cfe-dev-bounces at lists.llvm.org> *On Behalf Of *Alexandre
> Ganea via cfe-dev
> *Sent:* Wednesday, July 1, 2020 9:20 PM
> *To:* cfe-dev at lists.llvm.org; LLVM Dev <llvm-dev at lists.llvm.org>
> *Subject:* [EXT] [cfe-dev] RFC: Replacing the default CRT allocator on
> Windows
>
>
>
> Hello,
>
>
>
> I was wondering how folks feel about replacing the default Windows CRT
> allocator in Clang, LLD and possibly other LLVM tools.
>
>
>
> The CRT heap allocator on Windows doesn’t scale well on machines with a
> large core count. Any multi-threaded workload in LLVM that allocates often
> is impacted by this. As a result, link times with ThinLTO are extremely slow
> on Windows. We’re observing performance inversely proportional to the
> number of cores: the more cores the machine has, the slower ThinLTO
> linking gets.
>
>
>
> We’ve replaced the CRT heap allocator with modern lock-free thread-cache
> allocators such as rpmalloc (Unlicense), mimalloc (MIT licence) or snmalloc
> (MIT licence). Runtime performance improves by an order of magnitude.
>
>
>
> Time to link clang.exe with LLD and -flto on a 36-core machine:
>
>   Windows CRT heap allocator: 38 min 47 sec
>   mimalloc:                    2 min 22 sec
>   rpmalloc:                    2 min 15 sec
>   snmalloc:                    2 min 19 sec
>
>
>
> We’ve been running in production with a downstream fork of LLVM + rpmalloc
> for more than a year. However, when cross-compiling for some specific game
> platforms, we’re using other downstream forks of LLVM that we can’t change.
>
>
>
> Two questions arise:
>
>    1. The licencing. Should we embed one of these allocators into the
>    LLVM tree, or keep them separate, out of tree?
>    2. If the answer to the question above is to embed one, then given the
>    tremendous performance speedup, should we embed it into Clang/LLD
>    builds by default (on Windows only)? Consider that Windows doesn’t have
>    an LD_PRELOAD mechanism.
>
>
>
> Please see demo patch here: https://reviews.llvm.org/D71786
>
>
>
> Thank you in advance for the feedback!
>
> Alex.
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
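
To make the contention pattern described in the quoted RFC concrete, here is a minimal sketch of a multi-threaded allocation stress loop. This is not the benchmark behind the numbers above. The function name `timeAllocations`, the block sizes and the iteration counts are all assumptions chosen for illustration.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <thread>
#include <vector>

// Hypothetical stress loop: every thread repeatedly allocates and frees
// small blocks through the default allocator (malloc/free).
// Returns the elapsed wall-clock time in seconds.
double timeAllocations(unsigned numThreads, std::size_t allocsPerThread) {
  auto worker = [allocsPerThread] {
    for (std::size_t i = 0; i < allocsPerThread; ++i) {
      // Vary the request size so more than one size class is exercised.
      void *p = std::malloc(16 + (i % 64) * 8);
      if (p) {
        // Touch the block through a volatile pointer so the compiler
        // cannot elide the allocation.
        static_cast<volatile char *>(p)[0] = 1;
        std::free(p);
      }
    }
  };

  auto start = std::chrono::steady_clock::now();
  std::vector<std::thread> threads;
  for (unsigned t = 0; t < numThreads; ++t)
    threads.emplace_back(worker);
  for (auto &th : threads)
    th.join();
  std::chrono::duration<double> elapsed =
      std::chrono::steady_clock::now() - start;
  return elapsed.count();
}
```

One way to use it is to keep the per-thread work fixed and call it with 1, 2, 4, ... threads. With an allocator that scales, the elapsed time should stay roughly flat as threads are added; an allocator that serializes on a shared lock would show the time growing instead.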