[PATCH] D71786: RFC: [Support] On Windows, add optional support for rpmalloc

Fangrui Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 30 17:23:57 PDT 2020


MaskRay added a comment.

In D71786#2124154 <https://reviews.llvm.org/D71786#2124154>, @aganea wrote:

> In D71786#2123929 <https://reviews.llvm.org/D71786#2123929>, @MaskRay wrote:
>
> > Second, adding a malloc implementation does not seem suitable to me. I think this particular comment should be addressed https://reviews.llvm.org/D71786#1984285
>
>
> Sorry, which part of that comment should be addressed Fangrui?


Hi, I think you answered in the next paragraph.

>> Is replacing the malloc implementation on difficult on Windows that the compiler has to ship a copy?
> 
> Windows doesn't have a native LD_PRELOAD mechanism. The only other way would be to use Detours <https://github.com/microsoft/Detours> or do runtime patching of the EXE, however @maniccoder seems to advise against it: https://reviews.llvm.org/D71786#1812576 - I've never used Detours personally, maybe others can relate. That also means using /MD (dynamic CRT) instead of /MT (static CRT), which makes the executable slower //de facto//, because of the DLL thunking.
> 
> However that leaves the official LLVM releases & git checkouts with using the default Windows heap, which doesn't scale. The execution time of any heavily-allocating threaded workload like ThinLTO, is inversely proportional to the number of cores. The more people will have 32- or 64-core CPUs, the more this will become apparent. Microsoft's own compiler uses tbbmalloc since VS2019.

Thanks for the explanation. Lacking of an out-of-box LD_PRELOAD sound sounds readlly unfortunate ;-)

> We are only suggesting this improvement for Windows. Other LLVM platforms could remain as they are. Perhaps this needs a RFC on the mailing list, but the gains are pretty clear, we're talking of at least an order of magnitude in ThinLTO execution time on many-core machines.

I can see if makes a lot of improvement for Windows. This patch pulls in a third party project with more than 3000 lines of code. I guess a RFC is probably deserved.

Finally, let me end with a hopefully constructive argument:) If the malloc interface is well defined on Windows, can -fintegrated-crt-alloc be generalized to take a value - path to the precompiled malloc implementation, so that we don't need to check in a specific implementation & users can easily use whatever implementations they want?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71786/new/

https://reviews.llvm.org/D71786





More information about the llvm-commits mailing list