[PATCH] D71786: RFC: [Support] On Windows, add optional support for rpmalloc

Fri Jan 10 06:46:12 PST 2020

riccibruno added a comment.

In D71786#1813866 <https://reviews.llvm.org/D71786#1813866>, @russell.gallop wrote:

> In D71786#1812388 <https://reviews.llvm.org/D71786#1812388>, @riccibruno wrote:
>
> > Am I the only one who is a little bit uncomfortable with including a replacement to the system allocator in LLVM?
>
>
> No, you're not. I don't think it's ideal but I'm not sure what the alternative is to remove this bottleneck for ThinLTO on Windows.
>
> It might be possible to use Win32 API (https://docs.microsoft.com/en-gb/windows/win32/api/heapapi/) HeapCreate() to create a heap for each thread and HeapAlloc() with HEAP_NO_SERIALIZE but HeapCreate() heaps are finite so you'd have to handle exhausting that and by then you're practically writing a new memory allocator...

Are they? I only looked quickly at the reference for `HeapCreate`, but I see: `If dwMaximumSize is 0, the heap can grow in size. The heap's size is limited only by the available memory. [...]` However this does not help if the allocations are all over the place since you have to explicitly use `HeapAlloc`.

> It does feel strange that we have to replace the default allocator to get efficient memory allocation on multiple threads, which doesn't sound like an uncommon requirement to me.
> 
> Including the entire rpmalloc source does add quite a bit of source code in one go. Would it be better to keep the rpmalloc source outside LLVM and just include hooks?
> 
> With all respect to @maniccoder, there are other replacement memory allocators that could be considered as alternatives (e.g.):
>  tcmalloc from https://github.com/gperftools/gperftools (as @rnk mentioned above)
>  https://github.com/microsoft/mimalloc
> 
>> It seems to me that this is something that is clearly under the responsibility of the C library. Now this might still be worth doing given the large performance improvement, but yuck...
> 
> I wouldn't be happy with unconditionally replacing the C library malloc (without a cmake option). LLVM should certainly continue to be tested, and work, with the default allocator as we wouldn't want to unintentionally develop a dependence on an alternative.
> 
> I guess we can hope that the standard implementation improves with increased use of multi-threaded code. glibc for instance added a thread cache to malloc in 2017: https://sourceware.org/ml/libc-alpha/2017-08/msg00010.html

Yes, really it does not seem like an uncommon usage pattern. In the meanwhile I don't have a better idea here and I don't want to block this if this is what is needed.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71786/new/

https://reviews.llvm.org/D71786