[LLVMdev] multithreaded performance disaster with -fprofile-instr-generate (contention on profile counters)

Dmitry Vyukov dvyukov at google.com
Fri Apr 18 02:53:18 PDT 2014


On Fri, Apr 18, 2014 at 1:41 PM, Chandler Carruth <chandlerc at google.com> wrote:

>
> On Fri, Apr 18, 2014 at 2:30 AM, Dmitry Vyukov <dvyukov at google.com> wrote:
>
>>> It's not at all clear to me that this scales up (either in memory usage,
>>> memory reservation, or shutdown time) to larger applications. Chrome isn't
>>> a useful upper bound here.
>>>
>>
>> Array processing is fast. Contention is slow. I would expect this to be a
>> net win.
>> For the additional memory consumption during final merge, we can process
>> one per-thread array, unmap it, process second array, unmap it, and so on.
>> This will not require bringing all the pages into memory.
>>
>
> Array processing is fast, but paging in a large % of the pages in your
> address space is not at all fast. This will murder the kernel's page table,
> and do other very slow things I suspect.
>


If the pages were already written to, then they are already mapped. If
they were never written to and you are only *reading* them, then the
kernel backs them all with a single shared zero page. That costs virtually
nothing in either time or memory.
I do not foresee any issues here.
So, on second thought, my previous statement that we need to unmap
processed arrays was wrong: there is no need to unmap them.
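
To make the argument concrete, here is a minimal sketch of the final merge,
assuming Linux-style anonymous private mappings. The names reserve_counters
and merge_counters are hypothetical and are not part of any existing profile
runtime; the per-thread layout is only an illustration of the idea discussed
above.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical helper: reserve one counter array as anonymous private
 * memory. The mapping reads as zeros, and a page only gets its own
 * physical frame once something writes to it. */
static uint64_t *reserve_counters(size_t n) {
    void *p = mmap(NULL, n * sizeof(uint64_t), PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : (uint64_t *)p;
}

/* Hypothetical helper: add every per-thread array into total. The
 * per-thread arrays are only read here. Zero counters are skipped so
 * that pages of total are written only where some count is non-zero. */
static void merge_counters(uint64_t *total, uint64_t *const *per_thread,
                           size_t nthreads, size_t n) {
    assert(total != NULL && per_thread != NULL);
    for (size_t t = 0; t < nthreads; t++) {
        for (size_t i = 0; i < n; i++) {
            uint64_t v = per_thread[t][i];
            if (v != 0)
                total[i] += v;
        }
    }
}
```

The merge loop never unmaps anything: it simply walks each per-thread array
in turn. Whether the read faults on untouched pages are cheap enough for very
large address spaces is exactly the open question in this thread, so treat
this as an illustration rather than a measurement.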