[LLVMdev] multithreaded performance disaster with -fprofile-instr-generate (contention on profile counters)

Xinliang David Li xinliangli at gmail.com
Thu Apr 17 11:33:35 PDT 2014


Yes -- as an option. It would be unwise to penalize single-threaded apps
unconditionally.

David


On Thu, Apr 17, 2014 at 11:22 AM, Bob Wilson <bob.wilson at apple.com> wrote:

>
> On Apr 17, 2014, at 11:09 AM, Xinliang David Li <xinliangli at gmail.com>
> wrote:
>
>
> On Thu, Apr 17, 2014 at 10:58 AM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
>
>>
>> On 2014-Apr-17, at 10:38, Xinliang David Li <xinliangli at gmail.com> wrote:
>>
>> >
>> > Another idea is to use stack-local counters per function, synced up
>> > with global counters on entry and exit. The problem with it is that for
>> > deeply recursive calls, stack pressure can be too high.
>>
>> I think they'd need to be synced with global counters before function
>> calls as well, since any function call can call "exit()".
>>
>
> Right -- but it might be better to handle this in other ways. For instance,
> a stack of counters, one per frame, could be maintained and flushed in a
> batch at exit. Or simply ignore it in the case of program exit.
>
>
> It seems to me like we’re going to have a hard time getting good
> multithreaded performance without significant impact on the single-threaded
> behavior. We might need to add an option to choose between those. There’s a
> lot of room for improvement in the performance with the current
> instrumentation, so maybe we can find a way to make things incrementally
> better in a way that helps both, but avoiding the multithreaded cache
> conflicts seems like it’s going to be expensive in other ways.
>
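
For illustration, the stack-local counter idea quoted above might look roughly
like the sketch below. It is a hand-written C++ approximation, not what
-fprofile-instr-generate emits; the names g_counters, LocalCounters,
instrumented and may_exit are all hypothetical. Each call counts into plain
stack slots and adds them to the shared counters once when the frame is left,
and also before any call that might reach exit().

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical shared profile counters for one function:
// slot 0 = function entry, slot 1 = loop body.
constexpr int kNumCounters = 2;
std::atomic<std::uint64_t> g_counters[kNumCounters];

// Counts into plain stack slots and publishes them to the shared
// counters when flushed or when the frame is left.
struct LocalCounters {
  std::uint64_t local[kNumCounters] = {};

  void flush() {
    for (int i = 0; i < kNumCounters; ++i) {
      if (local[i] != 0) {
        // One atomic add per counter per flush, instead of one shared
        // write per increment.
        g_counters[i].fetch_add(local[i], std::memory_order_relaxed);
        local[i] = 0;
      }
    }
  }

  ~LocalCounters() { flush(); }
};

// Stand-in for a callee that could call exit(); it does nothing here.
void may_exit() {}

// Stand-in for an instrumented function.
void instrumented(int iterations) {
  LocalCounters c;
  ++c.local[0];              // entry count, kept on the stack
  for (int i = 0; i < iterations; ++i)
    ++c.local[1];            // loop-body count, kept on the stack
  c.flush();                 // sync before a call that may not return
  may_exit();
}                            // destructor flushes anything counted since
```

The extra stack slots in every frame are the stack-pressure cost mentioned
above for deep recursion, and the flush before each call is overhead that a
single-threaded program pays too, which is why this would want to be optional.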