[PATCH] [lld][Core] Implement parallel_for_each

Mon Mar 16 12:21:01 PDT 2015

On 3/16/2015 2:07 PM, Davide Italiano wrote:
> On Mon, Mar 16, 2015 at 12:04 PM, Davide Italiano <davide at freebsd.org> wrote:
>> On Mon, Mar 16, 2015 at 11:50 AM, Shankar Easwaran
>> <shankare at codeaurora.org> wrote:
>>> Agree, we need to debug this, but I am unaware of tools that can show how
>>> threads are being scheduled. Any pointers?
>>>
>> As I mentioned in the LLD performance thread, I'm aware of VTune, which
>> does that (if you have a license for it). It also has a very nice
>> 'lock analysis' feature that might help in this case.
>> That said, I vote for getting the very first version you proposed in
>> the tree and refine things later, unless there's strong opposition. No
>> particular benefits or regressions were found, but the patch I built on
>> top of it (linked in the other thread) showed up a significant
>> benchmarks for my use case.
>>
> This should be read as "showed a significant speedup", sorry.
> Also, I'm definitely in favour of having an iterative rather than
> recursive version (if others agree and you can build an iterative
> version of the very first patch I will try getting some #s later
> today).
>
It would be trivial to change the non-recursive implementation to match
the first implementation. I will post a patch soon.
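
For readers following the iterative-versus-recursive point, here is a
rough sketch of what an iterative parallel_for_each could look like. It
is not the patch under discussion: the names
parallel_for_each_iterative and sum_with_parallel_for_each, the
chunk_size parameter, and the one-std::thread-per-chunk strategy are
all assumptions made for illustration. A real implementation would
more likely hand chunks to a thread pool.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <thread>
#include <vector>

// Hypothetical helper, not lld's API: applies fn to every element of
// [begin, end). The range is cut into fixed-size chunks by a plain loop
// (no recursive splitting), and each chunk runs on its own std::thread.
template <class RandomIt, class Func>
void parallel_for_each_iterative(RandomIt begin, RandomIt end, Func fn,
                                 std::ptrdiff_t chunk_size = 1024) {
  assert(chunk_size > 0 && "chunk_size must be positive");
  std::vector<std::thread> workers;
  while (begin != end) {
    std::ptrdiff_t remaining =
        static_cast<std::ptrdiff_t>(std::distance(begin, end));
    RandomIt chunk_end = begin + std::min(chunk_size, remaining);
    // fn outlives the workers because they are all joined before returning.
    workers.emplace_back(
        [begin, chunk_end, &fn] { std::for_each(begin, chunk_end, fn); });
    begin = chunk_end;
  }
  for (std::thread &t : workers)
    t.join();
}

// Usage: sum the integers 0..n-1 through an atomic accumulator.
long sum_with_parallel_for_each(int n) {
  std::vector<int> values(n);
  for (int i = 0; i < n; ++i)
    values[i] = i;
  std::atomic<long> total{0};
  parallel_for_each_iterative(values.begin(), values.end(),
                              [&total](int v) { total += v; }, 256);
  return total.load();
}
```

The loop makes the number of spawned tasks explicit (one per chunk),
which is the property that makes an iterative version easier to tune
and to profile than a recursive split.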

Shankar Easwaran

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation