[PATCH] [lld][Core] Implement parallel_for_each

Sun Mar 15 18:42:40 PDT 2015

When we optimize something, we generally don't care about one particular result out of many. Picking the best number hardly describes the change you made, and it is not a scientific thing to do, because observations always include noise (you could even observe an "improvement" without making any change if you pick the best number from multiple runs). Please keep that in mind when you do optimization work.
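
To make the point about noise concrete, here is a minimal sketch of how the timings from repeated runs could be summarized. It is not taken from the patch; RunStats and summarize are hypothetical names, and the samples are assumed to be wall-clock times collected by whatever harness you use.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical summary of repeated benchmark runs (not part of lld).
struct RunStats {
  double best;   // smallest sample; tempting to report, but dominated by noise
  double mean;   // arithmetic mean of all samples
  double median; // middle sample, robust against outliers
  double stddev; // sample standard deviation (n - 1 in the denominator)
};

// Takes the samples by value because they are sorted in place.
RunStats summarize(std::vector<double> samples) {
  RunStats s{0.0, 0.0, 0.0, 0.0};
  if (samples.empty())
    return s;

  std::sort(samples.begin(), samples.end());
  const std::size_t n = samples.size();

  s.best = samples.front();
  s.mean = std::accumulate(samples.begin(), samples.end(), 0.0) / n;
  s.median = (n % 2 == 1) ? samples[n / 2]
                          : (samples[n / 2 - 1] + samples[n / 2]) / 2.0;

  if (n > 1) {
    double sumSq = 0.0;
    for (double x : samples)
      sumSq += (x - s.mean) * (x - s.mean);
    s.stddev = std::sqrt(sumSq / (n - 1));
  }
  return s;
}
```

If the difference between the "before" and "after" means is smaller than the standard deviation of either set of runs, the change is probably not distinguishable from noise.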

As for the patch, it splits the array into chunks of 512 elements each. If we have, say, 1 million items, that creates about 2000 tasks. We of course cannot execute 2000 tasks in parallel; the maximum parallelism is bounded by the number of cores, which is what std::thread::hardware_concurrency() reports. What if you divided the work by hardware_concurrency instead? Each task becomes larger (the array size divided by the number of cores), so it needs less synchronization, which is good. But if the load is not evenly distributed across tasks, we have to wait for the slowest worker, which might have a negative impact. This may be worth testing.
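
A rough sketch of the alternative I have in mind, assuming random-access iterators and plain std::thread instead of whatever task mechanism the patch uses. splitRange and parallelForEachByCores are hypothetical names for illustration, not existing lld functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: split [0, n) into at most `parts` contiguous
// half-open ranges whose sizes differ by at most one element.
std::vector<std::pair<std::size_t, std::size_t>>
splitRange(std::size_t n, std::size_t parts) {
  std::vector<std::pair<std::size_t, std::size_t>> ranges;
  if (n == 0 || parts == 0)
    return ranges;
  parts = std::min(parts, n); // never create empty ranges
  const std::size_t base = n / parts;
  const std::size_t extra = n % parts; // first `extra` ranges get one more
  std::size_t begin = 0;
  for (std::size_t i = 0; i < parts; ++i) {
    const std::size_t len = base + (i < extra ? 1 : 0);
    ranges.emplace_back(begin, begin + len);
    begin += len;
  }
  assert(begin == n); // the ranges cover the whole input
  return ranges;
}

// Sketch: one task per hardware thread instead of one task per 512 items.
// Requires random-access iterators; fn must be safe to call concurrently
// on distinct elements.
template <class Iter, class Func>
void parallelForEachByCores(Iter begin, Iter end, Func fn) {
  const std::size_t n = static_cast<std::size_t>(end - begin);
  unsigned cores = std::thread::hardware_concurrency();
  if (cores == 0) // the standard allows 0 when the value is unknown
    cores = 1;

  std::vector<std::thread> workers;
  for (const auto &r : splitRange(n, cores))
    workers.emplace_back([begin, r, fn] {
      std::for_each(begin + r.first, begin + r.second, fn);
    });
  for (auto &t : workers)
    t.join(); // we wait here for the slowest worker
}
```

With 1 million items on an 8-core machine this creates 8 tasks of 125000 items each, rather than about 2000 tasks of 512. The join loop is where uneven load would show up, so both variants should be measured on real inputs.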

http://reviews.llvm.org/D8348
