[PATCH] D15258: [PGO] Remove data races on Data->Values field

David Li via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 17 17:40:27 PST 2015


davidxl added a comment.

Regarding performance: fwrite buffers the data, so the number of write system calls is much smaller than the number of Data entries. The overall time is dominated by IO, not by system calls.

Here is the design of the stress testing:

1. total number of value data entries to be written out : 3 million
2. total size of the value data: 1.6G -- this is far larger than an average program produces; for instance, clang's raw profile data size is about 100M

The test machine is a Sandy Bridge machine.

Results:
a) write out VP data one by one without batching

- total number of calls to fwrite: 3M,
- total number of calls to write: ~390K.
- real time: ~12s; sys time: ~2.5s

b) write out VP data in batches -- batch size is 1024 (i.e., copy 1024 VP data entries into a buffer and write the buffer out)

- total number of calls to fwrite: 3K
- total number of calls to write: ~6K (yes, it is more than the number of calls to fwrite -- a very large write can be split into smaller chunks).
- real time: ~12s, sys time: ~2s

The savings from the reduced number of system calls are small.
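
For illustration only, here is a minimal sketch of the two write strategies compared above. It is not the code from the patch: the VPEntry record layout, the function names, and the array-of-pointers input are assumptions made up for this example. Only the batch size of 1024 comes from the experiment.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one value-profile record (not the real layout). */
typedef struct {
  unsigned long long value;
  unsigned long long count;
} VPEntry;

enum { BATCH_SIZE = 1024 }; /* batch size used in experiment (b) */

/* Strategy (a): one fwrite per entry. stdio buffers the data, so the
 * number of write system calls is much smaller than n. */
int write_one_by_one(FILE *f, const VPEntry *const *entries, size_t n) {
  assert(f != NULL && entries != NULL);
  for (size_t i = 0; i < n; i++) {
    if (fwrite(entries[i], sizeof(VPEntry), 1, f) != 1)
      return -1;
  }
  return 0;
}

/* Strategy (b): copy up to BATCH_SIZE entries into a staging buffer,
 * then issue one fwrite per batch. The memcpy is the extra cost. */
int write_batched(FILE *f, const VPEntry *const *entries, size_t n) {
  assert(f != NULL && entries != NULL);
  VPEntry *buf = malloc(BATCH_SIZE * sizeof(VPEntry));
  if (buf == NULL)
    return -1;
  int rc = 0;
  for (size_t i = 0; i < n && rc == 0; i += BATCH_SIZE) {
    size_t k = (n - i < BATCH_SIZE) ? (n - i) : BATCH_SIZE;
    for (size_t j = 0; j < k; j++)
      memcpy(&buf[j], entries[i + j], sizeof(VPEntry));
    if (fwrite(buf, sizeof(VPEntry), k, f) != k)
      rc = -1;
  }
  free(buf);
  return rc;
}
```

Both functions produce the same bytes in the output file; they differ only in how many fwrite calls are made and in whether the staging copy is needed.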

In another experiment, /dev/null is used as the output to remove IO.

a) for non batch case

- real time: ~1.6s  (average of 10 runs)
- sys time: ~0.85s (average of 10 runs)

b) with batch:

- real time: ~1.7s (average of 10 runs)
- sys time: ~0.73s (average of 10 runs).

Note that (b) has the additional cost of a memory copy.

Based on the above data, we can probably go with non-batch for simplicity for now. The batch write can be easily added later.


http://reviews.llvm.org/D15258




