[PATCH] D15258: [PGO] Remove data races on Data->Values field
David Li via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 17 17:40:27 PST 2015
davidxl added a comment.
Regarding performance: fwrite buffers the data, so the number of write system calls is far smaller than the number of Data entries. The overall time is dominated by IO, not system calls.
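As a minimal sketch of the non-batched approach described here (not the code in the patch), the function below calls fwrite once per entry and leaves it to stdio to coalesce the small writes into fewer write system calls. The `vp_entry` struct and the `write_unbatched` name are hypothetical stand-ins; the real value data layout in the profile runtime is different.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for one value-profile record; not the real layout. */
typedef struct {
    uint64_t value;
    uint64_t count;
} vp_entry;

/* Write entries one at a time with fwrite. A stream opened on a regular
 * file is fully buffered by default, so each call normally just appends to
 * the stdio buffer; the underlying write happens when the buffer fills.
 * Returns the number of entries written. */
size_t write_unbatched(FILE *out, const vp_entry *entries, size_t n) {
    size_t written = 0;
    for (size_t i = 0; i < n; ++i) {
        if (fwrite(&entries[i], sizeof entries[i], 1, out) != 1)
            break; /* stop at the first short write */
        ++written;
    }
    return written;
}
```

The caller still has to fflush or fclose the stream so the last partially filled buffer reaches the file.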
Here is the design of the stress testing:
1. total number of value data entries to be written out : 3 million
2. total size of the value data: 1.6G -- this is far larger than an average program produces; for instance, clang's raw profile data is about 100M
Test machine is a Sandy Bridge machine.
Results:
a) write out VP data one by one without batching
- total number of calls to fwrite: 3M,
- total number of calls to write: ~390K.
- real time: ~12s; sys time: ~2.5s
b) write out VP data in batches -- batch size is 1024 (i.e., copy 1024 VP data entries into a buffer and write it out)
- total number of calls to fwrite: 3K
- total number of calls to write: ~6K (yes, it is more than the number of calls to fwrite -- a very large write can be split into smaller chunks).
- real time: ~12s, sys time: ~2s
The savings from the reduced number of system calls are small.
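The call counts above can be sanity-checked with simple division. The sketch below assumes "1.6G" means 1.6e9 bytes and uses the rounded counts quoted in the results; `bytes_per_call` and `print_averages` are hypothetical helper names.

```c
#include <stdio.h>

/* Average number of bytes moved per call. */
double bytes_per_call(double total_bytes, double calls) {
    return total_bytes / calls;
}

/* Print the averages implied by the rounded figures in the results. */
void print_averages(void) {
    const double total = 1.6e9; /* "1.6G" read as 1.6e9 bytes (assumption) */
    printf("per entry:            %.0f bytes\n", bytes_per_call(total, 3e6));
    printf("per write, unbatched: %.0f bytes\n", bytes_per_call(total, 390e3));
    printf("per write, batched:   %.0f bytes\n", bytes_per_call(total, 6e3));
}
```

This gives about 533 bytes per entry and about 4100 bytes per write in the non-batched case. That would be consistent with a stdio buffer of around 4K, though the buffer size on the test machine is not stated here.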
In another experiment, /dev/null is used as the output to remove IO.
a) for non batch case
- real time: ~1.6s (average of 10 runs)
- sys time: ~0.85s (average of 10 runs)
b) with batch:
- real time: ~1.7s (average of 10 runs)
- sys time: ~0.73s (average of 10 runs).
Note that (b) has the additional cost of a memory copy.
Based on the above data, we can probably go with non-batch for simplicity for now. The batch write can be easily added later.
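For reference, a batched writer along the lines of experiment (b) could look roughly like the sketch below. It is an illustration under stated assumptions, not the code used for the measurements: `vp_node`, `vp_pair`, `BATCH_SIZE` and `write_batched` are hypothetical names, and the linked-list input is assumed only to show why a copy into a contiguous buffer is needed.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

enum { BATCH_SIZE = 1024 }; /* batch size used in experiment (b) */

/* Hypothetical in-memory node: the data to write is scattered in a list. */
typedef struct vp_node {
    uint64_t value;
    uint64_t count;
    struct vp_node *next;
} vp_node;

/* Hypothetical on-disk record. */
typedef struct {
    uint64_t value;
    uint64_t count;
} vp_pair;

/* Copy up to BATCH_SIZE records into a contiguous buffer, then hand the
 * whole buffer to a single fwrite call. Returns the number of records in
 * batches that were written completely. */
size_t write_batched(FILE *out, const vp_node *head) {
    vp_pair batch[BATCH_SIZE];
    size_t fill = 0, written = 0;
    for (const vp_node *n = head; n != NULL; n = n->next) {
        batch[fill].value = n->value; /* this copy is the extra cost of (b) */
        batch[fill].count = n->count;
        if (++fill == BATCH_SIZE) {
            if (fwrite(batch, sizeof batch[0], fill, out) != fill)
                return written;
            written += fill;
            fill = 0;
        }
    }
    if (fill > 0) { /* flush the final partial batch */
        if (fwrite(batch, sizeof batch[0], fill, out) != fill)
            return written;
        written += fill;
    }
    return written;
}
```

The output bytes are the same as writing each record individually; only the number of fwrite calls changes, at the price of the copy into `batch`.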
http://reviews.llvm.org/D15258