Benchmarking file output strategies

Sean Silva chisophugis at gmail.com
Mon Dec 15 19:20:12 PST 2014


Michael and I just did some experiments on his machine. It looks like
Windows is doing a huge amount of IO *after* the program exits (for both
mmap and write). Which reminds me: in your original post, the Windows 7 VM
mmap version (1.6s) is faster than the native Mac HFS+ version; those were
on the same machine, right? If so, it is weird that the VM was
outperforming the native OS, so something like this IO-after-program-exit
is probably at work.

We did find on Michael's machine (Win 8) that the mmap version was
generally slower, roughly similar to what I was seeing on my Mac (although
on the Mac the data was being committed to disk before the program exited).
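
If the OS defers IO until after the program exits, a timing that stops at
the last write understates the real cost. Below is a minimal sketch of one
way to keep that IO inside the measurement: flush and fsync before stopping
the clock. It is not the test program from this thread; the names
time_write, time_mmap, SIZE and PATTERN are made up for illustration, and
the buffer is built before the timer starts so CPU time for generating data
is excluded. On macOS, fsync alone may not force data to the physical
medium, so treat the numbers as approximate there.

```python
import mmap
import os
import time

SIZE = 16 * 1024 * 1024  # hypothetical size; the thread used 1GiB
PATTERN = bytes.fromhex("abcdabcdabcdabcd")  # constant fill, as in the quoted test


def time_write(path, size=SIZE):
    """Time one large write(), including the flush to the OS and fsync."""
    data = PATTERN * (size // len(PATTERN))  # built before the clock starts
    start = time.perf_counter()
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to commit before we stop timing
    return time.perf_counter() - start


def time_mmap(path, size=SIZE):
    """Time filling a file through a shared mapping, including the flush."""
    data = PATTERN * (size // len(PATTERN))
    start = time.perf_counter()
    with open(path, "w+b") as f:
        f.truncate(size)  # the file must have its final size before mapping
        with mmap.mmap(f.fileno(), size) as m:
            m[:] = data
            m.flush()  # write dirty pages back to the file
        os.fsync(f.fileno())
    return time.perf_counter() - start
```

Running both functions on the same filesystem and comparing the returned
times gives a rough write-vs-mmap comparison in which deferred IO is
counted rather than hidden.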

-- Sean Silva

On Mon, Dec 15, 2014 at 4:07 PM, Rafael Espíndola <
rafael.espindola at gmail.com> wrote:
>
> > One more thing: on my particular machine (new Mac Pro), Rafael's test
> > program is actually CPU-bottlenecked; the new Mac Pro's have insanely
> > fast SSD's connected over PCI-e. Just doing the CPU work of generating
> > the random numbers (1GiB version) takes 1.6s, which is basically the
> > same time that the write version takes; even just generating all 1GiB
> > of random numbers in place (no large memory allocation involved, no
> > file creation involved) takes 1.1s. Just writing 1GiB to disk
> > sequentially takes 1.0s.
> >
> > Also, it takes about .45 seconds to just memset a 1GiB malloc'd region
> > with 0's; most of this is virtual memory overhead, since if you reuse
> > the same 1GiB region, then after the first run it takes <0.1s.
>
> Interesting!
>
> I tried writing 0xabcdabcdabcdabcd instead of random numbers. The most
> interesting cases are probably
>
> tmpfs:
> 4.930983e-01
> 4.236851e-01
>
> The time difference dropped from 0.37s to just 0.069s.  Using less CPU
> for the number generation benefits the mmap run more than the
> write one.
>
> On Windows mmap is still faster by 3 to 5x depending on the run (with
> mmap taking about 1s).  Using WriteFile instead of write helps, but
> not by much.
>
> I guess the somewhat reasonable findings so far are
>
> * On Windows mmap is faster.
> * On POSIX, write can be faster, but by how much depends on what else
> the program is doing.
>
> OK. That is sufficient to convince me to just try porting llvm-ar to
> FileOutputBuffer and see what the impact is. Someone really motivated
> might want to check if lld would get any faster with writes on POSIX
> systems (but keep using mmap on Windows).
>
> Cheers,
> Rafael
>
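
The first-touch cost described in the quoted text (about .45 seconds to
memset a fresh 1GiB region versus <0.1s on reuse) can be probed with a
small sketch like the one below. It is an illustration, not the code used
for those measurements: fill_twice is a made-up name, and an anonymous mmap
stands in for the malloc'd region. The first fill has to fault every page
in; the second fill reuses pages that are already mapped.

```python
import mmap
import time


def fill_twice(size, byte=0):
    """Fill a fresh anonymous region twice; return (first, second) seconds."""
    data = bytes([byte]) * size  # source buffer, built outside the timings
    with mmap.mmap(-1, size) as region:  # -1 requests anonymous memory
        t0 = time.perf_counter()
        region[:] = data  # first touch: the OS must map each page
        first = time.perf_counter() - t0

        t0 = time.perf_counter()
        region[:] = data  # pages are already resident this time
        second = time.perf_counter() - t0
    return first, second
```

Calling fill_twice(1 << 30) mirrors the 1GiB case; how large the gap
between the two numbers is will depend on the OS and on available memory.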


More information about the llvm-commits mailing list