Benchmarking file output strategies

Sean Silva chisophugis at gmail.com
Mon Dec 15 14:12:08 PST 2014


On Sat, Dec 13, 2014 at 2:03 PM, Reid Kleckner <rnk at google.com> wrote:
>
> My best theory is that mmap has to map zero pages first. Typically
> everything is mapped to one zero page, and then every new page write causes
> a COW page fault. That might explain Sean's observations of lots of
> functions that execute #bytes/4096 times.
>
> I wonder if there are some tweaks to the mmap path that would help, like
> only mapping 10MB of output buffer per file and then remapping it to a
> +10MB offset. There are probably also things like MAP_POPULATE or
> explicitly asking for a few huge pages that might help.
>
>
>
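
For reference, here is a minimal C sketch of the mmap path being discussed. It is not Rafael's attached test program, which is not reproduced here. The helper names `expected_page_faults` and `write_file_mmap` are made up for illustration. MAP_POPULATE is Linux-specific, so it is only used when the platform defines it, and whether it actually helps is exactly the open question above.

```c
#define _GNU_SOURCE /* assumption: needed on glibc to expose MAP_POPULATE */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Number of first-touch page faults expected if every page of a fresh
   mapping is written once (hypothetical helper for the #bytes/4096
   estimate in the thread). */
static size_t expected_page_faults(size_t bytes, size_t page_size) {
    assert(page_size > 0);
    return (bytes + page_size - 1) / page_size;
}

/* Create `path`, resize it to `size` bytes, map it read-write and fill
   the mapping with `fill`. If `populate` is nonzero and MAP_POPULATE is
   available, ask the kernel to prefault the mapping up front.
   Returns 0 on success, -1 on failure. */
static int write_file_mmap(const char *path, size_t size,
                           unsigned char fill, int populate) {
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }
    int flags = MAP_SHARED;
#ifdef MAP_POPULATE
    if (populate)
        flags |= MAP_POPULATE;
#else
    (void)populate; /* no prefault flag on this platform */
#endif
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, flags, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }
    memset(p, fill, size); /* stand-in for writing the real output */
    int rc = munmap(p, size);
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

With 4096-byte pages, `expected_page_faults((size_t)1 << 30, 4096)` gives 262144 for the 1GiB case, which is the order of magnitude of the per-page call counts mentioned above.
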
One more thing: on my particular machine (a new Mac Pro), Rafael's test
program is actually CPU-bottlenecked; the new Mac Pros have insanely fast
SSDs connected over PCIe. Just doing the CPU work of generating the
random numbers (1GiB version) takes 1.6s, which is basically the same time
that the write version takes; even just generating all 1GiB of random
numbers in place (no large memory allocation involved, no file creation
involved) takes 1.1s. Just writing 1GiB to disk sequentially takes 1.0s.

Also, it takes about 0.45s just to memset a 1GiB malloc'd region with
zeros; most of this is virtual memory overhead, since if you reuse the same
1GiB region, then after the first run it takes <0.1s.
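
A rough sketch of how that memset measurement could be reproduced, in C. The helper names `now_seconds` and `time_memset_twice` are made up for illustration, and this is not the code I actually ran. It assumes a POSIX system with `clock_gettime` and calls memset through a volatile function pointer so the compiler cannot fold or drop either pass.

```c
#define _POSIX_C_SOURCE 200809L /* for clock_gettime */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Seconds on a monotonic clock. */
static double now_seconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* malloc `bytes`, zero the region twice, and report how long each pass
   took. The first pass touches fresh pages; the second reuses pages
   that are already mapped. Returns 0 on success, -1 if malloc fails. */
static int time_memset_twice(size_t bytes, double *first, double *second) {
    assert(bytes > 0 && first != NULL && second != NULL);
    /* Volatile pointer: keeps the compiler from merging malloc+memset
       into calloc or eliminating the second pass. */
    void *(*volatile do_memset)(void *, int, size_t) = memset;

    unsigned char *buf = malloc(bytes);
    if (buf == NULL)
        return -1;

    double t0 = now_seconds();
    do_memset(buf, 0, bytes); /* first touch */
    double t1 = now_seconds();
    do_memset(buf, 0, bytes); /* same region again */
    double t2 = now_seconds();

    *first = t1 - t0;
    *second = t2 - t1;
    free(buf);
    return 0;
}
```

Calling `time_memset_twice((size_t)1 << 30, &a, &b)` corresponds to the 1GiB case; the absolute numbers will of course depend on the machine and allocator.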

-- Sean Silva


>
> On Fri, Dec 12, 2014 at 3:30 PM, Rafael Ávila de Espíndola <
> rafael.espindola at gmail.com> wrote:
>>
>> It seems that the common wisdom on the fastest way to create a file is
>>
>> * create the file
>> * resize it to the final size
>> * mmap it rw
>> * write the data to the mapping
>>
>> I benchmarked that against doing 1 MB writes to create a 1GB file with
>> pseudo random data.
>>
>> The test program is attached. The results I got were (in seconds, mmap is
>> the first):
>>
>> btrfs
>> 1.752698e+00
>> 1.112864e+00
>>
>> tmpfs
>> 1.484731e+00
>> 1.113772e+00
>>
>> hfs+ (laptop)
>> 4.015817e+00
>> 2.240137e+00
>>
>> windows 7 (vm)
>> 1.609375e+00
>> 3.875000e+00
>>
>> ext2 on arm (old Google Chromebook):
>> 5.910171e+01
>> 6.566929e+01
>>
>> So on Windows it is true, mmap seems to be faster than writes. On Linux
>> and OS X x86_64 the situation is inverted. On arm mmap is a bit faster.
>>
>> It would be interesting to see if someone else can reproduce these
>> numbers. It would be particularly nice to try a newer arm system and
>> windows outside a vm.
>>
>> Also, does anyone have a theory of where the difference comes from?
>>
>> Cheers,
>> Rafael
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>>