[PATCH] D32481: [X86] memset should be using REPSTOS for memset on recent CPU (that have ERMS).

Clement Courbet via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 25 02:02:24 PDT 2017


courbet created this revision.

This substantially improves memset throughput for aligned buffers (nothing
changes for unaligned buffers).
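
For readers unfamiliar with the instruction, here is a minimal sketch of what
a REP STOS based fill does. It is not code from the patch, which changes the
memset lowering in X86SelectionDAGInfo.cpp. The helper name fill_rep_stosb is
hypothetical, the inline assembly assumes GCC or Clang on x86, and other
targets fall back to std::memset so the sketch still builds. ERMS stands for
"Enhanced REP MOVSB/STOSB", the CPU feature that makes this form fast.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not from the patch): store `value` into the n bytes
// starting at dst and return dst, like memset.
inline void *fill_rep_stosb(void *dst, unsigned char value, std::size_t n) {
  assert(dst != nullptr || n == 0);
#if (defined(__x86_64__) || defined(__i386__)) && \
    (defined(__GNUC__) || defined(__clang__))
  void *p = dst;          // RDI/EDI: advanced by the instruction
  std::size_t count = n;  // RCX/ECX: decremented until it reaches zero
  // AL holds the byte to store. The ABI keeps the direction flag clear,
  // so the stores proceed towards higher addresses.
  asm volatile("rep stosb"
               : "+D"(p), "+c"(count)
               : "a"(value)
               : "memory");
#else
  // Assumption: on non-x86 targets we simply defer to the library.
  std::memset(dst, value, n);
#endif
  return dst;
}
```

The whole fill is one instruction with the count in a register, which is why
it is attractive for the compiler to emit inline when the CPU handles it well.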

For example, on Haswell throughput roughly doubles, from 15 B/cycle before
this change to 30 B/cycle, which is close to the maximum bandwidth of
32 B/cycle.
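
To make the Haswell figures concrete, the arithmetic can be written out as
below. The three constants are the approximate numbers quoted above. The
function names are made up for this illustration and are not part of the
patch.

```cpp
// Approximate Haswell figures from the description, in bytes per cycle.
constexpr double kBefore = 15.0;  // before this change
constexpr double kAfter = 30.0;   // with this change
constexpr double kPeak = 32.0;    // maximum bandwidth

// Ratio of new to old throughput.
constexpr double speedup() { return kAfter / kBefore; }

// Fraction of the maximum bandwidth reached at a given throughput.
constexpr double utilization(double bytes_per_cycle) {
  return bytes_per_cycle / kPeak;
}

static_assert(speedup() == 2.0, "throughput doubles");
static_assert(utilization(kBefore) == 0.46875, "about 47% of peak before");
static_assert(utilization(kAfter) == 0.9375, "about 94% of peak after");
```

In other words, the change moves aligned memset from a little under half of
the available bandwidth to roughly 94% of it.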

See the graph here:
https://docs.google.com/spreadsheets/d/1bbT5Oqj3e5SFNh_5oKpwghEQuLazHI95E0-htGrADZ4/pubchart?oid=1858075526&format=interactive


https://reviews.llvm.org/D32481

Files:
  lib/Target/X86/X86SelectionDAGInfo.cpp
  test/CodeGen/X86/memset-large.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D32481.96524.patch
Type: text/x-patch
Size: 12353 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170425/3ed0e4d4/attachment.bin>

