[PATCH] D18566: [x86] use SSE/AVX ops for non-zero memsets (PR27100)

Tue Mar 29 16:26:23 PDT 2016

spatel updated this revision to Diff 52003.
spatel added a comment.

Patch updated:
Move the memset check down to the slow SSE case. This lets fast targets take advantage of SSE/AVX instructions, and it keeps slow targets from stepping into a codegen sinkhole while trying to splat a byte into an XMM reg.
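
To make the control-flow change concrete, here is a rough sketch of the decision described above. It is a simplified, hypothetical model and not the code in X86ISelLowering.cpp: the names TargetInfo, StoreKind and pickStoreKind are invented for illustration, the 16- and 32-byte thresholds are assumptions, and alignment is ignored entirely.

```cpp
#include <cstdint>

// Hypothetical, simplified model of the target features that matter here.
// None of these names come from LLVM.
struct TargetInfo {
  bool hasSSE2;
  bool hasAVX;
  bool unalignedSSEIsSlow;  // "slow SSE" target
};

enum class StoreKind { Scalar, Vec128, Vec256 };

// Pick the store width for a memset/memcpy of `size` bytes.
// isZeroMemset is only meaningful when isMemset is true.
StoreKind pickStoreKind(const TargetInfo &t, std::uint64_t size,
                        bool isMemset, bool isZeroMemset) {
  // Too small, or no vector unit: stay with scalar stores.
  if (!t.hasSSE2 || size < 16)
    return StoreKind::Scalar;

  if (!t.unalignedSSEIsSlow) {
    // Fast targets: vector stores for any memset value, zero or not.
    if (t.hasAVX && size >= 32)
      return StoreKind::Vec256;
    return StoreKind::Vec128;
  }

  // Slow targets: the memset check now lives here. A non-zero memset
  // falls back to scalar stores instead of splatting a byte into XMM.
  if (!isMemset || isZeroMemset)
    return StoreKind::Vec128;
  return StoreKind::Scalar;
}
```

In this model, a 16-byte non-zero memset gets Vec128 on a fast target and Scalar on a slow one, while a zero memset gets Vec128 on both.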

Note that, unlike the previous rev of the patch, this one leaves all existing regression tests unchanged; the only test changes are the tests that I added to model the request in PR27100.
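
For readers who want a source-level picture of the pattern in question, here is a small sketch of a fixed-size, non-zero memset together with the usual scalar way to splat a byte. The buffer size, the helper names fillNonZero and splatByte64, and the value used in the usage note are illustrative assumptions, not taken from the test file.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Fill a fixed 32-byte buffer with a (typically non-zero) byte value.
// A constant-size memset like this is the kind of call a backend can
// expand inline into a handful of stores.
std::array<std::uint8_t, 32> fillNonZero(std::uint8_t value) {
  std::array<std::uint8_t, 32> buf{};
  std::memset(buf.data(), value, buf.size());
  return buf;
}

// Scalar "splat": replicate one byte into all 8 bytes of a 64-bit
// integer by multiplying with a constant that has 0x01 in every byte.
std::uint64_t splatByte64(std::uint8_t value) {
  return static_cast<std::uint64_t>(value) * 0x0101010101010101ULL;
}
```

For example, splatByte64(0x2A) yields 0x2A2A2A2A2A2A2A2A; storing that value four times covers the same 32 bytes that fillNonZero(0x2A) writes.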

The questions about AVX1 codegen and the unexpected machine scheduler behavior are still open, but I think we can address those separately.

http://reviews.llvm.org/D18566

Files:
  lib/Target/X86/X86ISelLowering.cpp
  test/CodeGen/X86/memset-nonzero.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D18566.52003.patch
Type: text/x-patch
Size: 9467 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160329/49dfc1b5/attachment.bin>