[PATCH] D18676: [x86] avoid intermediate splat for non-zero memsets (PR27100)

Sanjay Patel via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 31 14:22:26 PDT 2016


spatel created this revision.
spatel added reviewers: zansari, RKSimon, hjl.tools.
spatel added a subscriber: llvm-commits.
Herald added a subscriber: mcrosier.

Follow-up to D18566, where we noticed that an intermediate splat was being generated for memsets of non-zero chars.

That was because we told getMemsetStores() to use a 32-bit vector element type, and it happily obliged by producing that constant using an integer multiply.
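
For illustration only (this is not code from the patch): the integer-multiply splat described above can be sketched in plain C++. The helper name splatByteToU32 is hypothetical; the sketch just shows the arithmetic by which a single memset byte gets widened to a 32-bit element.

```cpp
#include <cstdint>

// Hypothetical helper, not LLVM code: replicate one byte into all four
// bytes of a 32-bit value by multiplying it with 0x01010101.
constexpr std::uint32_t splatByteToU32(std::uint8_t Byte) {
  return static_cast<std::uint32_t>(Byte) * 0x01010101u;
}

// Example: the byte 0x2A widens to 0x2A2A2A2A.
static_assert(splatByteToU32(0x2A) == 0x2A2A2A2Au,
              "0x2A should splat to 0x2A2A2A2A");
```

With an 8-bit vector element type, each lane would presumably hold the memset byte directly, so this widening step would not be needed; that reading is inferred from the description above, not quoted from the patch.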

The tests that were added in the last patch are now equivalent for AVX1 and AVX2 (no splats, just a vector load), but we have PR27141 to track that splat difference. In the new tests, the splat via shuffling looks ok to me, but there might be some room for improvement depending on uarch there.

Note that I didn't change the SSE1/2 paths in this patch. I will follow up with that patch next. This patch should resolve PR27100.

http://reviews.llvm.org/D18676

Files:
  lib/Target/X86/X86ISelLowering.cpp
  test/CodeGen/X86/memset-nonzero.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D18676.52287.patch
Type: text/x-patch
Size: 11153 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160331/c8acab6d/attachment.bin>