[PATCH] D18676: [x86] avoid intermediate splat for non-zero memsets (PR27100)

Simon Pilgrim via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 1 04:56:41 PDT 2016


RKSimon added inline comments.

================
Comment at: test/CodeGen/X86/memset-nonzero.ll:94
@@ +93,3 @@
+; AVX-LABEL: memset_128_nonzero_bytes:
+; AVX:         vmovaps {{.*#+}} ymm0 = [42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42]
+; AVX-NEXT:    vmovups %ymm0, 96(%rdi)
----------------
andreadb wrote:
> I noticed that on AVX we now always generate a vmovaps to load a vector of constants.
> That's obviously fine. However, I wonder if a vbroadcastss would be more appropriate in this case as it would use a smaller constant (for code size only - in this example we would save 28 bytes).
This is being discussed in PR27141: it's proving tricky to determine when the broadcast is worth it and when it will cause register pressure issues.
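
For reference, the 28-byte figure quoted above can be reproduced with a rough sketch like the one below. It only restates the arithmetic (a 32-byte ymm constant versus a 4-byte scalar loaded by vbroadcastss) and the number of 32-byte stores a 128-byte memset needs. The function names are hypothetical and are not part of the patch or the test file; fill_128_nonzero is an assumed source-level equivalent of the test, not the IR it actually contains.

```c
#include <stddef.h>
#include <string.h>

/* Sizes in bytes: a ymm register is 32 bytes wide, and vbroadcastss
   loads a single 32-bit (4-byte) element. */
enum { YMM_BYTES = 32, BROADCAST_SCALAR_BYTES = 4 };

/* Constant-pool bytes saved by loading a 4-byte scalar with vbroadcastss
   instead of a full 32-byte vector constant with vmovaps. */
static size_t constant_bytes_saved(void) {
    return YMM_BYTES - BROADCAST_SCALAR_BYTES;
}

/* Number of 32-byte vmovups stores needed to cover a memset of len bytes.
   Assumes len is a multiple of 32, as in the 128-byte test. */
static size_t ymm_stores_needed(size_t len) {
    return len / YMM_BYTES;
}

/* Hypothetical source-level equivalent of the test case:
   a 128-byte memset with the non-zero value 42. */
void fill_128_nonzero(unsigned char *p) {
    memset(p, 42, 128);
}
```

This says nothing about the register pressure side of the trade-off, which is the part still under discussion.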


http://reviews.llvm.org/D18676




