[PATCH] D39232: [CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2).

Mon Oct 30 03:12:10 PDT 2017

courbet added inline comments.

================
Comment at: lib/Target/X86/X86TargetTransformInfo.cpp:2556-2557
+    if (ST->hasAVX512()) Options.LoadSizes.push_back(64);
+    if (ST->hasAVX()) Options.LoadSizes.push_back(32);
+    if (ST->hasSSE1()) Options.LoadSizes.push_back(16);
+    if (ST->is64Bit()) {
----------------
spatel wrote:
> This isn't correct (or at least it doesn't match what the DAG handles optimally). I've added extra runs to memcmp.ll, so we can see what happens for SSE1/AVX1 vs. SSE2/AVX2.
Right. I was basing this on getRegisterBitWidth(), but now I see that the DAG can do something else for the sake of performance. So I switched to 16B for >= SSE2 and 32B for >= AVX2.
Thanks for the tests.
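
For reference, the revised gating could look roughly like the standalone sketch below. It is only an illustration: `SubtargetFeatures` and `memcmpLoadSizes` are hypothetical stand-ins rather than the real X86Subtarget / TTI interfaces, the scalar 8/4/2/1 sizes are an assumption (that part of the hunk is not quoted above), and the 64B case is left out because this reply only covers the SSE2/AVX2 switch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the subtarget queries; not the real X86Subtarget.
// The flags are independent here, although real AVX2 hardware also has SSE2.
struct SubtargetFeatures {
  bool HasSSE2 = false;
  bool HasAVX2 = false;
  bool Is64Bit = false;
};

// Returns the candidate load sizes in bytes, largest first.
std::vector<unsigned> memcmpLoadSizes(const SubtargetFeatures &ST) {
  std::vector<unsigned> Sizes;
  if (ST.HasAVX2) Sizes.push_back(32); // was gated on AVX in the quoted hunk
  if (ST.HasSSE2) Sizes.push_back(16); // was gated on SSE1 in the quoted hunk
  if (ST.Is64Bit) Sizes.push_back(8);  // assumed: body of the is64Bit() branch
  Sizes.push_back(4);                  // assumed scalar fallbacks
  Sizes.push_back(2);
  Sizes.push_back(1);
  // Sanity check: sizes must be listed from largest to smallest.
  assert(std::is_sorted(Sizes.rbegin(), Sizes.rend()));
  return Sizes;
}
```

With this gating, an SSE1-only or AVX1-only configuration gets no 16B or 32B entry respectively, which is the distinction the extra memcmp.ll runs are meant to show.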

https://reviews.llvm.org/D39232