[PATCH] D69157: [X86] Prefer KORTEST on Knights Landing or later for memcmp()

Thu Oct 24 04:29:50 PDT 2019

davezarzycki marked 2 inline comments as done.
davezarzycki added a comment.

Thanks for getting back to me. This isn't urgent so please enjoy the conference!

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:42629

+    auto ScalarToVector = [&](SDValue X) -> SDValue {
+      X = DAG.getBitcast(CastVT, X);
----------------
craig.topper wrote:
> This isn't ScalarToVector. It's Vector to wider Vector right?
The memcmp expansion creates large scalar values that are normally bitcast to a vector with this closure. In the case of Xeon Phi, the closure may also widen the vector. If you have a better name for the closure, I'll happily rename it.
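
To make the "bitcast, then possibly widen" idea concrete, here is a rough portable model of it. This is only a sketch and not the code in the patch: the names Bytes16, Bytes64, bitcastAndWiden, neMask and equal16 are hypothetical, and zero-filling the upper lanes is an assumption chosen so that the padding always compares equal.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for a 128-bit "large scalar" and a 512-bit vector.
using Bytes16 = std::array<std::uint8_t, 16>;
using Bytes64 = std::array<std::uint8_t, 64>;

// Reinterpret 16 raw bytes as a narrow vector, then widen it to 64 bytes.
// Assumption: the extra lanes are zero-filled, so they never differ.
Bytes64 bitcastAndWiden(const void *P) {
  Bytes16 Narrow;
  std::memcpy(Narrow.data(), P, Narrow.size());
  Bytes64 Wide{}; // value-initialized, so every lane starts at zero
  std::memcpy(Wide.data(), Narrow.data(), Narrow.size());
  return Wide;
}

// Lane-wise "not equal" compare that yields one mask bit per byte lane,
// loosely modelling a compare into a mask register.
std::uint64_t neMask(const Bytes64 &A, const Bytes64 &B) {
  std::uint64_t Mask = 0;
  for (std::size_t I = 0; I < A.size(); ++I)
    if (A[I] != B[I])
      Mask |= std::uint64_t{1} << I;
  return Mask;
}

// Models memcmp(A, B, 16) == 0: the inputs are equal when no mask bit is set,
// which is the kind of "is the mask zero" question KORTEST answers.
bool equal16(const void *A, const void *B) {
  return neMask(bitcastAndWiden(A), bitcastAndWiden(B)) == 0;
}
```

In this model the widening step does not change the result of the comparison, since both operands receive identical padding; only the low 16 mask bits can ever be set.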

================
Comment at: lib/Target/X86/X86TargetTransformInfo.h:88
       X86::FeaturePrefer256Bit,
+      X86::FeaturePreferMaskRegisters,

----------------
craig.topper wrote:
> I think this should be with the CodeGen control options. The FeaturePrefer128Bit/256Bit were special because they are properties of the CPUs and they can be implied by a function attribute.
> 
> I can't explain why SlowUAMem32 and SlowUAMem16 are in separate sections....
Interesting. I was seriously considering naming this "SlowPTESTAndMOVMSK" to be consistent with the "Fast"/"Slow" naming pattern used for instruction-speed features. I can make this a CodeGen control option if you want, but please help me understand why slow PTEST/MOVMSK instructions (a.k.a. "prefer mask registers") are different from the other slow feature flags. Thanks!

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69157/new/

https://reviews.llvm.org/D69157