[PATCH] D69157: [X86] Prefer KORTEST on Knights Landing or later for memcmp()

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 24 10:56:57 PDT 2019


craig.topper added inline comments.


================
Comment at: lib/Target/X86/X86ISelLowering.cpp:42629
 
+    auto ScalarToVector = [&](SDValue X) -> SDValue {
+      X = DAG.getBitcast(CastVT, X);
----------------
davezarzycki wrote:
> craig.topper wrote:
> > This isn't ScalarToVector. It's Vector to wider Vector right?
> The memcmp expansion creates large scalar values that are normally bitcast to a vector with this closure. In the case of Xeon Phi, it may also widen the vector. If you have a better name for the closure, I'll happily rename it.
I missed that the bitcast is in here too. So it is scalar to vector. Sorry about that.
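For readers outside the thread: the memcmp expansion being discussed turns a fixed-size memcmp into wide integer loads whose XOR/OR reduction decides equality; the closure then bitcasts those wide scalars into vectors for the vector compare. A rough standalone sketch of the scalar side (not the actual LLVM lowering; memeq16 is a hypothetical name used here for illustration):

```cpp
#include <cstdint>
#include <cstring>

// Illustrative only: a 16-byte equality check expanded into two
// 64-bit scalar loads. A backend would bitcast these wide scalar
// values to a vector (what the ScalarToVector closure above does)
// and, on targets that prefer mask registers, test the result with
// KORTEST instead of PTEST/MOVMSK.
static bool memeq16(const void *a, const void *b) {
  std::uint64_t a0, a1, b0, b1;
  std::memcpy(&a0, a, 8); // memcpy makes the unaligned loads legal
  std::memcpy(&a1, static_cast<const char *>(a) + 8, 8);
  std::memcpy(&b0, b, 8);
  std::memcpy(&b1, static_cast<const char *>(b) + 8, 8);
  // Equal iff no bit differs in either half.
  return ((a0 ^ b0) | (a1 ^ b1)) == 0;
}
```

On Knights Landing, the point of the patch is that the final vector test is cheaper through a mask register (KORTEST) than through the PTEST/MOVMSK sequence used on other CPUs.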


================
Comment at: lib/Target/X86/X86TargetTransformInfo.h:88
       X86::FeaturePrefer256Bit,
+      X86::FeaturePreferMaskRegisters,
 
----------------
davezarzycki wrote:
> craig.topper wrote:
> > I think this should be with the CodeGen control options. The FeaturePrefer128Bit/256Bit were special because they are properties of the CPUs and they can be implied by a function attribute.
> > 
> > I can't explain why SlowUAMem32 and SlowUAMem16 are in separate sections....
> Interesting. I was seriously considering naming this "SlowPTESTAndMOVMSK" to be consistent with the "Fast" and "Slow" pattern for fast/slow instructions. I can make this a CodeGen control option if you want, but please help me understand why slow PTEST/MOVMSK instructions (a.k.a. "prefer mask registers") are different from the other slow feature flags. Thanks!
It's not different; that's why I wanted it grouped with the Fast/Slow flags.
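For context, the feature flag under discussion would be declared in X86.td alongside the other subtarget features; a sketch of what such a declaration typically looks like (the exact string names here are an assumption, not copied from the patch):

```
def FeaturePreferMaskRegisters
    : SubtargetFeature<"prefer-mask-registers", "PreferMaskRegisters", "true",
                       "Prefer AVX512 mask registers over PTEST/MOVMSK">;
```

The grouping question above is purely about where the enum lands in X86TargetTransformInfo.h (with the Fast/Slow tuning flags vs. the CodeGen control options), not about how the feature itself is defined.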


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69157/new/

https://reviews.llvm.org/D69157





More information about the llvm-commits mailing list