[PATCH] D69157: [X86] Prefer KORTEST on Knights Landing or later for memcmp()
David Zarzycki via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 24 04:29:50 PDT 2019
davezarzycki marked 2 inline comments as done.
davezarzycki added a comment.
Thanks for getting back to me. This isn't urgent so please enjoy the conference!
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:42629
+ auto ScalarToVector = [&](SDValue X) -> SDValue {
+ X = DAG.getBitcast(CastVT, X);
----------------
craig.topper wrote:
> This isn't ScalarToVector. It's Vector to wider Vector right?
The memcmp expansion creates large scalar values that are normally bitcast to a vector by this closure. On Xeon Phi, the closure may also widen the vector. If you have a better name for the closure, I'll happily rename it.
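To illustrate the idea outside of SelectionDAG, here is a rough, portable C++ sketch. It is not the code from the patch. The names `Wide256`, `Vec512`, `bitcastAndWiden`, `neMask`, and `equal32` are hypothetical, and padding the upper lanes with zeros is an assumption made only so the example is well defined. It models a 256-bit "scalar" being reinterpreted as vector lanes, widened to 512 bits, and compared into a one-bit-per-lane mask that is then tested against zero, which is roughly what a mask-register compare followed by KORTEST would do.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a 256-bit "scalar" produced by memcmp expansion.
struct Wide256 {
  unsigned char bytes[32];
};

// Hypothetical stand-in for a 512-bit vector of sixteen 32-bit lanes.
using Vec512 = std::array<std::uint32_t, 16>;

// Reinterpret the bits as lanes (the "bitcast"), then pad to 512 bits
// (the "widen"). Zero padding is an assumption for this sketch.
Vec512 bitcastAndWiden(const Wide256 &x) {
  Vec512 v{}; // value-initialized, so the upper eight lanes are zero
  std::memcpy(v.data(), x.bytes, sizeof(x.bytes));
  return v;
}

// Per-lane "not equal" compare producing one bit per lane, similar in
// spirit to a compare that writes an AVX-512 mask register.
std::uint16_t neMask(const Vec512 &a, const Vec512 &b) {
  std::uint16_t m = 0;
  for (unsigned i = 0; i < 16; ++i)
    if (a[i] != b[i])
      m = static_cast<std::uint16_t>(m | (1u << i));
  return m;
}

// Equality-only comparison of two 32-byte buffers: equal exactly when the
// mask is all zeros (the KORTEST-like step).
bool equal32(const void *p, const void *q) {
  Wide256 a, b;
  std::memcpy(a.bytes, p, sizeof(a.bytes));
  std::memcpy(b.bytes, q, sizeof(b.bytes));
  return neMask(bitcastAndWiden(a), bitcastAndWiden(b)) == 0;
}
```

Because both operands are padded the same way, the extra lanes always compare equal and never affect the result.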
================
Comment at: lib/Target/X86/X86TargetTransformInfo.h:88
X86::FeaturePrefer256Bit,
+ X86::FeaturePreferMaskRegisters,
----------------
craig.topper wrote:
> I think this should be with the CodeGen control options. The FeaturePrefer128Bit/256Bit were special because they are properties of the CPUs and they can be implied by a function attribute.
>
> I can't explain why SlowUAMem32 and SlowUAMem16 are in separate sections....
Interesting. I was seriously considering naming this "SlowPTESTAndMOVMSK" to be consistent with the "Fast" and "Slow" naming pattern for fast/slow instructions. I can make this a CodeGen control option if you want, but please help me understand why slow PTEST/MOVMSK instructions (a.k.a. "prefer mask registers") should be treated differently from the other slow feature flags. Thanks!
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D69157/new/
https://reviews.llvm.org/D69157