[PATCH] D52392: [X86] For lzcnt/tzcnt intrinsics use cttz/ctlz intrinsics with zero_undef flag set to false.
Craig Topper via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Sep 22 10:18:46 PDT 2018
craig.topper created this revision.
craig.topper added reviewers: RKSimon, spatel.
Herald added a subscriber: cfe-commits.
Previously we used a select around the zero_undef=true intrinsic. At -O2 this pattern gets optimized to the zero_undef=false form, but at -O0 that optimization does not run. The result is a compare and cmov wrapped around the tzcnt/lzcnt instruction.
By using the zero_undef=false intrinsic directly without the select, we can improve the -O0 codegen to just an lzcnt/tzcnt instruction.
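For illustration only, here is a rough C sketch of the source-level shape that leads to the select: a zero check guarding a builtin whose result is undefined for zero. The function names are hypothetical and are not the ones in bmiintrin.h or lzcntintrin.h, and the sketch assumes a GCC/Clang-compatible compiler with a 32-bit unsigned int. The loop-based reference functions only spell out the semantics that a zero_undef=false count provides (a result of 32 for a zero input); they are not how the patch implements it.

```c
#include <assert.h>
#include <stdint.h>

/* Guarded shape (hypothetical names): __builtin_ctz(0) and __builtin_clz(0)
 * are undefined, so the zero input is handled by an explicit select. */
uint32_t tzcnt32_guarded(uint32_t x) {
    return x ? (uint32_t)__builtin_ctz(x) : 32u;
}

uint32_t lzcnt32_guarded(uint32_t x) {
    return x ? (uint32_t)__builtin_clz(x) : 32u;
}

/* Reference semantics, defined for every input including zero:
 * count zero bits from the low end until the first set bit. */
uint32_t tzcnt32_reference(uint32_t x) {
    uint32_t n = 0;
    while (n < 32u && ((x >> n) & 1u) == 0u)
        n++;
    return n;
}

/* Same idea from the high end. */
uint32_t lzcnt32_reference(uint32_t x) {
    uint32_t n = 0;
    while (n < 32u && ((x << n) & 0x80000000u) == 0u)
        n++;
    return n;
}

/* Small self-check: the guarded shape and the reference agree on zero
 * and on a handful of nonzero inputs. */
void check_count_shapes(void) {
    const uint32_t samples[] = {0u, 1u, 2u, 8u, 0x00010000u,
                                0x80000000u, 0xFFFFFFFFu};
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        assert(tzcnt32_guarded(samples[i]) == tzcnt32_reference(samples[i]));
        assert(lzcnt32_guarded(samples[i]) == lzcnt32_reference(samples[i]));
    }
}
```

Since the guarded and unguarded forms compute the same value for every input, dropping the select changes only the code emitted when the optimizer does not fold it away.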
Repository:
rC Clang
https://reviews.llvm.org/D52392
Files:
include/clang/Basic/BuiltinsX86.def
include/clang/Basic/BuiltinsX86_64.def
lib/CodeGen/CGBuiltin.cpp
lib/Headers/bmiintrin.h
lib/Headers/lzcntintrin.h
test/CodeGen/bmi-builtins.c
test/CodeGen/lzcnt-builtins.c
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D52392.166606.patch
Type: text/x-patch
Size: 8782 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180922/99db1f18/attachment.bin>