[PATCH] D52392: [X86] For lzcnt/tzcnt intrinsics use cttz/ctlz intrinsics with zero_undef flag set to false.

Sat Sep 22 10:18:46 PDT 2018

craig.topper created this revision.
craig.topper added reviewers: RKSimon, spatel.
Herald added a subscriber: cfe-commits.

Previously we emitted a select around the zero_undef=true intrinsic. At -O2 this pattern gets optimized to the zero_undef=false form, but at -O0 that optimization does not run, so the generated code wraps a compare and cmov around the tzcnt/lzcnt instruction.

By using the zero_undef=false intrinsic directly without the select, we can improve the -O0 codegen to just an lzcnt/tzcnt instruction.
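
To illustrate the semantics in question, here is a minimal sketch in C. It is not taken from the patch. The function names are hypothetical, and the `x ? __builtin_ctz(x) : 32` form is only assumed to resemble the old header pattern. The select variants guard a count that is undefined for zero, while the reference loops spell out the zero_undef=false behaviour, where a zero input yields the bit width. Both should agree on every input, which is why the select is redundant once the intrinsic itself defines the zero case.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed shape of the old lowering: a select guarding a count that is
   undefined for zero (the zero_undef=true form). */
static unsigned tzcnt32_select(uint32_t x) {
    return x ? (unsigned)__builtin_ctz(x) : 32u;
}

static unsigned lzcnt32_select(uint32_t x) {
    return x ? (unsigned)__builtin_clz(x) : 32u;
}

/* Bit-by-bit reference for the zero_undef=false semantics: defined for
   every input, with zero mapping to the bit width (32). */
static unsigned tzcnt32_reference(uint32_t x) {
    unsigned n = 0;
    while (n < 32u && ((x >> n) & 1u) == 0u)
        ++n;
    return n;
}

static unsigned lzcnt32_reference(uint32_t x) {
    unsigned n = 0;
    while (n < 32u && ((uint32_t)(x << n) & 0x80000000u) == 0u)
        ++n;
    return n;
}

/* Checks that the guarded and reference forms agree on a few inputs,
   including the zero case that the select exists to handle. */
static void check_agreement(void) {
    static const uint32_t samples[] = {
        0u, 1u, 2u, 0x00010000u, 0x80000000u, 0xFFFFFFFFu
    };
    unsigned i;
    for (i = 0; i < sizeof(samples) / sizeof(samples[0]); ++i) {
        assert(tzcnt32_select(samples[i]) == tzcnt32_reference(samples[i]));
        assert(lzcnt32_select(samples[i]) == lzcnt32_reference(samples[i]));
    }
}
```

Calling `check_agreement()` exercises both forms on the same inputs. How either form is lowered at a given optimization level depends on the compiler and target flags, so the sketch makes no claim about the instructions emitted.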

Repository:
  rC Clang

https://reviews.llvm.org/D52392

Files:
  include/clang/Basic/BuiltinsX86.def
  include/clang/Basic/BuiltinsX86_64.def
  lib/CodeGen/CGBuiltin.cpp
  lib/Headers/bmiintrin.h
  lib/Headers/lzcntintrin.h
  test/CodeGen/bmi-builtins.c
  test/CodeGen/lzcnt-builtins.c

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D52392.166606.patch
Type: text/x-patch
Size: 8782 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20180922/99db1f18/attachment.bin>