[PATCH] [CodeGenPrepare] Teach when it is profitable to speculate calls to @llvm.cttz/ctlz.

Chandler Carruth chandlerc at google.com
Tue Dec 23 17:27:58 PST 2014


On Thu, Dec 18, 2014 at 12:42 PM, Andrea Di Biagio <
Andrea_DiBiagio at sn.scee.net> wrote:

> The constraints are:
> a) The 'then' basic block is taken only if the input operand to the
> cttz/ctlz is different from zero;
> b) The phi node propagates the size (in bits) of the value %val passed
> to the cttz/ctlz if %val is zero.
> c) The target says that it is "cheap" to speculate cttz/ctlz.
>
> If all these constraints are met, CodeGenPrepare can hoist the call to
> cttz/ctlz from the 'then' basic block into the 'entry' basic block. The new
> cttz/ctlz instruction will also have the 'undef on zero' flag set to
> 'false'.
>
> I added two new hooks in TargetLowering.h to let targets customize the
> behavior (i.e. decide whether it is cheap or not to speculate calls to
> cttz/ctlz). The two new methods are 'isCheapToSpeculateCtlz' and
> 'isCheapToSpeculateCttz'.
> By default, both methods return 'false', which means CodeGenPrepare
> doesn't try to speculate calls to cttz/ctlz unless the target says that it
> is profitable to do so.
>
> On X86, method 'isCheapToSpeculateCtlz' returns true only if the target
> has LZCNT. Method 'isCheapToSpeculateCttz' only returns true if the target
> has BMI.
> This may change in the future. For now, I avoided enabling the
> transformation for all x86-64 targets with feature CMOV because I am not
> 100% sure it is always a win to speculate bsf/bsr. So, I left a couple of
> TODO comments in the code.


So, I actually have several piles of code that rely heavily on BSF/BSR
lowering performance and I can comment on the effect of CMOV here.

CMOV, at least on a large number of x86 processors, will not actually allow
the BSF/BSR to be skipped. Essentially, both inputs to the CMOV have to
complete, so the cost of the BSF/BSR is always paid. We actually used to
lower the cttz and ctlz intrinsics (a *very* long time ago) using BSF/BSR
and a CMOV on x86, and performance improved dramatically just by switching
to a conditional branch.
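
To make the two shapes concrete, here is a rough C++ model of them. It is
only a sketch: bsfModel, cttzBranch and cttzSelect are made-up names, the
width is assumed to be 32 bits, and a C++ compiler is free to turn either
function into the other, so this shows the data dependencies rather than
the final machine code.

```cpp
#include <cstdint>

// Hypothetical stand-in for BSF. The result is only meaningful for
// non-zero input; returning 0 for zero is a placeholder so that this
// model stays well-defined C++.
inline uint32_t bsfModel(uint32_t V) {
  uint32_t N = 0;
  while (V != 0 && (V & 1u) == 0) {
    V >>= 1;
    ++N;
  }
  return N;
}

// Branch shape: the bit scan is skipped entirely when V is zero.
inline uint32_t cttzBranch(uint32_t V) {
  if (V == 0)
    return 32;
  return bsfModel(V);
}

// Select (CMOV-like) shape: the bit scan always runs, and only then is
// one of the two values chosen.
inline uint32_t cttzSelect(uint32_t V) {
  uint32_t Scanned = bsfModel(V);
  return V == 0 ? 32u : Scanned;
}
```

Both functions return the same value for every input; the only difference
is whether the scan sits on the path taken for a zero input.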

So you can probably nuke the TODO and just comment that we really don't
want to speculate this unless it ensures we can directly use LZCNT or
TZCNT...
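
As a sketch of that policy (not the code from the patch): the two hook
names below are the ones from Andrea's description, while SubtargetModel,
TargetLoweringModel and their fields are hypothetical stand-ins so the
example compiles on its own.

```cpp
// Hypothetical feature flags; not LLVM's real subtarget class.
struct SubtargetModel {
  bool HasLZCNT;
  bool HasBMI;
  bool HasCMOV;
};

// Hypothetical model of the target hooks described in the patch.
struct TargetLoweringModel {
  SubtargetModel ST;

  explicit TargetLoweringModel(SubtargetModel S) : ST(S) {}

  // Speculate ctlz only when it can be lowered directly to LZCNT.
  bool isCheapToSpeculateCtlz() const { return ST.HasLZCNT; }

  // Speculate cttz only when BMI (and therefore TZCNT) is available.
  // HasCMOV is deliberately ignored: CMOV alone does not make BSF/BSR
  // cheap to speculate.
  bool isCheapToSpeculateCttz() const { return ST.HasBMI; }
};
```

With this model, a target that has CMOV but neither LZCNT nor BMI answers
'false' to both queries and keeps the conditional branch.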