[PATCH] [SimplifyCFG] Teach when it is profitable to speculate calls to @llvm.cttz/ctlz.

Andrea Di Biagio Andrea_DiBiagio at sn.scee.net
Tue Dec 16 04:05:15 PST 2014


Hi majnemer, hfinkel, qcolombet,

Hi David, Hal, Quentin (and all),

If we know that the control flow is modelling an if-statement where the only instruction in 'then' basic block (excluding the terminator) is a call to cttz/ctlz, it may be beneficial to speculate the cttz/ctlz call and let SimplifyCFG convert the associated phi instruction in the 'end' basic block into a select.

Example:
;;
entry:
  %cmp = icmp eq i64 %Val, 0
  br i1 %cmp, label %end.bb, label %then.bb

then.bb:
  %c = tail call i64 @llvm.cttz.i64(i64 %val, i1 true)
  br label %EndBB

end.bb:
  %cond = phi i64 [ %c, %then.bb ], [ 64, %entry]
;;

The call to @llvm.cttz.i64 could be speculated. This would allow to fold the entire code sequence into:
  %cmp = icmp eq i64 %Val, 0
  %c = tail call i64 @llvm.cttz.i64(i64 %val, i1 false)
  %cond = select i1 %cmp, i64 64, i64 %c

The constraints are:
 a) The 'then' basic block is taken only if the input operand to the cttz/ctlz is different than zero;
 b) The phi node propagates the size-of (in bits) of the value %val in input to the cttz/ctlz if %val is zero.

If all these constraints are met, the optimizer can hoist the call to cttz/ctlz from the 'then' basic block into the 'entry' basic block. The phi instruction would then be replaced by a select statement.
The new cttz/ctlz instruction will also have the 'undef on zero' flag set to 'false'. This would allow the instruction combiner to further simplify the code by getting rid of the redundant select.

The IR from the example can be obtained from the following code:
///
unsigned long long foo(unsigned long long A) {
    return A ? __builtin_ctzll(A) : 64;
  }
///

On X86-64 targets with feature TZCNT, this patch would allow the backend to generate optimal assembly code for function 'foo':
  tzcntq %rdi, %rax
  retq

On X86-64 targets with no TZCNT, the call to cttz with the 'undef on zero' flag cleared would be expanded into a longer sequence involving a conditional move instruction:

  bsfq  %rdi, %rcx
  movl  $64, %eax
  cmovneq  %rc, %rax
  popq  %rbp
  retq

On X86 (not x86-64) targets with no LZCNT/TZCNT and no CMOV instructions, the backend would futher expand the conditional moves introducing machine basic blocks. Basically it would revert this optimization re-introducing the if-else structure.

My only questions are: is SimplifyCFG the correct place where to put this logic? If not, then where do you think I should put it?

Please, let me know what you think.

Thanks,
Andrea

http://reviews.llvm.org/D6679

Files:
  lib/Transforms/Utils/SimplifyCFG.cpp
  test/Transforms/SimplifyCFG/cttz-ctlz.ll

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D6679.17329.patch
Type: text/x-patch
Size: 11970 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141216/46de9f94/attachment.bin>


More information about the llvm-commits mailing list