[PATCH] D29088: Do not create ctlz/cttz(X, false) when the target does not support zero-defined ctlz/cttz.

Mehdi AMINI via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 26 08:35:43 PST 2017


mehdi_amini added a comment.

> Before CodeGenPrepare, we would end up with code like this:
> 
>   define i32 @_Z3fooi(i32 %n) local_unnamed_addr #0 {
>   entry:
>     %cmp = icmp eq i32 %n, -1
>     br i1 %cmp, label %cleanup, label %if.end
>   
>   if.end:                                           ; preds = %entry
>     %neg = xor i32 %n, -1
>     %0 = tail call i32 @llvm.ctlz.i32(i32 %neg, i1 false) #2   ;; <--- suboptimal flag!.
>     br label %cleanup
>   
>   cleanup:                                          ; preds = %entry, %if.end
>     %retval.0 = phi i32 [ %0, %if.end ], [ 32, %entry ]
>     ret i32 %retval.0
>   }

Can you clarify why the flag is suboptimal by itself? The intrinsic carries the same semantics as the unfolded sequence, doesn't it?
This looks to me like a missed optimization at this point: since %n is known not to be -1 in `if.end`, can't we just figure out that %neg can't be zero and turn the flag to true?
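
To make the argument concrete, here is a minimal C++ sketch of the quoted IR. It is an illustration only, not code from the patch: `foo` is a hand-written rendering of `@_Z3fooi`, and `ctlz_zero_defined` is a hypothetical helper standing in for `llvm.ctlz.i32(x, i1 false)`. It shows that on the `if.end` path the argument is never zero, so the zero-defined behaviour is never exercised there.

```cpp
#include <cstdint>

// Hypothetical helper modelling llvm.ctlz.i32(x, i1 false):
// the result is defined for x == 0 and equals the bit width (32).
constexpr std::uint32_t ctlz_zero_defined(std::uint32_t x) {
  std::uint32_t count = 0;
  for (std::uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0;
       mask >>= 1)
    ++count;
  return count;
}

// Hand-written rendering of @_Z3fooi from the quoted IR (an assumption
// about the original source, which is not shown in the thread).
constexpr std::uint32_t foo(std::int32_t n) {
  if (n == -1)
    return 32;  // edge %entry -> %cleanup, phi value 32
  // %neg = xor i32 %n, -1
  std::uint32_t neg = ~static_cast<std::uint32_t>(n);
  // n != -1 on this path, so neg != 0: the zero case is unreachable here.
  return ctlz_zero_defined(neg);
}

static_assert(foo(-1) == 32, "guarded case returns the bit width");
static_assert(foo(0) == 0, "~0 == 0xFFFFFFFF has no leading zeros");
static_assert(foo(-2) == 31, "~(-2) == 1 has 31 leading zeros");
```

Because the only input that would make `neg` zero is filtered out by the preceding branch, a pass that can see the dominating `icmp` could in principle set the flag to true without changing the result.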



================
Comment at: include/llvm/Analysis/TargetTransformInfo.h:198
+  /// \brief Return if the target has a way to compute cttz/ctlz that is
+  /// defnied when the argument is zero.
+  bool hasZeroDefinedCtlz() const;
----------------
`s/defnied/defined`.


================
Comment at: test/Transforms/InstCombine/select-cmp-cttz-ctlz.ll:3
+; RUN: opt -S -instcombine -mattr=+bmi < %s | FileCheck %s --check-prefix=TZDEF
+; RUN: opt -S -instcombine -mattr=+lzcnt < %s | FileCheck %s --check-prefix=LZDEF
+
----------------
A comment before the two new RUN lines would be nice to have.

Also, I suspect this test won't work without the X86 backend configured in.


https://reviews.llvm.org/D29088




