[PATCH] [InstCombine] Teach how to fold a select into a cttz/ctlz with the 'is_zero_undef' flag cleared.
Pete Cooper
peter_cooper at apple.com
Fri Jan 9 11:23:31 PST 2015
Hi Andrea
> On Jan 9, 2015, at 10:41 AM, Andrea Di Biagio <Andrea_DiBiagio at sn.scee.net> wrote:
>
> Hi Pete,
>
> In http://reviews.llvm.org/D6891#106740, @pete wrote:
>
>> Hi Andrea
>>
>> Is this intended to replace the work in CodeGenPrepare:r225274?
>
>
> No, it applies to a different scenario where the cttz/ctlz is always evaluated.
I see. Makes sense here.
>
> Something like:
>
> unsigned int foo(unsigned int x) {
> unsigned int count = __builtin_ctz(x);
> return x ? count : 32;
> }
>
> Here the count-trailing-zeros call is always evaluated before the conditional statement is reached (the conditional is then converted into a select).
>
> The logic added in CodeGenPrepare would work in a different scenario (see below):
>
> unsigned int bar(unsigned int x) {
> return x ? __builtin_ctz(x) : 32;
> }
>
> In this case, the builtin call is not always executed since it is not dominating the control flow (it is in the 'then' part). Depending on the target, it may or may not be beneficial to speculate that builtin call.
>
>> If you match this to a single intrinsic call in instcombine then it would be relatively simple for SimplifyCFG to then speculate it with existing code. Then we can remove the code from CGP?
>
>
> The problem with implementing that logic in SimplifyCFG is that we need to query the target to check whether calls to cttz/ctlz are cheap to speculate. Therefore, in code review http://reviews.llvm.org/D6679 the reviewers suggested moving that logic into CodeGenPrepare.
So I agree if you do more than speculate a single cttz. In that case, you’d be hoisting, say, 2 instructions out and flattening some control flow. However, if you are trying to flatten the specific pattern you gave:
define i64 @test1(i64 %A) {
; CHECK-LABEL: @test1(
; CHECK: [[CTLZ:%[A-Za-z0-9]+]] = call i64 @llvm.ctlz.i64(i64 %A, i1 false)
; CHECK-NEXT: select i1 %tobool, i64 64, i64 [[CTLZ]]
entry:
  %tobool = icmp eq i64 %A, 0
  br i1 %tobool, label %cond.end, label %cond.true

cond.true:                                        ; preds = %entry
  %0 = tail call i64 @llvm.ctlz.i64(i64 %A, i1 true)
  br label %cond.end

cond.end:                                         ; preds = %entry, %cond.true
  %cond = phi i64 [ %0, %cond.true ], [ 64, %entry ]
  ret i64 %cond
}
then in this case, you’d flatten (speculate) it in SimplifyCFG, turn it into the inverse intrinsic (the one with 'is_zero_undef' cleared, as in the CHECK lines), and then later in CodeGen you’d generate this control flow again.
I know that seems a bit strange to flatten some control flow only to reintroduce it later, but flattening it might enable some other useful optimizations.
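To make the equivalence behind that flattening concrete, here is a rough C sketch (not code from the patch). It assumes a GCC/Clang-style compiler that provides __builtin_clzll. The function names are hypothetical, and ctlz64_zero_defined is only a portable model of what a ctlz with 'is_zero_undef' cleared is expected to return, not the intrinsic itself:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branchy form, mirroring @test1: the builtin is only reached when a != 0,
   so its undefined result for a zero input is never observed.
   __builtin_clzll is a GCC/Clang builtin (assumption about the compiler). */
uint64_t ctlz64_branchy(uint64_t a) {
    if (a == 0)
        return 64;
    return (uint64_t)__builtin_clzll(a);
}

/* Hypothetical portable model of a ctlz with 'is_zero_undef' cleared:
   defined for every input, returning the bit width (64) for zero. */
uint64_t ctlz64_zero_defined(uint64_t a) {
    uint64_t count = 0;
    /* Walk from the most significant bit down until a set bit is found. */
    for (uint64_t mask = UINT64_C(1) << 63; mask != 0 && (a & mask) == 0; mask >>= 1)
        ++count;
    return count;
}

/* Spot-check that both forms agree, including on the zero input. */
void check_equivalence(void) {
    static const uint64_t samples[] = {
        0, 1, 2, 3, 0x80, UINT64_C(0x8000000000000000), UINT64_MAX
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(ctlz64_branchy(samples[i]) == ctlz64_zero_defined(samples[i]));
}
```

Since the two functions agree on every input, the branch and phi carry no extra information once the zero case is defined, which is why the flattened form is a legal replacement. Whether it is profitable is still a target question.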
Thanks,
Pete
>
> I hope this clears up any misunderstanding.
>
> -Andrea
>
>> Thanks,
>
>> Pete
>
>
>
>
>
> http://reviews.llvm.org/D6891