[PATCH] D94015: [LoopIdiom] Replace cttz loop by call to cttz intrinsic.

Dawid Jurczak via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 20 05:43:57 PDT 2022


yurai007 added a comment.

In D94015#3460499 <https://reviews.llvm.org/D94015#3460499>, @eiffel wrote:

> Hi.
>
> I updated my code to address the `const` nit.
>
> However, I am not sure about the output for cttz64 when run with `-O1`.
> Indeed, I get:
>
>   ; ./bin/opt -O1 -S < ../llvm-project/llvm/test/Transforms/LoopIdiom/cttz.ll
>   ; ModuleID = '<stdin>'
>   source_filename = "<stdin>"
>   
>   ; Function Attrs: nofree norecurse nosync nounwind readnone uwtable
>   define i32 @cttz32(i32 %x) local_unnamed_addr #0 {
>   entry:
>     %0 = call i32 @llvm.cttz.i32(i32 %x, i1 true), !range !0
>     ret i32 %0
>   }
>   
>   ; Function Attrs: nofree norecurse nosync nounwind readnone ssp uwtable
>   define i32 @cttz64(i64 %x) local_unnamed_addr #1 {
>   entry:
>     br label %land.rhs
>   
>   land.rhs:                                         ; preds = %while.body, %entry
>     %i.06 = phi i64 [ 0, %entry ], [ %inc, %while.body ]
>     %0 = shl nuw i64 1, %i.06
>     %1 = and i64 %0, %x
>     %cmp1 = icmp eq i64 %1, 0
>     br i1 %cmp1, label %while.body, label %while.end.split.loop.exit2
>   
>   while.body:                                       ; preds = %land.rhs
>     %inc = add nuw nsw i64 %i.06, 1
>     %cmp = icmp ult i64 %i.06, 63
>     br i1 %cmp, label %land.rhs, label %while.end
>   
>   while.end.split.loop.exit2:                       ; preds = %land.rhs
>     %extract.t1.le = trunc i64 %i.06 to i32
>     br label %while.end
>   
>   while.end:                                        ; preds = %while.body, %while.end.split.loop.exit2
>     %i.0.lcssa.off0 = phi i32 [ %extract.t1.le, %while.end.split.loop.exit2 ], [ 64, %while.body ]
>     ret i32 %i.0.lcssa.off0
>   }
>   
>   ; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn
>   declare i32 @llvm.cttz.i32(i32, i1 immarg) #2
>   
>   attributes #0 = { nofree norecurse nosync nounwind readnone uwtable }
>   attributes #1 = { nofree norecurse nosync nounwind readnone ssp uwtable }
>   attributes #2 = { nocallback nofree nosync nounwind readnone speculatable willreturn }
>   
>   !0 = !{i32 0, i32 33}
>
> While the result is correct for cttz32, the intrinsic is not present for cttz64.
> Note that when I run the following instead, I get output that is correct for both:
>
>   ; ./bin/opt -loop-idiom -loop-deletion -S < ../llvm-project/llvm/test/Transforms/LoopIdiom/cttz.ll          
>   ; ModuleID = '<stdin>'
>   source_filename = "<stdin>"
>   
>   ; Function Attrs: norecurse nounwind readnone uwtable
>   define i32 @cttz32(i32 %x) #0 {
>   entry:
>     br label %while.end
>   
>   while.end:                                        ; preds = %entry
>     %0 = call i32 @llvm.cttz.i32(i32 %x, i1 true)
>     ret i32 %0
>   }
>   
>   ; Function Attrs: nounwind readnone ssp uwtable
>   define i32 @cttz64(i64 %x) #1 {
>   entry:
>     br label %while.end
>   
>   while.end:                                        ; preds = %entry
>     %0 = call i64 @llvm.cttz.i64(i64 %x, i1 true)
>     %conv = trunc i64 %0 to i32
>     ret i32 %conv
>   }
>   
>   ; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn
>   declare i32 @llvm.cttz.i32(i32, i1 immarg) #2
>   
>   ; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn
>   declare i64 @llvm.cttz.i64(i64, i1 immarg) #2
>   
>   attributes #0 = { norecurse nounwind readnone uwtable }
>   attributes #1 = { nounwind readnone ssp uwtable }
>   attributes #2 = { nocallback nofree nosync nounwind readnone speculatable willreturn }
>
> So, could someone with more experience share their thoughts on this?
>
> Best regards and thank you in advance.

That's because -O1 runs a whole pipeline of transformations. Apparently, by the time cttz32 reaches LIR (LoopIdiomRecognize), its IR still has the expected shape and the pattern is recognized successfully.
However, that is not the case for cttz64: perhaps the IR was mutated by earlier passes, and as a consequence recognizeAndReplaceCTZ bails out on an unexpected pattern.
If you want to find out what is going on, the LLVM_DEBUG macro and the -debug/-print-changed options may be useful.
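
As a rough illustration of what "mutated by previous passes" could mean here, the C sketch below contrasts an assumed source-level form of the cttz64 loop with the rotated form that the quoted -O1 output corresponds to. Both function names and the source-level form are hypothetical reconstructions from that IR, not code taken from the patch or its test. The point is only that the two shapes compute the same result while looking different to a pattern matcher.

```c
#include <stdint.h>

/* Assumed source-level shape (hypothetical, reconstructed from the IR):
   the bound check and the bit test sit together in the loop condition. */
int cttz64_source(uint64_t x) {
    uint64_t i = 0;
    while (i < 64 && !(x & ((uint64_t)1 << i)))
        i++;
    return (int)i;
}

/* Shape matching the quoted -O1 output: the bit test runs first
   (land.rhs) and the bound check moves into the latch (while.body). */
int cttz64_rotated(uint64_t x) {
    uint64_t i = 0;
    for (;;) {
        if (x & ((uint64_t)1 << i))  /* land.rhs: and + icmp eq 0 */
            return (int)i;           /* while.end.split.loop.exit2: trunc */
        if (!(i < 63))               /* while.body: icmp ult i64 %i.06, 63 */
            return 64;               /* while.end: phi incoming value 64 */
        i++;                         /* while.body: %inc */
    }
}
```

Both functions return the index of the lowest set bit, or 64 when x is 0. Whether this rotated shape is what actually makes recognizeAndReplaceCTZ bail out would still need to be confirmed from the debug output.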


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94015/new/

https://reviews.llvm.org/D94015


