[PATCH] D94015: [LoopIdiom] Replace cttz loop by call to cttz intrinsic.
Dawid Jurczak via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 20 05:43:57 PDT 2022
yurai007 added a comment.
In D94015#3460499 <https://reviews.llvm.org/D94015#3460499>, @eiffel wrote:
> Hi.
>
> I updated my code to address the `const` nit.
>
> However, I am not really sure about the output for cttz64 when run with `-O1`.
> Indeed, I get:
>
> ; ./bin/opt -O1 -S < ../llvm-project/llvm/test/Transforms/LoopIdiom/cttz.ll
> ; ModuleID = '<stdin>'
> source_filename = "<stdin>"
>
> ; Function Attrs: nofree norecurse nosync nounwind readnone uwtable
> define i32 @cttz32(i32 %x) local_unnamed_addr #0 {
> entry:
> %0 = call i32 @llvm.cttz.i32(i32 %x, i1 true), !range !0
> ret i32 %0
> }
>
> ; Function Attrs: nofree norecurse nosync nounwind readnone ssp uwtable
> define i32 @cttz64(i64 %x) local_unnamed_addr #1 {
> entry:
> br label %land.rhs
>
> land.rhs: ; preds = %while.body, %entry
> %i.06 = phi i64 [ 0, %entry ], [ %inc, %while.body ]
> %0 = shl nuw i64 1, %i.06
> %1 = and i64 %0, %x
> %cmp1 = icmp eq i64 %1, 0
> br i1 %cmp1, label %while.body, label %while.end.split.loop.exit2
>
> while.body: ; preds = %land.rhs
> %inc = add nuw nsw i64 %i.06, 1
> %cmp = icmp ult i64 %i.06, 63
> br i1 %cmp, label %land.rhs, label %while.end
>
> while.end.split.loop.exit2: ; preds = %land.rhs
> %extract.t1.le = trunc i64 %i.06 to i32
> br label %while.end
>
> while.end: ; preds = %while.body, %while.end.split.loop.exit2
> %i.0.lcssa.off0 = phi i32 [ %extract.t1.le, %while.end.split.loop.exit2 ], [ 64, %while.body ]
> ret i32 %i.0.lcssa.off0
> }
>
> ; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn
> declare i32 @llvm.cttz.i32(i32, i1 immarg) #2
>
> attributes #0 = { nofree norecurse nosync nounwind readnone uwtable }
> attributes #1 = { nofree norecurse nosync nounwind readnone ssp uwtable }
> attributes #2 = { nocallback nofree nosync nounwind readnone speculatable willreturn }
>
> !0 = !{i32 0, i32 33}
>
> While the result is correct for cttz32, the intrinsic is not present for cttz64.
> Note that when I run the following instead, I get output which is correct for both:
>
> ; ./bin/opt -loop-idiom -loop-deletion -S < ../llvm-project/llvm/test/Transforms/LoopIdiom/cttz.ll
> ; ModuleID = '<stdin>'
> source_filename = "<stdin>"
>
> ; Function Attrs: norecurse nounwind readnone uwtable
> define i32 @cttz32(i32 %x) #0 {
> entry:
> br label %while.end
>
> while.end: ; preds = %entry
> %0 = call i32 @llvm.cttz.i32(i32 %x, i1 true)
> ret i32 %0
> }
>
> ; Function Attrs: nounwind readnone ssp uwtable
> define i32 @cttz64(i64 %x) #1 {
> entry:
> br label %while.end
>
> while.end: ; preds = %entry
> %0 = call i64 @llvm.cttz.i64(i64 %x, i1 true)
> %conv = trunc i64 %0 to i32
> ret i32 %conv
> }
>
> ; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn
> declare i32 @llvm.cttz.i32(i32, i1 immarg) #2
>
> ; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn
> declare i64 @llvm.cttz.i64(i64, i1 immarg) #2
>
> attributes #0 = { norecurse nounwind readnone uwtable }
> attributes #1 = { nounwind readnone ssp uwtable }
> attributes #2 = { nocallback nofree nosync nounwind readnone speculatable willreturn }
>
> So, can someone with more experience share their thoughts on this?
>
> Best regards and thank you in advance.
That's because -O1 runs a whole pipeline of transformations. Apparently, by the time cttz32 reaches LIR (LoopIdiomRecognize), its IR still has the expected shape, so the pattern is recognized successfully.
However, that's not the case for cttz64: perhaps the IR was mutated by earlier passes and, as a consequence, recognizeAndReplaceCTZ bails out on an unexpected pattern.
If you want to find out what's going on, the LLVM_DEBUG macro and the -debug/-print-changed options may be useful for debugging.
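
For reference, here is a rough C sketch of the kind of loop that the un-transformed cttz64 IR above corresponds to, next to the builtin it would ideally collapse into. This is reconstructed from the quoted IR and is an assumption, not the actual source behind cttz.ll; the names cttz64_loop and cttz64_builtin are hypothetical, and __builtin_ctzll is a GCC/Clang builtin.

```c
#include <stdint.h>

/* Hypothetical reconstruction of the 64-bit loop (not taken from cttz.ll):
 * probe bit i until a set bit is found or all 64 bits are exhausted.
 * The i < 64 check runs first, so the shift amount never reaches 64. */
int cttz64_loop(uint64_t x) {
    uint64_t i = 0;
    while (i < 64 && (x & (UINT64_C(1) << i)) == 0)
        i++;
    return (int)i; /* 64 when x == 0, matching the [ 64, %while.body ] phi input */
}

/* What the idiom would ideally become. __builtin_ctzll is undefined for 0,
 * so the zero case is handled explicitly in this sketch. */
int cttz64_builtin(uint64_t x) {
    return x == 0 ? 64 : __builtin_ctzll(x);
}
```

Comparing the IR of such a loop before and after each pass (for example with -print-changed) should show which earlier transformation rewrites it into a form that recognizeAndReplaceCTZ no longer matches.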
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D94015/new/
https://reviews.llvm.org/D94015