[PATCH] D102116: [LoopIdiom] 'logical right-shift until zero' ('count active bits') "on steroids" idiom recognition.

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 12 14:54:12 PDT 2021


lebedev.ri marked an inline comment as done.
lebedev.ri added inline comments.


================
Comment at: llvm/test/Transforms/LoopIdiom/X86/logical-right-shift-until-zero.ll:9-65
 ; Most basic pattern; Note that iff the shift amount is offset, said offsetting
 ; must not cause an overflow, but `add nsw` is fine.
 define i8 @p0(i8 %val, i8 %start, i8 %extraoffset) {
 ; CHECK-LABEL: @p0(
 ; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[VAL_NUMLEADINGZEROS:%.*]] = call i8 @llvm.ctlz.i8(i8 [[VAL:%.*]], i1 false)
+; CHECK-NEXT:    [[VAL_NUMACTIVEBITS:%.*]] = sub nuw nsw i8 8, [[VAL_NUMLEADINGZEROS]]
----------------
lebedev.ri wrote:
> zhuhan0 wrote:
> > lebedev.ri wrote:
> > > zhuhan0 wrote:
> > > > I could be wrong but would this mis-compile if %nbits results in unsigned overflow? For example,
> > > > ```
> > > > %val = 0x10000000
> > > > %start = 1
> > > > %extraoffset.= 255
> > > > ```
> > > > Loop trip count is 8 before transformation but 1 after.
> > > Could you please specify, for which bitwidth your counterexample is?
> > > I'm going to guess i32, so we have
> > > ```
> > > %iv = i32 1
> > > %nbits = i32 256
> > > %val.shifted = lshr i32 %val, 256
> > > ```
> > > We then navigate to https://llvm.org/docs/LangRef.html#lshr-instruction
> > > > If op2 is (statically or dynamically) equal to or larger than the number of bits in op1, this instruction returns a poison value.
> > > 
> > > So i'm not seeing a miscompile.
> > > 
> > > As i have said, i've verified each of the tests here with alive2, and they are all fine.
> > > 
> > 8 bits.
> > ```
> > %start = i8 1
> > %extraoffset = i8 255  ; unsigned
> > ```
> > Sorry I'm not familiar with alive2 so forgive me if I'm questioning something that's obviously proven to be correct.
> Then i do not understand what `%val = 0x10000000` means.
> What's the decimal value of `%val`?
Ah, you mean `%val = 128`.
So we have
```
$ cat /tmp/test.ll 
declare void @escape_inner(i8, i8, i8, i1, i8)
declare void @escape_outer(i8, i8, i8, i1, i8)

define i8 @p0() {
entry:
  br label %loop

loop:
  %iv = phi i8 [ 1, %entry ], [ %iv.next, %loop ]
  %nbits = add nsw i8 %iv, 255
  %val.shifted = lshr i8 128, %nbits
  %val.shifted.iszero = icmp eq i8 %val.shifted, 0
  %iv.next = add i8 %iv, 1

  call void @escape_inner(i8 %iv, i8 %nbits, i8 %val.shifted, i1 %val.shifted.iszero, i8 %iv.next)

  br i1 %val.shifted.iszero, label %end, label %loop

end:
  %iv.res = phi i8 [ %iv, %loop ]
  %nbits.res = phi i8 [ %nbits, %loop ]
  %val.shifted.res = phi i8 [ %val.shifted, %loop ]
  %val.shifted.iszero.res = phi i1 [ %val.shifted.iszero, %loop ]
  %iv.next.res = phi i8 [ %iv.next, %loop ]

  call void @escape_outer(i8 %iv.res, i8 %nbits.res, i8 %val.shifted.res, i1 %val.shifted.iszero.res, i8 %iv.next.res)

  ret i8 %iv.res
}
$ ./bin/opt -loop-idiom -mtriple=x86_64 -mcpu=core-avx2 -o - -S /tmp/test.ll | ./bin/opt -O3 -S -o - -
; ModuleID = '<stdin>'
source_filename = "/tmp/test.ll"
target triple = "x86_64"

declare void @escape_inner(i8, i8, i8, i1, i8) local_unnamed_addr #0

declare void @escape_outer(i8, i8, i8, i1, i8) local_unnamed_addr #0

define i8 @p0() local_unnamed_addr #0 {
entry:
  tail call void @escape_inner(i8 1, i8 0, i8 -128, i1 false, i8 2)
  tail call void @escape_inner(i8 2, i8 1, i8 64, i1 false, i8 3)
  tail call void @escape_inner(i8 3, i8 2, i8 32, i1 false, i8 4)
  tail call void @escape_inner(i8 4, i8 3, i8 16, i1 false, i8 5)
  tail call void @escape_inner(i8 5, i8 4, i8 8, i1 false, i8 6)
  tail call void @escape_inner(i8 6, i8 5, i8 4, i1 false, i8 7)
  tail call void @escape_inner(i8 7, i8 6, i8 2, i1 false, i8 8)
  tail call void @escape_inner(i8 8, i8 7, i8 1, i1 false, i8 9)
  tail call void @escape_inner(i8 9, i8 8, i8 undef, i1 true, i8 10)
  tail call void @escape_outer(i8 9, i8 8, i8 undef, i1 true, i8 10)
  ret i8 9
}

attributes #0 = { "target-cpu"="core-avx2" }

```

`@escape_inner()` is called 9 times, not 1.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102116/new/

https://reviews.llvm.org/D102116



More information about the llvm-commits mailing list