[llvm-bugs] [Bug 39771] New: Funnel shift simplification based on demanded bits fails on multiple uses

via llvm-bugs llvm-bugs at lists.llvm.org
Sat Nov 24 11:12:44 PST 2018


https://bugs.llvm.org/show_bug.cgi?id=39771

            Bug ID: 39771
           Summary: Funnel shift simplification based on demanded bits
                    fails on multiple uses
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Windows NT
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Scalar Optimizations
          Assignee: unassignedbugs at nondot.org
          Reporter: nikita.ppv at gmail.com
                CC: llvm-bugs at lists.llvm.org

https://reviews.llvm.org/D54869 added InstCombine simplification for funnel
shifts based on demanded bits. However, the original motivating case from Rust
(https://github.com/rust-lang/rust/issues/56009) is not solved by this. A
reduced test case for this issue is given in the following and also present in
test/Transforms/InstCombine/fsh.ll:

declare i33 @llvm.fshr.i33(i33, i33, i33)

define i33 @fshr_multi_use(i33 %a) {
  %b = tail call i33 @llvm.fshr.i33(i33 %a, i33 %a, i33 1)
  %c = lshr i33 %b, 23
  %d = xor i33 %c, %b
  %e = and i33 %d, 31
  ret i33 %e
}

The expected simplification is:

define i33 @fshr_multi_use(i33 %a) {
  %b = lshr i33 %a, 1 
  %c = lshr i33 %a, 24
  %d = xor i33 %c, %b
  %e = and i33 %d, 31
  ret i33 %e
}

If the funnel shift is written explicitly as shifts+or, as in the following,
the simplification does take place:

define i33 @expanded_fshr_multi_use(i33 %a) {
  %tmp = lshr i33 %a, 1
  %tmp2 = shl i33 %a, 32
  %b = or i33 %tmp, %tmp2
  %c = lshr i33 %b, 23
  %d = xor i33 %c, %b
  %e = and i33 %d, 31
  ret i33 %e
}

InstCombine demanded bits simplifications generally only work if the
instruction has a single use. Only for or/and/xor multiple uses are handled, in
which case the operation may be reduced to one of its operands in some cases
(which happens here).

An alternative place where such a simplification could take place is BDCE,
which is based on the DemandedBits analysis. This will successfully handle
multiple uses, but exhibits a different issue: Demanded bits are only tracked
on the instruction level. If funnel shifts are used for rotates, then the first
two operands will be the same, and the demanded bits for both operands will be
combined, preventing this kind of simplification.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-bugs/attachments/20181124/57e84add/attachment.html>


More information about the llvm-bugs mailing list