[llvm-bugs] [Bug 48975] New: Missed optimization: nonnull assumptions cannot reason over select

Sun Jan 31 10:03:28 PST 2021

https://bugs.llvm.org/show_bug.cgi?id=48975

            Bug ID: 48975
           Summary: Missed optimization: nonnull assumptions cannot reason
                    over select
           Product: new-bugs
           Version: trunk
          Hardware: All
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: new bugs
          Assignee: unassignedbugs at nondot.org
          Reporter: miguel.ojeda.sandonis at gmail.com
                CC: htmldeveloper at gmail.com, llvm-bugs at lists.llvm.org

LLVM fails to optimize:

    define nonnull i32* @f(i32** %0) {
    start:
        %1 = load i32*, i32** %0, align 8
        %2 = icmp eq i32* %1, null
        %3 = bitcast i32** %0 to i32*
        %4 = select i1 %2, i32* null, i32* %3
        ; %assume = icmp ne i32* %4, null
        ; call void @llvm.assume(i1 %assume)
        ret i32* %4
    }

into:

    define nonnull i32* @f(i32** %0) {
    start:
        %1 = bitcast i32** %0 to i32*
        ret i32* %1
    }

which Alive2 confirms as valid (i.e. return type is nonnull => %4 is nonnull =>
the second branch of select was taken => %4 is %3).
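
As an informal illustration of that argument (not a substitute for the Alive2
check), the select can be modelled in Rust over plain addresses. The names
src, tgt and agree_when_nonnull are hypothetical and chosen only for this
sketch; `loaded` stands for %1 and `slot` for %3, and inputs for which the
source would return null are skipped because they violate the nonnull return
attribute:

```rust
/// Models `%4 = select i1 %2, i32* null, i32* %3` using plain addresses.
/// `loaded` stands for %1 and `slot` for %3 (the address of the argument).
fn src(loaded: usize, slot: usize) -> usize {
    let is_null = loaded == 0; // %2 = icmp eq i32* %1, null
    if is_null { 0 } else { slot } // %4
}

/// Models the desired output, which returns %3 unconditionally.
fn tgt(slot: usize) -> usize {
    slot
}

/// Returns true if `src` and `tgt` agree on every sampled input for which
/// `src` honours the `nonnull` return attribute.
fn agree_when_nonnull(loaded_samples: &[usize], slot_samples: &[usize]) -> bool {
    loaded_samples.iter().all(|&loaded| {
        slot_samples.iter().all(|&slot| {
            let r = src(loaded, slot);
            // A null result would violate `nonnull`, so that case is excluded.
            r == 0 || r == tgt(slot)
        })
    })
}
```

Whenever the modelled result is nonnull it is exactly `slot`, which is the
equality the desired output relies on.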

Uncommenting the explicit assume (instead of relying on the nonnull return
attribute) doesn't help either. However, either assuming the negation of %2
directly or putting a !nonnull on the load works:

    define i32* @src(i32** %0) {
    start:
        %1 = load i32*, i32** %0, align 8, !nonnull !{}
        %2 = icmp eq i32* %1, null
        ; %assume = xor i1 %2, true
        ; call void @llvm.assume(i1 %assume)
        %3 = bitcast i32** %0 to i32*
        %4 = select i1 %2, i32* null, i32* %3
        ret i32* %4
    }

Thus it would seem that LLVM does not realize that if %4 is nonnull, then the
select must have taken its false arm, which implies that %2 is false (i.e. %1
is nonnull).

This is a reduction from Rust code such as:

    #![feature(option_result_unwrap_unchecked)]
    pub struct V(Vec<u8>);

    pub unsafe fn f(x: &mut Option<V>) -> &mut V {
        x.as_mut().unwrap_unchecked()
    }

There, unwrap_unchecked() provides the nonnull assumption, while as_mut()
generates the select. However, if one manually expands the Rust code, LLVM
does find the optimization:

    pub struct V(Vec<u8>);

    pub unsafe fn f(x: &mut Option<V>) -> &mut V {
        let y = match x {
            Some(ref mut v) => Some(v),
            None => None,
        };

        match y {
            Some(v) => v,
            None => core::hint::unreachable_unchecked(),
        }
    }

because this ends up at:

    *** IR Dump After SROA ***
    ; Function Attrs: norecurse nounwind nonlazybind readonly uwtable
    define align 8 dereferenceable(24) %X* @_ZN7example1f17h543f6e99cb8f036dE(%"std::option::Option<X>"* readonly align 8 dereferenceable(24) %0) unnamed_addr #0 !dbg !6 {
      %2 = bitcast %"std::option::Option<X>"* %0 to {}**, !dbg !10
      %3 = load {}*, {}** %2, align 8, !dbg !10
      %4 = icmp eq {}* %3, null, !dbg !10
      %5 = getelementptr inbounds %"std::option::Option<X>", %"std::option::Option<X>"* %0, i64 0, i32 0, i64 0, !dbg !10
      %6 = select i1 %4, i64* null, i64* %5, !dbg !10
      %7 = icmp eq i64* %6, null, !dbg !11
      br i1 %7, label %8, label %9, !dbg !11

    8:                                                ; preds = %1
      unreachable, !dbg !12

    9:                                                ; preds = %1
      %10 = bitcast i64* %6 to %X*, !dbg !13
      ret %X* %10, !dbg !14
    }

    *** IR Dump After Early CSE w/ MemorySSA ***
    ; Function Attrs: norecurse nounwind nonlazybind readonly uwtable
    define align 8 dereferenceable(24) %X* @_ZN7example1f17h543f6e99cb8f036dE(%"std::option::Option<X>"* readonly align 8 dereferenceable(24) %0) unnamed_addr #0 !dbg !6 {
      %2 = bitcast %"std::option::Option<X>"* %0 to {}**, !dbg !10
      %3 = load {}*, {}** %2, align 8, !dbg !10
      %4 = icmp eq {}* %3, null, !dbg !10
      %5 = getelementptr inbounds %"std::option::Option<X>", %"std::option::Option<X>"* %0, i64 0, i32 0, i64 0, !dbg !10
      %6 = select i1 %4, i64* null, i64* %5, !dbg !10
      br i1 %4, label %7, label %8, !dbg !11

    7:                                                ; preds = %1
      unreachable, !dbg !12

    8:                                                ; preds = %1
      %9 = bitcast i64* %6 to %X*, !dbg !13
      ret %X* %9, !dbg !14
    }

i.e. the branch on %7 (after the select) is transformed into a branch on %4
(before the select).

However, when Rust code uses unwrap_unchecked() and similar functions, LLVM
optimizes those functions first and only then inlines them. By then the
branches are no longer there (similar to what is shown in the reduction), so
the optimization above does not take place. Since LLVM cannot reason over the
select, suboptimal code is generated.

In Rust, a possible workaround is to use macros instead of functions for the
helpers that contain unreachable_unchecked() hints.
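
A minimal sketch of what such a workaround might look like follows. The
unwrap_unchecked! macro is hypothetical (it is not part of the standard
library); it simply pastes the match from the manually expanded version into
the caller, so that the branch and the unreachable hint are visible to LLVM in
the same function as the select produced by as_mut():

```rust
/// Hypothetical macro for illustration only: expands to the match from the
/// manually expanded version instead of calling a function.
/// Must be used in an unsafe context; the caller guarantees the value is Some.
macro_rules! unwrap_unchecked {
    ($opt:expr) => {
        match $opt {
            Some(v) => v,
            // Undefined behaviour if reached: the caller promised Some.
            None => ::core::hint::unreachable_unchecked(),
        }
    };
}

pub struct V(Vec<u8>);

/// Same signature as `f` in the report, written with the macro.
///
/// # Safety
/// `x` must be `Some`.
pub unsafe fn f(x: &mut Option<V>) -> &mut V {
    unsafe { unwrap_unchecked!(x.as_mut()) }
}
```

Whether this actually yields the better code depends on the compiler and LLVM
versions in use, so the generated IR should be inspected rather than assumed.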
