<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - Missed optimization: nonnull assumptions cannot reason over select"
href="https://bugs.llvm.org/show_bug.cgi?id=48975">48975</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Missed optimization: nonnull assumptions cannot reason over select
</td>
</tr>
<tr>
<th>Product</th>
<td>new-bugs
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>All
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>new bugs
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>miguel.ojeda.sandonis@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>htmldeveloper@gmail.com, llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>LLVM fails to optimize:
define nonnull i32* @f(i32** %0) {
start:
  %1 = load i32*, i32** %0, align 8
  %2 = icmp eq i32* %1, null
  %3 = bitcast i32** %0 to i32*
  %4 = select i1 %2, i32* null, i32* %3
  ; %assume = icmp ne i32* %4, null
  ; call void @llvm.assume(i1 %assume)
  ret i32* %4
}
into:
define nonnull i32* @f(i32** %0) {
start:
  %1 = bitcast i32** %0 to i32*
  ret i32* %1
}
which Alive2 confirms as valid (i.e. return type is nonnull => %4 is nonnull =>
the second branch of select was taken => %4 is %3).
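
The refinement argument in the parenthetical above can be checked on a tiny
model. The following Rust sketch is illustrative only and is not part of the
reduction: addresses are modelled as plain integers with 0 standing for null,
and the names original, optimized and is_refinement are hypothetical. It
assumes that a null result under a nonnull return attribute is poison, which
any other value may refine, so only non-null results need to match.

```rust
/// Models the unoptimized IR. `slot` plays the role of %0 and `loaded`
/// the value read through it (%1). Address 0 stands for null.
pub fn original(slot: usize, loaded: usize) -> usize {
    let is_null = loaded == 0; // %2 = icmp eq i32* %1, null
    if is_null { 0 } else { slot } // %4 = select i1 %2, i32* null, i32* %3
}

/// Models the desired output: return %0 (after the bitcast) unconditionally.
pub fn optimized(slot: usize, _loaded: usize) -> usize {
    slot
}

/// Assumption of this sketch: a null result of `original` violates the
/// nonnull return attribute, so any result of `optimized` is acceptable
/// there. Otherwise both results must be equal.
pub fn is_refinement(slot: usize, loaded: usize) -> bool {
    let r = original(slot, loaded);
    r == 0 || r == optimized(slot, loaded)
}
```
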
Uncommenting the explicit assume (instead of relying on the nonnull return
attribute) doesn't help either. However, either assuming the negation of %2
directly or attaching !nonnull to the load works:
define i32* @src(i32** %0) {
start:
  %1 = load i32*, i32** %0, align 8, !nonnull !{}
  %2 = icmp eq i32* %1, null
  ; %assume = xor i1 %2, true
  ; call void @llvm.assume(i1 %assume)
  %3 = bitcast i32** %0 to i32*
  %4 = select i1 %2, i32* null, i32* %3
  ret i32* %4
}
Thus LLVM does not seem to realize that if %4 is nonnull, the select must have
produced its false operand, which implies %2 is false (i.e. %1 is nonnull).
This is a reduction from Rust code such as:
#![feature(option_result_unwrap_unchecked)]
pub struct V(Vec<u8>);
pub unsafe fn f(x: &mut Option<V>) -> &mut V {
    x.as_mut().unwrap_unchecked()
}
There, the unwrap_unchecked() provides the nonnull assumption, while as_mut()
generates the select. However, if one manually expands the Rust code, LLVM
does find the optimization:
pub struct V(Vec<u8>);
pub unsafe fn f(x: &mut Option<V>) -> &mut V {
    let y = match x {
        Some(ref mut v) => Some(v),
        None => None,
    };
    match y {
        Some(v) => v,
        None => core::hint::unreachable_unchecked(),
    }
}
because this ends up at:
*** IR Dump After SROA ***
; Function Attrs: norecurse nounwind nonlazybind readonly uwtable
define align 8 dereferenceable(24) %X* @_ZN7example1f17h543f6e99cb8f036dE(%"std::option::Option<X>"* readonly align 8 dereferenceable(24) %0) unnamed_addr #0 !dbg !6 {
  %2 = bitcast %"std::option::Option<X>"* %0 to {}**, !dbg !10
  %3 = load {}*, {}** %2, align 8, !dbg !10
  %4 = icmp eq {}* %3, null, !dbg !10
  %5 = getelementptr inbounds %"std::option::Option<X>", %"std::option::Option<X>"* %0, i64 0, i32 0, i64 0, !dbg !10
  %6 = select i1 %4, i64* null, i64* %5, !dbg !10
  %7 = icmp eq i64* %6, null, !dbg !11
  br i1 %7, label %8, label %9, !dbg !11

8:                                                ; preds = %1
  unreachable, !dbg !12

9:                                                ; preds = %1
  %10 = bitcast i64* %6 to %X*, !dbg !13
  ret %X* %10, !dbg !14
}
*** IR Dump After Early CSE w/ MemorySSA ***
; Function Attrs: norecurse nounwind nonlazybind readonly uwtable
define align 8 dereferenceable(24) %X* @_ZN7example1f17h543f6e99cb8f036dE(%"std::option::Option<X>"* readonly align 8 dereferenceable(24) %0) unnamed_addr #0 !dbg !6 {
  %2 = bitcast %"std::option::Option<X>"* %0 to {}**, !dbg !10
  %3 = load {}*, {}** %2, align 8, !dbg !10
  %4 = icmp eq {}* %3, null, !dbg !10
  %5 = getelementptr inbounds %"std::option::Option<X>", %"std::option::Option<X>"* %0, i64 0, i32 0, i64 0, !dbg !10
  %6 = select i1 %4, i64* null, i64* %5, !dbg !10
  br i1 %4, label %7, label %8, !dbg !11

7:                                                ; preds = %1
  unreachable, !dbg !12

8:                                                ; preds = %1
  %9 = bitcast i64* %6 to %X*, !dbg !13
  ret %X* %9, !dbg !14
}
i.e. the branch on %7 (after the select) is transformed into a branch on %4
(before the select).
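
Why that rewrite is sound can be shown with a small model. This Rust sketch is
illustrative only: the function names are hypothetical, addresses are modelled
as integers with 0 standing for null, and it assumes that %5 is known to be
non-null (it is an inbounds getelementptr on a dereferenceable argument), which
is encoded here with NonZeroUsize.

```rust
use std::num::NonZeroUsize;

/// Models `%6 = select i1 %4, i64* null, i64* %5` followed by
/// `%7 = icmp eq i64* %6, null`. `field` stands for %5.
pub fn compare_after_select(cond: bool, field: NonZeroUsize) -> bool {
    let selected = if cond { 0 } else { field.get() };
    selected == 0
}

/// Models branching directly on %4, the condition that feeds the select.
pub fn compare_before_select(cond: bool, _field: NonZeroUsize) -> bool {
    cond
}
```

Because the false operand can never be null, the comparison after the select
is true exactly when the select condition is true, so both functions agree for
every input.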
However, when Rust code uses unwrap_unchecked() and similar functions, LLVM
optimizes those functions first and only then inlines them. By that point the
branches are gone (as in the reduction above), so the transformation shown
above does not take place. Since LLVM cannot reason through the select,
suboptimal code is generated.
In Rust, a possible workaround is to use macros instead of functions for
helpers that contain unreachable() hints.</pre>
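<pre>A minimal sketch of the macro workaround, assuming a hypothetical
unwrap_unchecked! macro (it is not part of the standard library). Because a
macro is expanded in the caller, the match and its unreachable hint are still
present as branches when the caller is optimized:

```rust
/// Hypothetical macro, named here for illustration only. It expands to a
/// match in the caller instead of calling a separately optimized function.
macro_rules! unwrap_unchecked {
    ($opt:expr) => {
        match $opt {
            Some(v) => v,
            // Safety: the caller promises that the option is `Some`.
            None => unsafe { ::core::hint::unreachable_unchecked() },
        }
    };
}

pub struct V(Vec<u8>);

/// Same signature as `f` in the report; calling it with `None` is
/// undefined behaviour.
pub unsafe fn f(x: &mut Option<V>) -> &mut V {
    unwrap_unchecked!(x.as_mut())
}
```

Whether this produces the optimized code depends on the rustc and LLVM versions
in use; the sketch only shows the shape of the workaround.</pre>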
</div>
</p>
</body>
</html>