[llvm-bugs] [Bug 41138] New: missed opt: target-feature propagation
via llvm-bugs
llvm-bugs at lists.llvm.org
Tue Mar 19 05:51:43 PDT 2019
https://bugs.llvm.org/show_bug.cgi?id=41138
Bug ID: 41138
Summary: missed opt: target-feature propagation
Product: libraries
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: enhancement
Priority: P
Component: Scalar Optimizations
Assignee: unassignedbugs at nondot.org
Reporter: gonzalobg88 at gmail.com
CC: llvm-bugs at lists.llvm.org
Minimal working example - this Rust code:
extern "C" {
#[target_feature(enable = "avx2")] pub fn foo();
}
pub unsafe fn bar() { foo() }
generates the following LLVM-IR (https://rust.godbolt.org/z/fzcraS):
define void @bar() unnamed_addr #0 {
tail call void @foo()
ret void
}
declare void @foo() unnamed_addr #1
attributes #0 = { nounwind nonlazybind uwtable
"probe-stack"="__rust_probestack" "target-cpu"="x86-64" }
attributes #1 = { nounwind nonlazybind uwtable
"probe-stack"="__rust_probestack" "target-cpu"="x86-64"
"target-features"="+avx2" }
which `opt` does not optimize further (https://rust.godbolt.org/z/XgMyCJ).
Note that `foo` carries "target-features"="+avx2", but this attribute is not
propagated to `bar`. This can significantly affect code generation and other
optimizations (e.g. if `bar` contained loops, they could be compiled to AVX2
instructions).
Propagating `avx2` to `bar` in this case is sound, because if `foo` is called
on a platform without `avx2` support, the behavior is undefined. That is, we
can assume that `foo` will only be called on platforms that support `avx2`.
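For comparison, the effect of the requested propagation can be approximated by hand today by annotating the caller. The following is only a sketch under stated assumptions: `foo` is defined locally as an empty stand-in so that the example links, the loop in `bar` is an invented body, and `triple_all` is a hypothetical safe wrapper that none of the original report contains.

```rust
// Hypothetical stand-in for the external `foo` from the report; defined
// locally (and empty) so that the sketch links without an external library.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn foo() {}

// `bar` annotated by hand with the feature that propagation would infer.
// The loop is an invented example body: with `avx2` enabled on the function,
// the backend is allowed (not required) to use AVX2 instructions for it.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn bar(xs: &mut [i32]) {
    for x in xs.iter_mut() {
        *x = x.wrapping_mul(3);
    }
    unsafe { foo() }
}

// Hypothetical safe entry point: calls `bar` only when AVX2 is detected at
// runtime, and otherwise runs the same loop without the feature enabled.
pub fn triple_all(xs: &mut [i32]) {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 availability was checked just above.
            unsafe { bar(xs) };
            return;
        }
    }
    for x in xs.iter_mut() {
        *x = x.wrapping_mul(3);
    }
}
```

Calling `triple_all` on `[1, 2, 3]` leaves `[3, 6, 9]` in the slice on either path. The manual annotation is what the proposed optimization would make unnecessary.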
In general, if a function is unconditionally called, its target-features can
be propagated to the caller. If a function is only conditionally called, a more
complex analysis is required. Consider, for example, this code
extern "C" {
#[target_feature(enable = "avx2")] pub fn foo();
#[target_feature(enable = "avx2")] pub fn baz();
}
pub unsafe fn bar(x: i32) {
if x == 0 { foo() } else { baz() }
}
which produces this LLVM-IR (https://rust.godbolt.org/z/XT4Hpo):
define void @bar(i32 %x) unnamed_addr #0 {
start:
%0 = icmp eq i32 %x, 0
br i1 %0, label %bb1, label %bb2
bb1: ; preds = %start
tail call void @foo()
br label %bb3
bb2: ; preds = %start
tail call void @baz()
br label %bb3
bb3: ; preds = %bb2, %bb1
ret void
}
declare void @foo() unnamed_addr #1
declare void @baz() unnamed_addr #1
attributes #0 = { nounwind nonlazybind uwtable
"probe-stack"="__rust_probestack" "target-cpu"="x86-64" }
attributes #1 = { nounwind nonlazybind uwtable
"probe-stack"="__rust_probestack" "target-cpu"="x86-64"
"target-features"="+avx2" }
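The optimization is also sound here: both branches call a function that
requires `avx2`, so every path through `bar` reaches such a call, and `bar`
should also have the `+avx2` target-feature.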
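The reasoning in both examples can be sketched as a small analysis over a toy model of a function body. This is an illustration, not an LLVM pass: the `Body` type, `feats`, and `guaranteed_features` are hypothetical names, and the model assumes that every statement runs to completion (no loops, no divergence or unwinding before a call), which a real implementation would have to establish.

```rust
use std::collections::BTreeSet;

// Set of target-feature names, e.g. {"avx2"}.
type Features = BTreeSet<&'static str>;

// Toy model of a function body (hypothetical, for illustration only).
enum Body {
    // Call to a function that requires these target features.
    Call(Features),
    // Statements executed one after another.
    Seq(Vec<Body>),
    // Two-way branch: exactly one of the two arms executes.
    If(Box<Body>, Box<Body>),
}

// Helper to build a feature set from a list of names.
fn feats(names: &[&'static str]) -> Features {
    names.iter().copied().collect()
}

// Features required on every path through `body`, i.e. the features that
// could be propagated to the enclosing function under the model's assumptions.
fn guaranteed_features(body: &Body) -> Features {
    match body {
        // An executed call requires all of the callee's features.
        Body::Call(f) => f.clone(),
        // Every statement of a sequence executes, so take the union.
        // An empty sequence contributes nothing.
        Body::Seq(items) => items
            .iter()
            .flat_map(|b| guaranteed_features(b))
            .collect(),
        // Only one arm executes, so keep what both arms require.
        Body::If(t, e) => {
            let t = guaranteed_features(t);
            let e = guaranteed_features(e);
            t.intersection(&e).cloned().collect()
        }
    }
}
```

In this model the first `bar` is `Call({avx2})` and the second is `If(Call({avx2}), Call({avx2}))`; both yield `{avx2}`. A branch whose other arm makes no call yields the empty set, which is the case where propagation would not be justified.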