[llvm-bugs] [Bug 41138] New: missed opt: target-feature propagation

via llvm-bugs llvm-bugs at lists.llvm.org
Tue Mar 19 05:51:43 PDT 2019


https://bugs.llvm.org/show_bug.cgi?id=41138

            Bug ID: 41138
           Summary: missed opt: target-feature propagation
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Scalar Optimizations
          Assignee: unassignedbugs at nondot.org
          Reporter: gonzalobg88 at gmail.com
                CC: llvm-bugs at lists.llvm.org

Minimal working example - this Rust code:

extern "C" {
   #[target_feature(enable = "avx2")] pub fn foo();
}
pub unsafe fn bar() { foo() }

generates the following LLVM-IR (https://rust.godbolt.org/z/fzcraS):

define void @bar() unnamed_addr #0 {
  tail call void @foo()
  ret void
}

declare void @foo() unnamed_addr #1

attributes #0 = { nounwind nonlazybind uwtable
"probe-stack"="__rust_probestack" "target-cpu"="x86-64" }
attributes #1 = { nounwind nonlazybind uwtable
"probe-stack"="__rust_probestack" "target-cpu"="x86-64"
"target-features"="+avx2" }

which `opt` does not optimize further (https://rust.godbolt.org/z/XgMyCJ). 

Note that `foo` carries the attribute "target-features"="+avx2", but this is
not propagated to `bar`. That can significantly affect code generation and
other optimizations (e.g. if `bar` contained loops, they could be vectorized
with AVX2 instructions).

Propagating `avx2` to `bar` in this case is sound: if `foo` is called on a
platform without `avx2` support, the behavior is undefined, so we can assume
that `foo` is only ever called on platforms where `avx2` is available. Since
`bar` calls `foo` unconditionally, the same assumption holds for `bar`.
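
As a rough illustration of the unconditional case, the propagation amounts to
a set union over the enabled features. The sketch below is a small model in
Rust, not LLVM code: the helper names `enabled` and `propagate` are made up for
this example, a missing "target-features" attribute is modeled as an empty
string, and disabled (`-feature`) entries are deliberately ignored.

```rust
use std::collections::BTreeSet;

/// Collects the features enabled ("+name") in an LLVM-style
/// "target-features" string such as "+avx2,+fma".
/// Simplification: entries that disable a feature ("-name") are skipped.
fn enabled(attr: &str) -> BTreeSet<&str> {
    attr.split(',')
        .filter_map(|f| f.trim().strip_prefix('+'))
        .collect()
}

/// Hypothetical helper: the "target-features" value a caller could be given
/// when it unconditionally calls a function whose attribute is `callee`.
fn propagate(caller: &str, callee: &str) -> String {
    let mut all = enabled(caller);
    all.extend(enabled(callee));
    // BTreeSet iterates in sorted order, so the result is deterministic.
    all.iter()
        .map(|f| format!("+{f}"))
        .collect::<Vec<_>>()
        .join(",")
}
```

For the example above, `propagate("", "+avx2")` models `bar` (no
target-features) calling `foo` and yields "+avx2".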

In general, if a function is called unconditionally, we can propagate its
target-features to the caller. If a function is only called conditionally, a
more complex analysis is required, e.g., for this code:

extern "C" {
   #[target_feature(enable = "avx2")] pub fn foo();
   #[target_feature(enable = "avx2")] pub fn baz();
}
pub unsafe fn bar(x: i32) { 
    if x == 0 { foo() } else { baz() }
}

which produces this LLVM-IR (https://rust.godbolt.org/z/XT4Hpo):

define void @bar(i32 %x) unnamed_addr #0 {
start:
  %0 = icmp eq i32 %x, 0
  br i1 %0, label %bb1, label %bb2

bb1:                                              ; preds = %start
  tail call void @foo()
  br label %bb3

bb2:                                              ; preds = %start
  tail call void @baz()
  br label %bb3

bb3:                                              ; preds = %bb2, %bb1
  ret void
}

declare void @foo() unnamed_addr #1
declare void @baz() unnamed_addr #1

attributes #0 = { nounwind nonlazybind uwtable
"probe-stack"="__rust_probestack" "target-cpu"="x86-64" }
attributes #1 = { nounwind nonlazybind uwtable
"probe-stack"="__rust_probestack" "target-cpu"="x86-64"
"target-features"="+avx2" }

Here the optimization is also sound: every path through `bar` calls a function
that requires `avx2`, so `bar` should also have the `+avx2` target-feature.
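
One way to picture the "more complex analysis" for conditional calls is as a
must-analysis over the control-flow graph: a feature can be propagated only if
it is required on every path from the entry block to a return. The following
is a minimal sketch of that idea in Rust, not an LLVM pass. The names `Block`,
`feats`, `guaranteed` and `bar_cfg` are hypothetical, and it assumes an acyclic
CFG in which every call in an executed block is actually reached (loops would
need a fixpoint iteration instead of plain recursion).

```rust
use std::collections::BTreeSet;

type Features = BTreeSet<&'static str>;

/// One basic block of a simplified, hypothetical CFG.
struct Block {
    /// Target features required by the functions this block calls.
    required: Features,
    /// Indices of successor blocks; empty means the block returns.
    succs: Vec<usize>,
}

/// Builds a feature set from a list of names.
fn feats(names: &[&'static str]) -> Features {
    names.iter().copied().collect()
}

/// Features required on every path from block `b` to a return.
/// Assumption: the CFG is acyclic, so the recursion terminates.
fn guaranteed(cfg: &[Block], b: usize) -> Features {
    let block = &cfg[b];
    let mut paths = block.succs.iter().map(|&s| guaranteed(cfg, s));
    // Intersect over all successors: a feature must hold on every path.
    let mut common = match paths.next() {
        Some(first) => paths.fold(first, |acc, f| acc.intersection(&f).copied().collect()),
        None => Features::new(),
    };
    // Calls made in this block itself are required on all paths through it.
    common.extend(block.required.iter().copied());
    common
}

/// Model of the second `bar` above: start branches to bb1 (calls `foo`)
/// or bb2 (calls `baz`), and both continue to bb3, which returns.
fn bar_cfg() -> Vec<Block> {
    vec![
        Block { required: feats(&[]), succs: vec![1, 2] },    // start
        Block { required: feats(&["avx2"]), succs: vec![3] }, // bb1
        Block { required: feats(&["avx2"]), succs: vec![3] }, // bb2
        Block { required: feats(&[]), succs: vec![] },        // bb3
    ]
}
```

With this model, `guaranteed(&bar_cfg(), 0)` contains `avx2`, matching the
claim above. If one branch called a function without `avx2`, the intersection
would be empty and nothing could be propagated.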
