<html>

    <head>

      <base href="https://bugs.llvm.org/">

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW - InstCombine changes sibling non-fast instruction to fast"

   href="https://bugs.llvm.org/show_bug.cgi?id=46326">46326</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>InstCombine changes sibling non-fast instruction to fast

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>new-bugs

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>9.0

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>Linux

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>enhancement

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>new bugs

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>justin@willmert.me

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>htmldeveloper@gmail.com, llvm-bugs@lists.llvm.org

          </td>

        </tr></table>

      <p>

        <div>

        <pre>Overview:

This is a specific case found in Julia code with more context found in [1],

wherein it was suggested I file this as a bug with LLVM.

The unoptimized IR below contains two fast-math instructions just before

returning: a multiplication followed by a square root. At an optimization

level of -O1 or higher, LLVM v8 optimizes away part of the function preamble

but otherwise preserves the mathematical calculation. Starting with LLVM v9,

though, the fdiv (%17 in the unoptimized form) is changed to an fdiv fast

instruction and moved to the end of the function, just before the fmul fast.

With help from the Julia Discourse forum, the cause was narrowed down from

the full -O1 pipeline to the InstCombine optimization pass specifically.

Godbolt link demonstrating this: [2]

As a result, the numerical result of the function depends on the version of

LLVM that Julia uses. I've noticed, though, that the llc-emitted assembly is

the same in both LLVM v8 and v9 as viewed on Godbolt, whereas Julia's emitted

assembly differs (in such a way that it's sensitive to the change in LLVM IR

optimization). I've put together a gist [3] that shows the assembly for Julia

v1.4.2 (LLVM v8), Julia v1.5-beta1 (LLVM v9), and LLVM llc run on the provided

IR (same in both v8 and v9).
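
To illustrate why relaxing the fdiv can change the returned value, here is a
minimal Python sketch. It is not taken from the Julia code: the function names
and operand values are chosen purely for illustration, and the reordering
shown is only one of the kind that fast-math flags permit; the exact rewrite
InstCombine performs may differ.

```python
# Illustrative only: compares source-order evaluation with one reordering
# that becomes legal once both the division and the multiplication are "fast".

def strict(x, y, z):
    # Source order: divide first, then multiply. Each step rounds once.
    return (x / y) * z

def reassociated(x, y, z):
    # Reordered: multiply first, then divide.
    return (x * z) / y

# Hypothetical operands, picked because the rounding difference is visible.
x, y, z = 1.0, 49.0, 49.0

print(repr(strict(x, y, z)))
print(repr(reassociated(x, y, z)))  # exactly 1.0, since 49.0 / 49.0 == 1.0
print(strict(x, y, z) == reassociated(x, y, z))
```

The two results differ in the last bits, which is the same kind of
discrepancy as the one observed between the two Julia versions.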

Steps to Reproduce:

; Original unoptimized IR, to be optimized with

;   opt -march=x86-64 -mcpu=sandybridge -instcombine

; or replace '-instcombine' with one of {'-O1', '-O2', '-O3'}, for both LLVM

; v8 and v9.

define double @"julia_coeff_\CE\B1_1367"(i64, i64) {

top:

  %2 = call %jl_value_t*** @julia.ptls_states()

  %3 = bitcast %jl_value_t*** %2 to %jl_value_t**

  %4 = getelementptr inbounds %jl_value_t*, %jl_value_t** %3, i64 4

  %5 = bitcast %jl_value_t** %4 to i64**

  %6 = load i64*, i64** %5

  %7 = sitofp i64 %0 to double

  %8 = sitofp i64 %1 to double

  %9 = fmul double 2.000000e+00, %7

  %10 = fadd double %9, 1.000000e+00

  %11 = fmul double 2.000000e+00, %7

  %12 = fsub double %11, 3.000000e+00

  %13 = fmul double %7, %7

  %14 = fmul double %8, %8

  %15 = fsub double %13, %14

  %16 = fmul double %12, %15

  %17 = fdiv double %10, %16

  %18 = fsub double %7, 1.000000e+00

  %19 = fmul double %18, %18

  %20 = fmul double 4.000000e+00, %19

  %21 = fsub double %20, 1.000000e+00

  %22 = fmul fast double %17, %21

  %23 = call fast double @llvm.sqrt.f64(double %22)

  ret double %23

}

Expected Results:

As I understand it (and correct me if I'm wrong), the fdiv should not be

changed to an fdiv fast instruction, since it carries no fast-math flags in

the input IR.
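
One way to check this expectation mechanically is to scan the textual IR
before and after running opt and list the fast-math flags on every fdiv. The
Python sketch below is a hypothetical helper (the name fdiv_flags and the
regular expression are my own, not part of any LLVM tooling), and it only
handles simple unquoted value names such as %17.

```python
import re

# Fast-math flag keywords that may follow an instruction opcode in LLVM IR.
FLAGS = "fast|nnan|ninf|nsz|arcp|contract|afn|reassoc"

# Captures the result name and any flag words directly after "fdiv".
FDIV = re.compile(r"(%\S+)\s*=\s*fdiv\s+((?:(?:" + FLAGS + r")\s+)*)")

def fdiv_flags(ir_text):
    """Map each fdiv result name to the list of fast-math flags it carries."""
    return {m.group(1): m.group(2).split() for m in FDIV.finditer(ir_text)}

# The fdiv from the unoptimized IR above: no flags.
print(fdiv_flags("  %17 = fdiv double %10, %16\n"))

# A hand-written line for illustration only, NOT actual opt output.
print(fdiv_flags("  %q = fdiv fast double %a, %b\n"))
```

Running the helper on the output of opt from LLVM v8 and v9 should make it
easy to see whether any fdiv has gained a flag it did not have in the input.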

Build Date & Hardware:

julia> versioninfo()

Julia Version 1.5.0-beta1.0

Commit 6443f6c95a (2020-05-28 17:42 UTC)

Platform Info:

  OS: Linux (x86_64-pc-linux-gnu)

  CPU: AMD Ryzen 3 2200G with Radeon Vega Graphics

  WORD_SIZE: 64

  LIBM: libopenlibm

  LLVM: libLLVM-9.0.1 (ORCJIT, znver1)

Additional Information:

[1]

<a href="https://discourse.julialang.org/t/investigating-numerical-change-in-function-return-value-between-v1-4-vs-v1-5/41332/5">https://discourse.julialang.org/t/investigating-numerical-change-in-function-return-value-between-v1-4-vs-v1-5/41332/5</a>

[2] <a href="https://godbolt.org/z/r-cNuL">https://godbolt.org/z/r-cNuL</a>

[3] <a href="https://gist.github.com/jmert/6aea12adb74ef8b7f25eba276d42911a">https://gist.github.com/jmert/6aea12adb74ef8b7f25eba276d42911a</a></pre>

        </div>

      </p>


    </body>

</html>