[llvm-bugs] [Bug 46326] New: InstCombine changes sibling non-fast instruction to fast

Mon Jun 15 07:14:09 PDT 2020

https://bugs.llvm.org/show_bug.cgi?id=46326

            Bug ID: 46326
           Summary: InstCombine changes sibling non-fast instruction to
                    fast
           Product: new-bugs
           Version: 9.0
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: new bugs
          Assignee: unassignedbugs at nondot.org
          Reporter: justin at willmert.me
                CC: htmldeveloper at gmail.com, llvm-bugs at lists.llvm.org

Overview:

This is a specific case found in Julia code; more context is in [1], where it
was suggested that I file this as a bug with LLVM.

The unoptimized IR below contains two fast-math instructions just before
returning: a multiplication followed by a square root. At an optimization
level of -O1 or higher, LLVM v8 optimizes away a bit of the function preamble,
but otherwise maintains the mathematical calculation. Starting with LLVM v9,
though, the fdiv (line %17 in unoptimized form) is changed to an fdiv fast
instruction and moved to the end of the function, just before the fmul fast.
The Julia Discourse forum helped narrow the cause from the full -O1 pipeline
down to the InstCombine optimization pass specifically.

Godbolt link demonstrating this: [2]

This changes the numerical result of the function depending on the version of
LLVM used by Julia. I've noticed, though, that the llc-emitted assembly is the
same in both LLVM v8 and v9 as viewed on Godbolt, while Julia's emitted
assembly differs (in a way that is sensitive to the change in LLVM IR
optimization). I've put together a gist [3] that shows the assembly for Julia
v1.4.2 (LLVM v8), Julia v1.5-beta1 (LLVM v9), and LLVM llc run on the provided
IR (same in both v8 and v9).
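
To illustrate why moving the division can change the result at all, here is a
minimal Python sketch of the same arithmetic evaluated in two orders. The
first function follows the order in the unoptimized IR. The second is a
hypothetical reordering (multiply first, divide last) of the kind that
fast-math flags on the fdiv would permit; it is an assumption for
illustration, not the exact sequence LLVM v9 emits. The function names and
the tested range are mine.

```python
import math


def coeff_strict(l, m):
    """Evaluate in the order of the unoptimized IR (division rounded first)."""
    x, y = float(l), float(m)
    num = 2.0 * x + 1.0                          # %10
    den = (2.0 * x - 3.0) * (x * x - y * y)      # %16
    quot = num / den                             # %17: strict fdiv
    scale = 4.0 * ((x - 1.0) * (x - 1.0)) - 1.0  # %21
    return math.sqrt(quot * scale)               # %22, %23


def coeff_reassoc(l, m):
    """Hypothetical reordering: multiply first, divide last."""
    x, y = float(l), float(m)
    num = 2.0 * x + 1.0
    den = (2.0 * x - 3.0) * (x * x - y * y)
    scale = 4.0 * ((x - 1.0) * (x - 1.0)) - 1.0
    return math.sqrt((num * scale) / den)


def count_mismatches(lmax):
    """Count (l, m) pairs with 2 <= l <= lmax, 0 <= m < l that differ."""
    n = 0
    for l in range(2, lmax + 1):
        for m in range(l):
            if coeff_strict(l, m) != coeff_reassoc(l, m):
                n += 1
    return n


if __name__ == "__main__":
    print("pairs with differing results:", count_mismatches(50))
```

Both orders are mathematically identical, so any mismatch counted here comes
only from where the intermediate rounding happens.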

Steps to Reproduce:

; Original unoptimized IR, to be optimized with
;   opt -march=x86-64 -mcpu=sandybridge -instcombine
; or replace '-instcombine' with {'-O1', '-O2', '-O3'}, for both LLVM v8 and
; v9.
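; Declarations below are assumed (not part of the original report) so that the
; snippet parses standalone; signatures are inferred from the calls in the body.
%jl_value_t = type opaque
declare %jl_value_t*** @julia.ptls_states()
declare double @llvm.sqrt.f64(double)

define double @"julia_coeff_\CE\B1_1367"(i64, i64) {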
top:
  %2 = call %jl_value_t*** @julia.ptls_states()
  %3 = bitcast %jl_value_t*** %2 to %jl_value_t**
  %4 = getelementptr inbounds %jl_value_t*, %jl_value_t** %3, i64 4
  %5 = bitcast %jl_value_t** %4 to i64**
  %6 = load i64*, i64** %5
  %7 = sitofp i64 %0 to double
  %8 = sitofp i64 %1 to double
  %9 = fmul double 2.000000e+00, %7
  %10 = fadd double %9, 1.000000e+00
  %11 = fmul double 2.000000e+00, %7
  %12 = fsub double %11, 3.000000e+00
  %13 = fmul double %7, %7
  %14 = fmul double %8, %8
  %15 = fsub double %13, %14
  %16 = fmul double %12, %15
  %17 = fdiv double %10, %16
  %18 = fsub double %7, 1.000000e+00
  %19 = fmul double %18, %18
  %20 = fmul double 4.000000e+00, %19
  %21 = fsub double %20, 1.000000e+00
  %22 = fmul fast double %17, %21
  %23 = call fast double @llvm.sqrt.f64(double %22)
  ret double %23
}

Expected Results:

As I understand it (and correct me if I'm wrong), the fdiv should not be
changed to an fdiv fast instruction.

Build Date & Hardware:

julia> versioninfo()
Julia Version 1.5.0-beta1.0
Commit 6443f6c95a (2020-05-28 17:42 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: AMD Ryzen 3 2200G with Radeon Vega Graphics
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, znver1)

Additional Information:

[1]
https://discourse.julialang.org/t/investigating-numerical-change-in-function-return-value-between-v1-4-vs-v1-5/41332/5
[2] https://godbolt.org/z/r-cNuL
[3] https://gist.github.com/jmert/6aea12adb74ef8b7f25eba276d42911a
