[llvm-bugs] [Bug 49738] New: [Matrix] LowerMatrixIntrinsics should preserve existing fast-math flags during lowering

Sat Mar 27 05:37:37 PDT 2021

https://bugs.llvm.org/show_bug.cgi?id=49738

            Bug ID: 49738
           Summary: [Matrix] LowerMatrixIntrinsics should preserve
                    existing fast-math flags during lowering
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Scalar Optimizations
          Assignee: unassignedbugs at nondot.org
          Reporter: florian_hahn at apple.com
                CC: llvm-bugs at lists.llvm.org

Currently LowerMatrixIntrinsics does not propagate the fast-math flags present
on matrix intrinsics and other instructions with shape information to the
instructions it generates during lowering.

For the example below, `opt -lower-matrix-intrinsics` creates fmuladd/fadd/fmul
instructions without `fast`: https://godbolt.org/z/1o48Tx1bP

This also means we fail to fold redundant FP instructions. In this case, we end
up with fmuladd calls that have zero operands and could be simplified if `fast`
were preserved: https://godbolt.org/z/5oWs1oh91
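
To illustrate why the zero-operand fold depends on the flags being present, here
is a minimal standalone C++ sketch. It is not code from the pass; the helper
names (mulAddStrict, mulAddZeroFolded, foldIsExact) are hypothetical, and plain
`a * b + c` is only assumed to stand in for an fmuladd call without fast-math
flags. It compares the strict IEEE 754 result of `a * 0 + c` against the folded
result `c`:

```cpp
#include <cmath>
#include <limits>

// Stand-in for fmuladd without fast-math flags: a * b + c under IEEE 754 rules.
float mulAddStrict(float a, float b, float c) { return a * b + c; }

// Result of folding a * 0 + c to c, which is only valid when NaNs, infinities
// and the sign of zero may be ignored (roughly what `fast` permits).
float mulAddZeroFolded(float /*a*/, float c) { return c; }

// Returns true if the fold gives exactly the strict result for these inputs,
// treating NaN vs. non-NaN and +0.0 vs. -0.0 as mismatches.
bool foldIsExact(float a, float c) {
  float strict = mulAddStrict(a, 0.0f, c);
  float folded = mulAddZeroFolded(a, c);
  if (std::isnan(strict) || std::isnan(folded))
    return std::isnan(strict) && std::isnan(folded);
  return strict == folded && std::signbit(strict) == std::signbit(folded);
}
```

For ordinary finite inputs the fold is exact, but it changes the result when
`a` is infinite or NaN, or when `c` is -0.0 and the product is +0.0. Without
flags on the lowered instructions, the optimizer has to assume those cases can
occur and must keep the fmuladd.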

define <4 x float> @foo(<4 x float> %m, float %x, float %y) {
  %i1 = insertelement <4 x float> <float poison, float 0.000000e+00, float 0.000000e+00, float poison>, float %x, i64 0
  %i2 = insertelement <4 x float> %i1, float %y, i64 3
  %res = tail call fast <4 x float> @llvm.matrix.multiply.v4f32.v4f32.v4f32(<4 x float> %m, <4 x float> %i1, i32 2, i32 2, i32 2)
  %res.2 = fadd fast <4 x float> %res, %m
  ret <4 x float> %res
}

declare <4 x float> @llvm.matrix.multiply.v4f32.v4f32.v4f32(<4 x float>, <4 x float>, i32, i32, i32)
