[all-commits] [llvm/llvm-project] 4d9771: [flang] Improved performance of runtime Matmul/Mat...

Tue Aug 29 17:04:30 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 4d9771741d40cc9cfcccb6b033f43689d36b705a
      https://github.com/llvm/llvm-project/commit/4d9771741d40cc9cfcccb6b033f43689d36b705a
  Author: Slava Zakharin <szakharin at nvidia.com>
  Date:   2023-08-29 (Tue, 29 Aug 2023)

  Changed paths:
    M flang/runtime/matmul-transpose.cpp
    M flang/runtime/matmul.cpp
    M flang/unittests/Runtime/Matmul.cpp
    M flang/unittests/Runtime/MatmulTranspose.cpp

  Log Message:
  -----------
  [flang] Improved performance of runtime Matmul/MatmulTranspose.

This patch mostly affects the performance of code produced by
HLFIR lowering. If a MATMUL argument is an array slice, HLFIR
lowering passes the slice to the runtime, whereas FIR lowering
would create a contiguous temporary for the slice.
For such non-contiguous arguments, the runtime can now perform
better than the generic implementation in cases where the
leading dimension is contiguous.
This patch improves CPU2000/178.galgel, making the HLFIR version
faster than the FIR version (by avoiding the temporary copies
of MATMUL arguments).

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D159134