[all-commits] [llvm/llvm-project] dfd1bb: [Matrix] Factor and distribute transposes across m...

Tue May 25 11:16:19 PDT 2021

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: dfd1bbd00ac09b84c76cc5980cee1deb68475a04
      https://github.com/llvm/llvm-project/commit/dfd1bbd00ac09b84c76cc5980cee1deb68475a04
  Author: Adam Nemet <anemet at apple.com>
  Date:   2021-05-25 (Tue, 25 May 2021)

  Changed paths:
    M llvm/lib/Transforms/Scalar/LowerMatrixIntrinsics.cpp
    M llvm/test/Transforms/LowerMatrixIntrinsics/remarks-inlining.ll
    M llvm/test/Transforms/LowerMatrixIntrinsics/remarks-shared-subtrees.ll
    A llvm/test/Transforms/LowerMatrixIntrinsics/transpose-opts.ll

  Log Message:
  -----------
  [Matrix] Factor and distribute transposes across multiplies

Now that we can fold some transposes into multiplies (column-major: A * B^t;
row-major: A^t * B), we want to move them around to create the optimal
expressions:

* fold away double transposes while still using them to assert the shape
* sink transposes toward the operands in the hope that they cancel out
* lift transposes when both operands are transposed (A^t * B^t becomes
  (B * A)^t)
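
The three rewrites above rest on standard transpose identities. As a minimal
sketch, the following standalone C++ checks them numerically on small integer
matrices. The Mat type and every helper name here are hypothetical
illustrations and are not taken from LowerMatrixIntrinsics.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal dense matrix with row-major storage. Purely illustrative and
// unrelated to LLVM's own data structures.
struct Mat {
  std::size_t rows, cols;
  std::vector<double> data;
  Mat(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
  double &at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
  double at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

Mat transpose(const Mat &a) {
  Mat t(a.cols, a.rows);
  for (std::size_t i = 0; i < a.rows; ++i)
    for (std::size_t j = 0; j < a.cols; ++j)
      t.at(j, i) = a.at(i, j);
  return t;
}

Mat multiply(const Mat &a, const Mat &b) {
  assert(a.cols == b.rows && "inner dimensions must match");
  Mat c(a.rows, b.cols);
  for (std::size_t i = 0; i < a.rows; ++i)
    for (std::size_t j = 0; j < b.cols; ++j)
      for (std::size_t k = 0; k < a.cols; ++k)
        c.at(i, j) += a.at(i, k) * b.at(k, j);
  return c;
}

bool equal(const Mat &a, const Mat &b) {
  return a.rows == b.rows && a.cols == b.cols && a.data == b.data;
}

// Fill with small consecutive integers so that all comparisons are exact.
Mat iota(std::size_t r, std::size_t c, double start) {
  Mat m(r, c);
  for (std::size_t k = 0; k < m.data.size(); ++k)
    m.data[k] = start + static_cast<double>(k);
  return m;
}

// Folding a double transpose: (A^t)^t == A.
bool doubleTransposeCancels(const Mat &a) {
  return equal(transpose(transpose(a)), a);
}

// Sinking a transpose through a multiply: (A * B)^t == B^t * A^t.
bool sinkHolds(const Mat &a, const Mat &b) {
  return equal(transpose(multiply(a, b)),
               multiply(transpose(b), transpose(a)));
}

// Lifting when both operands are transposed: A^t * B^t == (B * A)^t.
bool liftHolds(const Mat &a, const Mat &b) {
  return equal(multiply(transpose(a), transpose(b)),
               transpose(multiply(b, a)));
}
```

Note that sinking turns one transpose into two, so it only pays off when the
new transposes cancel against existing ones or can be folded into a multiply.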

This also modifies the matrix remarks to include the number of exposed
transposes (i.e. transposes that we couldn't fold into a multiply).
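
To make "exposed" concrete, here is a toy model of the count. It assumes
column-major layout, where only the right operand of a multiply (A * B^t) can
absorb a transpose. The Expr type and countExposed are hypothetical names for
illustration only; the actual pass works on LLVM IR and may count differently.

```cpp
#include <memory>
#include <utility>

enum class Kind { Leaf, Transpose, Multiply };

// Tiny expression tree: lhs is the operand of a Transpose or the left operand
// of a Multiply; rhs is used by Multiply only.
struct Expr {
  Kind kind;
  std::shared_ptr<Expr> lhs;
  std::shared_ptr<Expr> rhs;
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr leaf() {
  return std::make_shared<Expr>(Expr{Kind::Leaf, nullptr, nullptr});
}
ExprPtr transpose(ExprPtr e) {
  return std::make_shared<Expr>(Expr{Kind::Transpose, std::move(e), nullptr});
}
ExprPtr multiply(ExprPtr a, ExprPtr b) {
  return std::make_shared<Expr>(
      Expr{Kind::Multiply, std::move(a), std::move(b)});
}

// Count transposes that cannot be folded into a multiply. Assumption: a
// transpose folds only when it is the direct right operand of a multiply.
unsigned countExposed(const ExprPtr &e, bool foldable = false) {
  if (!e)
    return 0;
  switch (e->kind) {
  case Kind::Leaf:
    return 0;
  case Kind::Transpose:
    return (foldable ? 0u : 1u) + countExposed(e->lhs);
  case Kind::Multiply:
    return countExposed(e->lhs) + countExposed(e->rhs, /*foldable=*/true);
  }
  return 0;
}
```

Under this model A * B^t reports zero exposed transposes, while A^t * B and
(A * B)^t each report one.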

The adjustment to the remarks-inlining test is a bit subtle: I am changing the
double transpose to a single transpose so that we don't remove it completely.
More importantly, this changes some of the total instruction counts, most
notably for stores, because we can no longer use a vector store.

Differential Revision: https://reviews.llvm.org/D102733