[Mlir-commits] [mlir] 64e171c - Avoid unnecessary output buffer allocation and initialization.

Thu Dec 9 08:29:09 PST 2021

Author: Bixia Zheng
Date: 2021-12-09T08:29:02-08:00
New Revision: 64e171c2d0c34536502b7707184600ff68eff3d4

URL: https://github.com/llvm/llvm-project/commit/64e171c2d0c34536502b7707184600ff68eff3d4
DIFF: https://github.com/llvm/llvm-project/commit/64e171c2d0c34536502b7707184600ff68eff3d4.diff

LOG: Avoid unnecessary output buffer allocation and initialization.

The sparse tensor code generator allocates memory for the output tensor. As
such, we only need to allocate a MemRefDescriptor to receive the output tensor
and do not need to allocate and initialize the storage for the tensor.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D115292
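
To illustrate the idea in the log message, here is a minimal sketch of what an "empty" rank-2 descriptor looks like when built with plain ctypes. The helper `make_descriptor_type` and the class name `RankedMemRef` are hypothetical stand-ins for `rt.make_nd_memref_descriptor`. The field layout (allocated pointer, aligned pointer, offset, sizes, strides) is an assumption based on the usual ranked memref descriptor convention, not code taken from this commit.

```python
import ctypes


def make_descriptor_type(rank, elem_type):
    """Hypothetical stand-in for rt.make_nd_memref_descriptor.

    Assumes the conventional ranked memref layout:
    allocated pointer, aligned pointer, offset, sizes, strides.
    """

    class RankedMemRef(ctypes.Structure):
        _fields_ = [
            ("allocated", ctypes.c_longlong),
            ("aligned", ctypes.POINTER(elem_type)),
            ("offset", ctypes.c_longlong),
            ("shape", ctypes.c_longlong * rank),
            ("strides", ctypes.c_longlong * rank),
        ]

    return RankedMemRef


# Instantiating the structure zero-initializes every field, so no element
# storage is allocated or initialized on the Python side.
ref_out = make_descriptor_type(2, ctypes.c_double)()

# The kernel is handed a pointer to a pointer to the descriptor and is
# expected to fill in the data pointers, sizes, and strides itself.
mem_out = ctypes.pointer(ctypes.pointer(ref_out))

# Before the call, the descriptor is empty: null data pointer, zero shape.
descriptor_is_empty = (not ref_out.aligned) and list(ref_out.shape) == [0, 0]
```

The contrast with the removed code is that `np.zeros((3, 2), np.float64)` allocated and zero-filled six doubles that the generated code never used, whereas the descriptor above only reserves room for the metadata.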

Added: 
    

Modified: 
    mlir/test/Integration/Dialect/SparseTensor/python/test_SpMM.py

Removed: 
    


################################################################################
diff --git a/mlir/test/Integration/Dialect/SparseTensor/python/test_SpMM.py b/mlir/test/Integration/Dialect/SparseTensor/python/test_SpMM.py
index b5887263cc1cb..b6a1eaa6bb93b 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/python/test_SpMM.py
+++ b/mlir/test/Integration/Dialect/SparseTensor/python/test_SpMM.py
@@ -85,12 +85,14 @@ def build_compile_and_run_SpMM(attr: st.EncodingAttr, support_lib: str,
       np.float64)
   b = np.array([[1.0, 2.0], [4.0, 3.0], [5.0, 6.0], [8.0, 7.0]], np.float64)
   c = np.zeros((3, 2), np.float64)
-  out = np.zeros((3, 2), np.float64)
 
   mem_a = ctypes.pointer(ctypes.pointer(rt.get_ranked_memref_descriptor(a)))
   mem_b = ctypes.pointer(ctypes.pointer(rt.get_ranked_memref_descriptor(b)))
   mem_c = ctypes.pointer(ctypes.pointer(rt.get_ranked_memref_descriptor(c)))
-  mem_out = ctypes.pointer(ctypes.pointer(rt.get_ranked_memref_descriptor(out)))
+  # Allocate a MemRefDescriptor to receive the output tensor.
+  # The buffer itself is allocated inside the MLIR code generation.
+  ref_out = rt.make_nd_memref_descriptor(2, ctypes.c_double)()
+  mem_out = ctypes.pointer(ctypes.pointer(ref_out))
 
   # Invoke the kernel and get numpy output.
   # Built-in bufferization uses in-out buffers.