[flang-commits] [flang] 36c2a9d - [flang][runtime] New APIs for copyin/copyout of non-contiguous objects.

Slava Zakharin via flang-commits flang-commits at lists.llvm.org
Wed Oct 26 11:06:57 PDT 2022


Author: Slava Zakharin
Date: 2022-10-26T11:06:26-07:00
New Revision: 36c2a9d54ddfaec123859714cad6073edf468b49

URL: https://github.com/llvm/llvm-project/commit/36c2a9d54ddfaec123859714cad6073edf468b49
DIFF: https://github.com/llvm/llvm-project/commit/36c2a9d54ddfaec123859714cad6073edf468b49.diff

LOG: [flang][runtime] New APIs for copyin/copyout of non-contiguous objects.

The intention is to use these APIs for copyin/copyout of subprogram
arguments at the call sites. Currently, Flang generates loop nests
to do this, and in some corner cases this results in very long
compilation times due to LLVM loop optimizations.

For example, Flang produces 25245 loops for 521.wrf/module_dm.f90.
If we extract the copyin/copyout loops into the runtime, Flang will
produce only 207 loops, and the compilation time may drop by 47x.

Given that the copyin/copyout loop nests cannot be fused with other
loop nests, extracting them into runtime functions should not reduce
performance, provided the runtime optimizes copies along the leading
contiguous dimension.
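
As an illustration of the leading-dimension optimization mentioned above,
here is a minimal sketch in C++. It is not the Flang runtime implementation
(which the log says will come in separate patches). SketchDim, SketchView2D
and SketchPack are hypothetical names, and the rank-2 view is a simplified
stand-in for a real Descriptor. When the leading dimension is contiguous,
each column is copied with a single memcpy; otherwise the copy falls back
to element-by-element.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a rank-2 descriptor; not flang's Descriptor.
struct SketchDim {
  std::int64_t extent;     // number of elements along this dimension
  std::int64_t byteStride; // bytes between consecutive elements
};

struct SketchView2D {
  const char *base;      // address of element (0, 0)
  std::size_t elemBytes; // size of one element in bytes
  SketchDim dim[2];      // dim[0] is the leading (fastest-varying) dimension
};

// Copies all elements of src into dest in column-major order.
// Returns false without copying if destByteSize is too small.
bool SketchPack(void *dest, const SketchView2D &src, std::size_t destByteSize) {
  const std::size_t columnBytes =
      static_cast<std::size_t>(src.dim[0].extent) * src.elemBytes;
  const std::size_t totalBytes =
      columnBytes * static_cast<std::size_t>(src.dim[1].extent);
  if (totalBytes > destByteSize) {
    return false;
  }
  char *out = static_cast<char *>(dest);
  const bool leadingContiguous =
      src.dim[0].byteStride == static_cast<std::int64_t>(src.elemBytes);
  for (std::int64_t j = 0; j < src.dim[1].extent; ++j) {
    const char *column = src.base + j * src.dim[1].byteStride;
    if (leadingContiguous) {
      // Fast path: the whole column is one contiguous run.
      std::memcpy(out, column, columnBytes);
      out += columnBytes;
    } else {
      // Slow path: gather the column one element at a time.
      for (std::int64_t i = 0; i < src.dim[0].extent; ++i) {
        std::memcpy(out, column + i * src.dim[0].byteStride, src.elemBytes);
        out += src.elemBytes;
      }
    }
  }
  return true;
}
```

The sketch assumes non-negative extents and non-overlapping source and
destination, matching the assumptions stated for the new entry points.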

The implementation will come in separate patches.

Differential Revision: https://reviews.llvm.org/D136378

Added: 
    

Modified: 
    flang/include/flang/Runtime/support.h

Removed: 
    


################################################################################
diff  --git a/flang/include/flang/Runtime/support.h b/flang/include/flang/Runtime/support.h
index 532fc538a2b2e..965141a8ad923 100644
--- a/flang/include/flang/Runtime/support.h
+++ b/flang/include/flang/Runtime/support.h
@@ -11,6 +11,8 @@
 #define FORTRAN_RUNTIME_SUPPORT_H_
 
 #include "flang/Runtime/entry-names.h"
+#include <cstddef>
+#include <cstdint>
 
 namespace Fortran::runtime {
 
@@ -21,6 +23,56 @@ extern "C" {
 // Predicate: is the storage described by a Descriptor contiguous in memory?
 bool RTNAME(IsContiguous)(const Descriptor &);
 
+/// Copy elements from \p source to a contiguous memory area denoted
+/// by \p destination. The caller must guarantee that the destination
+/// buffer is big enough to hold all elements from \p source, and also
+/// that its alignment satisfies the minimal alignment required
+/// for the elements of \p source. \p destByteSize is the size in bytes
+/// of the destination buffer, and is only used for checking for overflows
+/// of the buffer.
+/// The runtime implementation is optimized to make reads from \p source
+/// efficiently by identifying contiguity in the leading dimensions (if any).
+///
+/// The implementation assumes that \p source and \p destination elements'
+/// locations never overlap.
+void RTNAME(PackContiguous)(
+    void *destination, const Descriptor &source, std::size_t destByteSize);
+
+/// Copy elements from the contiguous memory area denoted by \p source into
+/// \p destination. The caller must guarantee that the source buffer
+/// contains enough elements to be copied into \p destination, and also
+/// that its alignment satisfies the minimal alignment required
+/// for the elements of \p destination. \p sourceByteSize is the size in bytes
+/// of the source buffer, and is only used for checking for overruns
+/// of the buffer.
+/// The runtime implementation is optimized to make writes into \p destination
+/// efficiently by identifying contiguity in the leading dimensions (if any).
+///
+/// The implementation assumes that \p source and \p destination elements'
+/// locations never overlap.
+void RTNAME(UnpackContiguous)(const Descriptor &destination, const void *source,
+    std::size_t sourceByteSize);
+
+/// If \p source specifies contiguous storage in memory, then
+/// the returned address matches source.base_addr, otherwise,
+/// if \p destination is not null, then the function copies
+/// all elements from \p source into the destination buffer
+/// and returns \p destination, otherwise, the function returns
+/// a pointer to newly allocated contiguous buffer containing
+/// all elements from \p source (e.g. copied with RTNAME(PackContiguous)).
+///
+/// If \p destination is not null, then the caller must guarantee
+/// that the destination buffer is big enough to hold all elements
+/// from \p source, and also that its alignment satisfies the minimal
+/// alignment required for the elements of \p source.
+/// \p destByteSize is the size in bytes of the destination buffer,
+/// and is only used for checking for overflows of the buffer.
+///
+/// \p shouldFree is set to 1, if the function allocates new memory,
+/// otherwise, it is set to 0.
+void *RTNAME(MakeContiguous)(void *destination, const Descriptor &source,
+    std::size_t destByteSize, std::uint8_t *shouldFree);
+
 } // extern "C"
 } // namespace Fortran::runtime
 #endif // FORTRAN_RUNTIME_SUPPORT_H_
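
For illustration only, the call-site pattern that the MakeContiguous
contract above suggests (copy in, call, copy out, free if needed) might look
like the following C++ sketch. It does not use the Flang runtime.
SketchView1D, SketchMakeContiguous, SketchUnpack and
SketchCallWithCopyInOut are hypothetical names, the rank-1 view is a
simplified stand-in for a Descriptor, and returning nullptr on failure is an
assumption of the sketch, not documented behavior of the real entry points.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical rank-1 stand-in for a descriptor; not flang's Descriptor.
struct SketchView1D {
  char *base;              // address of the first element
  std::size_t elemBytes;   // size of one element in bytes
  std::int64_t extent;     // number of elements
  std::int64_t byteStride; // bytes between consecutive elements
};

// Follows the documented MakeContiguous contract on the stand-in type:
// return the base address if already contiguous, else copy into the
// caller's buffer, else allocate one and set *shouldFree to 1.
void *SketchMakeContiguous(void *destination, const SketchView1D &source,
    std::size_t destByteSize, std::uint8_t *shouldFree) {
  *shouldFree = 0;
  if (source.byteStride == static_cast<std::int64_t>(source.elemBytes)) {
    return source.base; // already contiguous: no copy
  }
  const std::size_t bytes =
      static_cast<std::size_t>(source.extent) * source.elemBytes;
  char *out = static_cast<char *>(destination);
  if (!out) {
    out = static_cast<char *>(std::malloc(bytes));
    if (!out) {
      return nullptr; // sketch-only failure signal
    }
    *shouldFree = 1;
  } else if (bytes > destByteSize) {
    return nullptr; // sketch-only overflow signal
  }
  for (std::int64_t i = 0; i < source.extent; ++i) {
    std::memcpy(out + i * source.elemBytes,
        source.base + i * source.byteStride, source.elemBytes);
  }
  return out;
}

// Scatters a contiguous buffer back into the strided destination.
void SketchUnpack(const SketchView1D &destination, const void *source) {
  const char *in = static_cast<const char *>(source);
  for (std::int64_t i = 0; i < destination.extent; ++i) {
    std::memcpy(destination.base + i * destination.byteStride,
        in + i * destination.elemBytes, destination.elemBytes);
  }
}

// Copy-in, call, copy-out, then free the temporary if one was allocated.
void SketchCallWithCopyInOut(
    const SketchView1D &arg, void (*callee)(void *, std::int64_t)) {
  std::uint8_t shouldFree = 0;
  void *buffer = SketchMakeContiguous(nullptr, arg, 0, &shouldFree);
  if (!buffer) {
    return;
  }
  callee(buffer, arg.extent);
  if (buffer != arg.base) {
    SketchUnpack(arg, buffer); // copy-out only when a temporary was used
  }
  if (shouldFree) {
    std::free(buffer);
  }
}
```

When the argument is already contiguous the callee works directly on the
original storage, so neither a copy nor a free takes place.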


        

