[llvm] [flang-rt] Optimise ShallowCopy and elemental copies in Assign (PR #140569)

Slava Zakharin via llvm-commits llvm-commits at lists.llvm.org
Mon May 19 14:04:54 PDT 2025


================
@@ -114,40 +114,78 @@ RT_API_ATTRS void CheckIntegerKind(
   }
 }
 
+template <bool RANK1>
 RT_API_ATTRS void ShallowCopyDiscontiguousToDiscontiguous(
     const Descriptor &to, const Descriptor &from) {
-  SubscriptValue toAt[maxRank], fromAt[maxRank];
-  to.GetLowerBounds(toAt);
-  from.GetLowerBounds(fromAt);
+  DescriptorIterator<RANK1> toIt{to};
+  DescriptorIterator<RANK1> fromIt{from};
   std::size_t elementBytes{to.ElementBytes()};
   for (std::size_t n{to.Elements()}; n-- > 0;
-       to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) {
-    std::memcpy(
-        to.Element<char>(toAt), from.Element<char>(fromAt), elementBytes);
+      toIt.Advance(), fromIt.Advance()) {
+    // Checking the size at runtime and making sure the pointer passed to memcpy
+    // has a type that matches the element size makes it possible for the
+    // compiler to optimise out the memcpy calls altogether and can
+    // substantially improve performance for some applications.
+    if (elementBytes == 16) {
+      std::memcpy(toIt.template Get<__int128_t>(),
+          fromIt.template Get<__int128_t>(), elementBytes);
+    } else if (elementBytes == 8) {
----------------
vzakhari wrote:

I am not sure that we can rely on the element size alone and assume that the element addresses satisfy the alignment requirements of the data type that you expose to `memcpy`.

I think the most reliable way is to have special versions of `ShallowCopy` for all trivial Fortran data types, and to call the appropriate version from the compiler.
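
A rough sketch of what I have in mind, not a proposal for the actual interface: `ShallowCopyTyped`, `Bytes8` and `DemoGatherEveryOther` are hypothetical names that do not exist in flang-rt, and the byte strides stand in for what a descriptor would provide. The caller picks the element type, so nothing about alignment is inferred from the byte count at runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not a flang-rt API: copy n elements of a known type T
// between two possibly strided buffers. Strides are given in bytes.
template <typename T>
void ShallowCopyTyped(void *to, std::ptrdiff_t toStride, const void *from,
    std::ptrdiff_t fromStride, std::size_t n) {
  assert(to != nullptr && from != nullptr);
  auto *dst = static_cast<char *>(to);
  const auto *src = static_cast<const char *>(from);
  for (std::size_t i = 0; i < n; ++i) {
    const auto k = static_cast<std::ptrdiff_t>(i);
    // sizeof(T) is a compile-time constant, so the compiler can usually
    // inline this memcpy. The element type comes from the caller rather
    // than being guessed from the element size.
    std::memcpy(dst + k * toStride, src + k * fromStride, sizeof(T));
  }
}

// Equal size does not imply equal alignment: both types below occupy
// 8 bytes, but Bytes8 may legally live at addresses that are not valid
// for std::int64_t.
struct Bytes8 {
  char bytes[8];
};
static_assert(sizeof(Bytes8) == sizeof(std::int64_t), "same size");
static_assert(alignof(Bytes8) <= alignof(std::int64_t), "weaker alignment");

// Usage: gather every other element of a double array into a contiguous one.
inline bool DemoGatherEveryOther() {
  double src[6] = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
  double dst[3] = {0.0, 0.0, 0.0};
  ShallowCopyTyped<double>(dst, static_cast<std::ptrdiff_t>(sizeof(double)),
      src, static_cast<std::ptrdiff_t>(2 * sizeof(double)), 3);
  return dst[0] == 1.0 && dst[1] == 3.0 && dst[2] == 5.0;
}
```

With entry points like this, the compiler would select the instantiation that matches the Fortran type it is lowering, and derived types or character data with arbitrary sizes would keep using the generic byte-wise path.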

https://github.com/llvm/llvm-project/pull/140569

