[llvm] [flang-rt] Optimise ShallowCopy and elemental copies in Assign (PR #140569)

Slava Zakharin via llvm-commits llvm-commits at lists.llvm.org
Mon May 19 14:04:55 PDT 2025


================
@@ -114,40 +114,78 @@ RT_API_ATTRS void CheckIntegerKind(
   }
 }
 
+template <bool RANK1>
 RT_API_ATTRS void ShallowCopyDiscontiguousToDiscontiguous(
     const Descriptor &to, const Descriptor &from) {
-  SubscriptValue toAt[maxRank], fromAt[maxRank];
-  to.GetLowerBounds(toAt);
-  from.GetLowerBounds(fromAt);
+  DescriptorIterator<RANK1> toIt{to};
+  DescriptorIterator<RANK1> fromIt{from};
   std::size_t elementBytes{to.ElementBytes()};
   for (std::size_t n{to.Elements()}; n-- > 0;
-       to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) {
-    std::memcpy(
-        to.Element<char>(toAt), from.Element<char>(fromAt), elementBytes);
+      toIt.Advance(), fromIt.Advance()) {
+    // Checking the size at runtime and making sure the pointer passed to memcpy
+    // has a type that matches the element size makes it possible for the
+    // compiler to optimise out the memcpy calls altogether and can
+    // substantially improve performance for some applications.
+    if (elementBytes == 16) {
+      std::memcpy(toIt.template Get<__int128_t>(),
----------------
vzakhari wrote:

Please guard the `__int128_t` usage with `#if defined USING_NATIVE_INT128_T`.
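
A minimal sketch of what such a guard could look like, not the code from the PR. The helper name `CopyElement` is hypothetical, and the `__SIZEOF_INT128__` check is only a stand-in so the snippet compiles on its own; in flang, `USING_NATIVE_INT128_T` is expected to come from the project's own headers.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumption: stand-in definition so this sketch is self-contained. The real
// macro is provided by the flang headers and should not be redefined there.
#if defined(__SIZEOF_INT128__) && !defined(USING_NATIVE_INT128_T)
#define USING_NATIVE_INT128_T 1
#endif

// Hypothetical helper: copies one element of `bytes` bytes. Each fixed-size
// case passes a compile-time constant to memcpy, which lets the compiler
// replace the call with plain loads and stores.
inline void CopyElement(void *to, const void *from, std::size_t bytes) {
  switch (bytes) {
#if defined USING_NATIVE_INT128_T
  case 16:
    // Only compiled when a native 128-bit integer type is available.
    std::memcpy(to, from, sizeof(__int128_t));
    break;
#endif
  case 8:
    std::memcpy(to, from, sizeof(std::uint64_t));
    break;
  case 4:
    std::memcpy(to, from, sizeof(std::uint32_t));
    break;
  default:
    // Generic fallback. It also handles 16-byte elements when the guard
    // above is not active.
    std::memcpy(to, from, bytes);
    break;
  }
}
```

With the guard in place, a build without native 128-bit integers still compiles and copies 16-byte elements through the generic fallback.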

https://github.com/llvm/llvm-project/pull/140569

