[llvm] [flang-rt] Optimise ShallowCopy and elemental copies in Assign (PR #140569)
Slava Zakharin via llvm-commits
llvm-commits at lists.llvm.org
Mon May 19 14:04:54 PDT 2025
================
@@ -114,40 +114,78 @@ RT_API_ATTRS void CheckIntegerKind(
}
}
+template <bool RANK1>
RT_API_ATTRS void ShallowCopyDiscontiguousToDiscontiguous(
const Descriptor &to, const Descriptor &from) {
- SubscriptValue toAt[maxRank], fromAt[maxRank];
- to.GetLowerBounds(toAt);
- from.GetLowerBounds(fromAt);
+ DescriptorIterator<RANK1> toIt{to};
+ DescriptorIterator<RANK1> fromIt{from};
std::size_t elementBytes{to.ElementBytes()};
for (std::size_t n{to.Elements()}; n-- > 0;
- to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) {
- std::memcpy(
- to.Element<char>(toAt), from.Element<char>(fromAt), elementBytes);
+ toIt.Advance(), fromIt.Advance()) {
+ // Checking the size at runtime and making sure the pointer passed to memcpy
+ // has a type that matches the element size makes it possible for the
+ // compiler to optimise out the memcpy calls altogether and can
+ // substantially improve performance for some applications.
+ if (elementBytes == 16) {
+ std::memcpy(toIt.template Get<__int128_t>(),
+ fromIt.template Get<__int128_t>(), elementBytes);
+ } else if (elementBytes == 8) {
----------------
vzakhari wrote:
I am not sure that we can infer alignment from the element size: an element that is 16 bytes long is not guaranteed to satisfy the alignment requirements of the data type (here `__int128_t`) that you expose to `memcpy`.
I think the most reliable way is to have special versions of `ShallowCopy` for all trivial Fortran data types, and have the compiler call the version that matches the element type.
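
To make the suggestion concrete, here is a minimal sketch of what a per-type entry point could look like. It is only an illustration under stated assumptions: `StridedView`, `ShallowCopyTyped` and `IsAlignedFor` are hypothetical names that do not exist in flang-rt, and the rank-1 view stands in for the real `Descriptor`. The element type is a template parameter chosen by the caller, so the copy size is a compile-time constant, and the pointers stay `char *` so nothing is assumed about alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical rank-1 strided view; a stand-in for the runtime Descriptor.
struct StridedView {
  char *base; // address of the first element
  std::size_t elements; // number of elements
  std::ptrdiff_t byteStride; // distance in bytes between consecutive elements
};

// Hypothetical typed copy: T would be selected by the compiler from the
// Fortran element type, so sizeof(T) is known at compile time.
template <typename T>
void ShallowCopyTyped(const StridedView &to, const StridedView &from) {
  assert(to.elements == from.elements);
  char *dst{to.base};
  const char *src{from.base};
  for (std::size_t n{to.elements}; n-- > 0;
       dst += to.byteStride, src += from.byteStride) {
    // Constant size; char pointers carry no alignment promise.
    std::memcpy(dst, src, sizeof(T));
  }
}

// Hypothetical runtime check, should a generic path still want to pick a
// wider type based on the element size alone.
template <typename T> bool IsAlignedFor(const void *p) {
  return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

With this shape, a size-based fast path in the generic routine would also need to check `IsAlignedFor<T>` on both base addresses and on the strides before using a typed pointer, whereas the typed entry points get that guarantee from the caller.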
https://github.com/llvm/llvm-project/pull/140569