<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/129779>129779</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[flang] surprising performance loss with nested type operator overloading
</td>
</tr>
<tr>
<th>Labels</th>
<td>
flang
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
ivan-pi
</td>
</tr>
</table>
<pre>
I've written a performance benchmark that sums an array of numbers in several different ways, in order to measure the overhead of operator overloading for simple value types:
[abstraction_penalty.F90.txt](https://github.com/user-attachments/files/19077916/abstraction_penalty.F90.txt)
When I run the program, I see the output:
```
$ flang-new -O2 abstraction_penalty.F90
$ ./a.out
[info] compiler: Homebrew flang version 19.1.4 (https://github.com/Homebrew/homebrew-core/issues)
[info] compiler options: flang-new -O2 abstraction_penalty.F90
[info] using naive sum
[info] number of iterations: 25000
  test    absolute   additions  ratio with
number  time (sec)  per second       test0
     0      0.0532   9.400E+02       1.000
     1      0.0498   1.003E+03       0.937
     2      0.0493   1.015E+03       0.926
     3      0.0526   9.511E+02       0.988
     4      0.0595   8.410E+02       1.118
     5      0.0515   9.700E+02       0.969
     6      0.0486   1.029E+03       0.913
     7      0.0485   1.031E+03       0.912
     8      0.0490   1.020E+03       0.922
     9      0.0472   1.059E+03       0.888
    10      0.0483   1.036E+03       0.907
    11      0.0485   1.031E+03       0.912
    12      0.0479   1.044E+03       0.901
    13      0.0481   1.039E+03       0.905
    14      6.7735   7.382E+00     127.336
    15      6.7167   7.444E+00     126.267
    16      0.0467   1.071E+03       0.878
    17      0.0452   1.105E+03       0.850
    18      0.0451   1.108E+03       0.849
    19      0.0452   1.105E+03       0.850
    20      0.0476   1.050E+03       0.895
    21      0.0469   1.066E+03       0.882
    22      0.0467   1.071E+03       0.877
    23      0.0461   1.086E+03       0.866
    24      0.0454   1.101E+03       0.853
    25      0.0452   1.105E+03       0.851
    26      0.0456   1.097E+03       0.857
    27      0.0454   1.102E+03       0.853
    28      6.6540   7.514E+00     125.089
    29      6.5274   7.660E+00     122.709
------------------------------------------------
  mean      0.0928   5.386E+02        1.75
```
The slow cases (14, 15, 28, 29) call the procedure `test_ddd`, which calls `dsum` for `type(ddd)`. This type is really just a double value, but defined in an obscure way:
```fortran
integer, parameter :: dp = c_double
! Double wrapper
type :: dd
real(dp) :: val
end type
! Double wrapper child with TBP
type, extends(dd) :: ddi
contains
procedure :: get => get_ddi_val
end type
! Double wrapper wrapper
type :: ddd
type(dd) :: val
end type
```
The sum procedure looks as follows:
```fortran
pure function ddd_sum(a) result(res)
type(ddd), intent(in) :: a(:)
type(ddd) :: res
real(dp), pointer :: t(:)
#if USE_INTRINSIC_SUM
res%val%val = sum(a%val%val)
#else
integer :: i
res = ddd(dd(0.0_dp))
do i = 1, size(a)
res = res + a(i)
end do
#endif
end function
```
where the `+` is the overloaded `operator(+)` defined as,
```fortran
pure function ddd_add(a,b) result(c)
type(ddd), intent(in) :: a, b
type(ddd) :: c
c%val%val = a%val%val + b%val%val
end function
```
If the intrinsic sum (`-DUSE_INTRINSIC_SUM`) is used instead, there are no observable penalties. There are other switches too, namely `-DUSE_INTRINSIC_REDUCE` which displays good performance, and `-DUSE_STRUCTURE_CONSTRUCTOR` which makes the performance even worse (300x slower than the baseline).
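
For triage, the slow path can be isolated in a standalone sketch assembled from the snippets above. This is not the attached benchmark: the module name `ddd_mod`, the program name `repro`, the array size `n`, and the timing loop are assumptions made for illustration, and it has not been confirmed that this reduced form shows the same slowdown. Saved under a hypothetical name such as `repro.f90`, it should build with `flang-new -O2 repro.f90`.

```fortran
! Standalone sketch of the nested-type operator(+) sum.
! Names ddd_mod, repro, and the value of n are illustrative assumptions.
module ddd_mod
  use, intrinsic :: iso_c_binding, only: c_double
  implicit none
  private
  public :: dp, dd, ddd, operator(+), ddd_sum

  integer, parameter :: dp = c_double

  ! Double wrapper
  type :: dd
    real(dp) :: val
  end type

  ! Double wrapper wrapper
  type :: ddd
    type(dd) :: val
  end type

  interface operator(+)
    module procedure ddd_add
  end interface

contains

  pure function ddd_add(a, b) result(c)
    type(ddd), intent(in) :: a, b
    type(ddd) :: c
    c%val%val = a%val%val + b%val%val
  end function

  ! Naive sum through the overloaded operator (the slow variant)
  pure function ddd_sum(a) result(res)
    type(ddd), intent(in) :: a(:)
    type(ddd) :: res
    integer :: i
    res = ddd(dd(0.0_dp))
    do i = 1, size(a)
      res = res + a(i)
    end do
  end function

end module

program repro
  use, intrinsic :: iso_fortran_env, only: int64
  use ddd_mod
  implicit none
  integer, parameter :: n = 2000            ! assumed array size
  integer, parameter :: iterations = 25000  ! as in the benchmark output
  type(ddd), allocatable :: a(:)
  type(ddd) :: s
  real(dp) :: acc
  integer :: i, k
  integer(int64) :: t0, t1, rate

  allocate(a(n))
  do i = 1, n
    a(i) = ddd(dd(real(i, dp)))
  end do

  acc = 0.0_dp
  call system_clock(count=t0, count_rate=rate)
  do k = 1, iterations
    s = ddd_sum(a)
    acc = acc + s%val%val   ! keep the result live so the loop is not removed
  end do
  call system_clock(count=t1)

  print '(a, f10.4)', 'elapsed (sec): ', real(t1 - t0, dp) / real(rate, dp)
  print '(a, es12.4)', 'checksum:      ', acc
end program
```

Replacing the loop body of `ddd_sum` with `res%val%val = sum(a%val%val)` gives the intrinsic-sum variant for comparison.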
</pre>