<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/134026>134026</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [Flang] `read` statement is slower than Gfortran in SPEC CPU 2017
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            flang
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          s-watanabe314
      </td>
    </tr>
</table>

<pre>
    When measuring 649.fotonik3d_s from SPEC CPU 2017 on a Grace (AArch64) machine, LLVM (Flang) was faster than GCC (Gfortran) at 1 to 4 threads but became slower at 8 and 16 threads. The options specified were `-Ofast -mcpu=neoverse-v2`.

||version|
|-|-|
|GCC|14.2.0|
|LLVM (Release build)|19.1.0|

||1 thread|2 threads|4 threads|8 threads|16 threads|
|-|-|-|-|-|-|
|GCC [s]|525.33|283.96|157.78|**99.93**|**76.15**|
|LLVM [s]|477.86|264.73|153.25|**101.72**|**81.36**|
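
The crossover is easier to see as a ratio. The following is a minimal sketch that uses only the timings from the table above (the variable and function names are illustrative) and computes the LLVM time divided by the GCC time for each thread count. A value above 1.0 means LLVM is slower.

```python
# Wall-clock times in seconds, copied from the table above.
threads = [1, 2, 4, 8, 16]
gcc = [525.33, 283.96, 157.78, 99.93, 76.15]
llvm = [477.86, 264.73, 153.25, 101.72, 81.36]


def llvm_to_gcc_ratios(llvm_times, gcc_times):
    """Return LLVM/GCC time ratios; a value above 1.0 means LLVM is slower."""
    return [l / g for l, g in zip(llvm_times, gcc_times)]


ratios = llvm_to_gcc_ratios(llvm, gcc)
for n, r in zip(threads, ratios):
    print(f"{n:2d} threads: LLVM/GCC = {r:.3f}")
```

The ratio rises steadily from about 0.91 at 1 thread to about 1.07 at 16 threads, which would be consistent with a fixed serial cost taking up a larger share of the run time as the parallel part shrinks.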

Our investigation suggests that one reason is the `read` statement, which is called in the non-parallelized initialization phase and is slower in Flang than in Gfortran. Because this phase does not speed up with more threads, its share of the total run time grows as the thread count increases.
The `read` statement is called approximately 54 million times within a nested loop, so its impact is significant.

We verified the difference in `read` processing time with the test program `test.f90` shown below. The measurement used LLVM 20.1.0 built with `-DCMAKE_BUILD_TYPE=Release`.
To minimize the impact of file systems like NFS, the input file `test.dat` was placed in the machine's local directory.
`test.dat` can be generated with the following command:

`python3 -c 'for i in range(54000000): print(i,i,i)' > test.dat`
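
For quicker experiments, the same file format can be produced by a small script with a configurable row count. This is only a sketch: the function name `write_test_data` is made up for illustration, and the output is intended to match what `print(i,i,i)` writes in the one-liner above.

```python
def write_test_data(path, rows):
    """Write `rows` lines of the form 'i i i', one record per line."""
    with open(path, "w") as f:
        for i in range(rows):
            f.write(f"{i} {i} {i}\n")


# Full-size input as used in this report (uncomment to generate it):
# write_test_data("test.dat", 54_000_000)
```

If a smaller row count is used, `num` in `test.f90` has to be reduced to match; otherwise the loop will try to read past the end of the file.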


- test.f90
```fortran
program main
  integer, parameter :: num = 54000000

  integer, dimension(num) :: a,b,c
  integer :: t1,t2, count, countmax
  integer :: ii

  open(unit=9, file='test.dat', status='old', IOSTAT=ios)
  call system_clock(t1, count, countmax)

  do ii=1,num
    read(9,*) a(ii),b(ii),c(ii)
  end do

  call system_clock(t2)
  print*, "init time : ",real(t2 - t1)/count,"sec"

  close(9)
end program main
```

- Compilation commands
```
$ flang-new -O3 -ffast-math -mcpu=neoverse-v2 test.f90
$ gfortran -O3 -ffast-math -mcpu=neoverse-v2 test.f90
```

- Measurement results

||version|time|
|-|-|-|
|GCC|14.2.0|22.496 [s]|
|LLVM (Release build)|20.1.0|42.341 [s]|


The `read` statement appears to be nearly twice as slow in Flang as in Gfortran.
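
To put the totals in per-call terms, the sketch below divides the measured times by the 54,000,000 loop iterations. It uses only the numbers reported above, the names are illustrative, and it assumes the timed interval is dominated by the `read` calls.

```python
reads = 54_000_000   # loop iterations in test.f90
gfortran_s = 22.496  # measured time in seconds, GCC 14.2.0
flang_s = 42.341     # measured time in seconds, LLVM 20.1.0


def per_read_ns(total_seconds, n_reads):
    """Average time per read statement, in nanoseconds."""
    return total_seconds / n_reads * 1e9


slowdown = flang_s / gfortran_s
print(f"slowdown: {slowdown:.2f}x")
print(f"gfortran: {per_read_ns(gfortran_s, reads):.0f} ns per read")
print(f"flang:    {per_read_ns(flang_s, reads):.0f} ns per read")
```

This works out to a slowdown of roughly 1.88x, or about 417 ns per `read` for Gfortran against about 784 ns for Flang.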
</pre>