<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/91847>91847</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            flang-new array expression performance
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            flang
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          jeffhammond
      </td>
    </tr>
</table>

<pre>
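Array expression performance is roughly 4.5x to 5.5x worse than sequential loops for the Mul, Add, and Triad kernels. Copy (simple assignment) performs the same in both versions, and Dot is within about 10%.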

I'm running on an AMD Zen4.
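
For readers who have not opened the sources, the two implementations differ only in how each kernel is written: whole-array syntax versus an explicit `do` loop. The following is a minimal sketch of that contrast for a Triad-style kernel. It is not copied from BabelStream; the program name, variable names, initial values, and single-shot timing are all illustrative assumptions. The array length of 2**25 REAL64 elements is chosen to match the reported 268.4MB array size.

```fortran
! Illustrative sketch only: names, values, and timing method are assumptions,
! not code taken from the BabelStream sources.
program triad_forms
    use, intrinsic :: iso_fortran_env, only: real64, int64
    implicit none
    ! 2**25 elements * 8 bytes = 268435456 bytes per array
    integer(int64), parameter :: n = 2_int64**25
    real(real64), parameter :: scalar = 0.4_real64
    real(real64), allocatable :: a(:), b(:), c(:)
    real(real64) :: sum_array, sum_loop
    integer(int64) :: i, t0, t1, t2, t3, rate

    allocate(a(n), b(n), c(n))
    ! Touch every page before timing so neither form pays for first-touch faults.
    a = 0.0_real64
    b = 0.2_real64
    c = 0.1_real64

    ! Array-expression form (whole-array syntax).
    call system_clock(t0, rate)
    a = b + scalar * c
    call system_clock(t1)
    sum_array = sum(a)   ! use the result so the assignment cannot be discarded

    ! Sequential form (explicit loop over the same data).
    call system_clock(t2)
    do i = 1, n
        a(i) = b(i) + scalar * c(i)
    end do
    call system_clock(t3)
    sum_loop = sum(a)

    print '(a,f10.5)', 'array expression (sec): ', real(t1 - t0, real64) / real(rate, real64)
    print '(a,f10.5)', 'explicit loop    (sec): ', real(t3 - t2, real64) / real(rate, real64)
    print *, 'checksums: ', sum_array, sum_loop
end program triad_forms
```

Both forms compute the same values, so the two checksums should agree. A single timed pass like this is much cruder than the benchmark's repeated runs, so it is only meant to show the shape of the code being compared, not to reproduce the numbers below.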

# Build
```
jehammond@oppenheimer:~/BabelStream/src/fortran$ make COMPILER=flang IMPLEMENTATION=Array
flang-new -O3 -DVERSION_STRING="5.0" -DUSE_ARRAY  -c BabelStreamTypes.F90
flang-new -O3 -DVERSION_STRING="5.0" -DUSE_ARRAY  -c ArrayStream.F90
flang-new -O3 -DVERSION_STRING="5.0" -DUSE_ARRAY  main.F90 ArrayStream.o BabelStreamTypes.o -o BabelStream.flang.Array
jehammond@oppenheimer:~/BabelStream/src/fortran$ make COMPILER=flang IMPLEMENTATION=Sequential
flang-new -O3 -DVERSION_STRING="5.0" -DUSE_SEQUENTIAL  -c BabelStreamTypes.F90
flang-new -O3 -DVERSION_STRING="5.0" -DUSE_SEQUENTIAL  -c SequentialStream.F90
flang-new -O3 -DVERSION_STRING="5.0" -DUSE_SEQUENTIAL  main.F90 SequentialStream.o BabelStreamTypes.o -o BabelStream.flang.Sequential
```

# Run
```
$ ./BabelStream.flang.Array -n 10 ; ./BabelStream.flang.Sequential -n 10
BabelStream Fortran
Version: 5.0
Implementation: Array
Running kernels 10 times
Precision: REAL64
Array size:     268.4MB
Total size:     805.3MB
Init:       0.2s (=   3837.6 MBytes/sec)
Read:       0.2s (=   3752.9 MBytes/sec)
Function    MBytes/sec  Min (sec)   Max         Average
Copy        32182.843   0.01668     0.01720     0.01706
Mul         5511.874    0.09740     0.10650     0.10047
Add         7305.358    0.11024     0.11597     0.11346
Triad       7277.179    0.11066     0.12223     0.11406
Dot         23582.615   0.02277     0.02403     0.02346
BabelStream Fortran
Version: 5.0
Implementation: Sequential
Running kernels 10 times
Precision: REAL64
Array size:     268.4MB
Total size:     805.3MB
Init:       0.2s (=   3903.8 MBytes/sec)
Read:       0.2s (=   3583.2 MBytes/sec)
Function    MBytes/sec  Min (sec)   Max         Average
Copy        31461.320   0.01706     0.01799     0.01772
Mul         30492.747   0.01761     0.01913     0.01822
Add         34026.178   0.02367     0.02494     0.02437
Triad       32883.429   0.02449     0.02493     0.02474
Dot         25825.380   0.02079     0.02106     0.02091
```

# Sources
https://github.com/UoB-HPC/BabelStream/blob/main/src/fortran/main.F90
https://github.com/UoB-HPC/BabelStream/blob/main/src/fortran/BabelStreamTypes.F90
https://github.com/UoB-HPC/BabelStream/blob/main/src/fortran/ArrayStream.F90
https://github.com/UoB-HPC/BabelStream/blob/main/src/fortran/SequentialStream.F90
</pre>