<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/74263">74263</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [Flang] TSVC s314: `fcmp` doesn't have fast-math flags
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            performance,
            flang:ir
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          yus3710-fj
      </td>
    </tr>
</table>

<pre>
    Flang fails to vectorize the inner loop of `s314` from [TSVC](https://www.netlib.org/benchmark/vectors), whereas Clang vectorizes the equivalent loop written in C.

```fortran
! Fortran version
      do 1 nl = 1,ntimes
      x = a(1)
      do 10 i = 2,n
        if(a(i) .gt. x) x = a(i)
   10 continue
      call dummy(ld,n,a,b,c,d,e,aa,bb,cc,x)
   1  continue
```

```c
// C version
for (int nl = 0; nl < ntimes; nl++) {
  x = a[0];
  for (int i = 1; i < n; i++) {
    if (a[i] > x) {
      x = a[i];
    }
  }
  dummy(a, b, c, d, e, aa, bb, cc, x);
}
```

```console
$ flang-new -v -Ofast s314.f -S -Rpass=vector
flang-new version 18.0.0 (https://github.com/llvm/llvm-project.git 1c1227846425883a3d39ff56700660236a97152c)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /path/to/install/bin
Found candidate GCC installation: /path/to/lib/gcc/aarch64-unknown-linux-gnu/11.2.0
Selected GCC installation: /path/to/lib/gcc/aarch64-unknown-linux-gnu/11.2.0
Candidate multilib: .;@m64
Selected multilib: .;@m64
 "/path/to/install/bin/flang-new" -fc1 -triple aarch64-unknown-linux-gnu -S -fcolor-diagnostics -mrelocation-model pic -pic-level 2 -pic-is-pie -ffast-math -target-cpu generic -target-feature +neon -target-feature +v8a -fstack-arrays -fversion-loops-for-stride -Rpass=vector -O3 -o s314.s -x f95-cpp-input s314.f
$ clang -Ofast s314.c -Rpass=vector
/path/to/s314.c:17:3: remark: vectorized loop (vectorization width: 4, interleaved count: 2) [-Rpass=loop-vectorize]
   17 | for (int i = 0; i < LEN; i++) {
      | ^
```

For the loop to be recognized as a max reduction, the `fcmp` needs fast-math flags, but Flang does not currently attach them to `fcmp`, as the IR it generates shows:

```llvm
.lr.ph: ; preds = %.lr.ph.preheader, %26
  %indvars.iv = phi i64 [ 2, %.lr.ph.preheader ], [ %indvars.iv.next, %26 ]
  %22 = phi float [ %18, %.lr.ph.preheader ], [ %27, %26 ]
  %gep = getelementptr float, ptr getelementptr ([1000 x float], ptr @_QMmodEa, i64 -1, i64 999), i64 %indvars.iv, !dbg !21
  %23 = load float, ptr %gep, align 4, !dbg !21, !tbaa !15
  %24 = fcmp ogt float %23, %22, !dbg !21
  br i1 %24, label %25, label %26, !dbg !21

25: ; preds = %.lr.ph
  store float %23, ptr %12, align 4, !dbg !23, !tbaa !15
  br label %26, !dbg !21

26: ; preds = %25, %.lr.ph
  %27 = phi float [ %23, %25 ], [ %22, %.lr.ph ]
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1, !dbg !24
  %exitcond.not = icmp eq i64 %indvars.iv, %21, !dbg !21
  br i1 %exitcond.not, label %._crit_edge, label %.lr.ph, !dbg !21
```
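
As background on why the flags matter, here is a minimal C sketch. The helper names `max_step`, `max_forward`, and `max_backward` are made up for illustration and are not part of TSVC. The `a(i) .gt. x` update is order-sensitive when a NaN is present, so the reduction cannot be safely reassociated unless NaNs may be ignored, which is presumably what fast-math flags on the `fcmp` would tell the vectorizer.

```c
#include <math.h>
#include <stddef.h>

/* Same update as the s314 loop body: keep x unless v compares greater.
   Any comparison involving NaN is false, so the result depends on
   which operand is the NaN. */
static float max_step(float x, float v) {
  return (v > x) ? v : x;
}

/* Reduce in source order, as the scalar loop does. */
static float max_forward(const float *a, size_t n) {
  float x = a[0];
  for (size_t i = 1; i < n; i++)
    x = max_step(x, a[i]);
  return x;
}

/* Reduce in reverse order, standing in for any reassociation
   (hypothetical; a real vectorizer splits the work across lanes). */
static float max_backward(const float *a, size_t n) {
  float x = a[n - 1];
  for (size_t i = n - 1; i-- > 0;)
    x = max_step(x, a[i]);
  return x;
}
```

For the input `{NAN, 1.0f, 2.0f}`, `max_forward` returns NaN while `max_backward` returns `2.0f`, provided the sketch itself is compiled without `-ffast-math`.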
</pre>