<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/79257">79257</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [Flang] TSVC s314: needs fast-math flags in function attributes for vectorization
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            performance,
            flang:ir
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          yus3710-fj
      </td>
    </tr>
</table>

<pre>
    Flang fails to vectorize the loop in `s314` of [TSVC](https://www.netlib.org/benchmark/vectors), whereas Clang vectorizes the equivalent loop written in C.

```fortran
! Fortran version
      do 1 nl = 1,ntimes
      x = a(1)
      do 10 i = 2,n
        if(a(i) .gt. x) x = a(i)
   10 continue
      call dummy(ld,n,a,b,c,d,e,aa,bb,cc,x)
   1  continue
```

```c
// C version
for (int nl = 0; nl < ntimes; nl++) {
  x = a[0];
  for (int i = 1; i < n; i++) {
    if (a[i] > x) {
      x = a[i];
    }
 }
  dummy(a, b, c, d, e, aa, bb, cc, x);
}
```

```console
$ flang-new -v -Ofast s314.f -S -Rpass=vector
flang-new version 18.0.0 (https://github.com/llvm/llvm-project.git 2759e47067ea286f6302adcfe93b653cfaf6f2eb)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /path/to/install/bin
Found candidate GCC installation: /path/to/lib/gcc/aarch64-unknown-linux-gnu/11.2.0
Selected GCC installation: /path/to/lib/gcc/aarch64-unknown-linux-gnu/11.2.0
Candidate multilib: .;@m64
Selected multilib: .;@m64
 "/path/to/install/bin/flang-new" -fc1 -triple aarch64-unknown-linux-gnu -S -fcolor-diagnostics -mrelocation-model pic -pic-level 2 -pic-is-pie -ffast-math -target-cpu generic -target-feature +neon -target-feature +v8a -fstack-arrays -fversion-loops-for-stride -Rpass=vector -O3 -o s314.s -x f95-cpp-input s314.f
$ clang -Ofast s314.c -S -Rpass=vector
/path/to/s314.c:17:3: remark: vectorized loop (vectorization width: 4, interleaved count: 2) [-Rpass=loop-vectorize]
   17 | for (int i = 0; i < LEN; i++) {
      | ^
```

I assumed fast-math flags were needed only on `fcmp` (#74263), but that turned out to be insufficient (or wrong).

The following function should return `true` to recognize the loop as the max reduction.

https://github.com/llvm/llvm-project/blob/c41472dbafd0dcacd943a95a9a099c1942d50394/llvm/lib/Analysis/IVDescriptors.cpp#L805-L814

A max/min reduction is expected to have the form `select(fcmp(...))`.
This function returns `true` for the `FCmpInst`, but not for the `SelectInst`, which can't have fast-math flags.
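
The behaviour described above can be sketched as a toy model in C. This is not LLVM's code: the types `fmf_t` and `inst_t` and the functions `has_required_fmf` and `is_fp_minmax_candidate` are hypothetical names made up for illustration, and the rule is only an assumption inferred from this report (an instruction qualifies when either it or its enclosing function guarantees no NaNs and no signed zeros).

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; these are not LLVM types. */
typedef struct {
    bool no_nans;
    bool no_signed_zeros;
} fmf_t;

typedef enum { INST_FCMP, INST_SELECT } inst_kind_t;

typedef struct {
    inst_kind_t kind;
    fmf_t flags; /* per-instruction flags; all false if it carries none */
} inst_t;

/* Assumed rule: the function-level flags or the instruction's own flags
 * must guarantee both "no NaNs" and "no signed zeros". */
bool has_required_fmf(fmf_t func, inst_t inst)
{
    if (func.no_nans && func.no_signed_zeros)
        return true;
    return inst.flags.no_nans && inst.flags.no_signed_zeros;
}

/* Both halves of select(fcmp(...)) have to pass the check. */
bool is_fp_minmax_candidate(fmf_t func, inst_t cmp, inst_t sel)
{
    return cmp.kind == INST_FCMP && sel.kind == INST_SELECT &&
           has_required_fmf(func, cmp) && has_required_fmf(func, sel);
}
```

In this model, an `fcmp` with both flags paired with a flagless `select` is rejected when the function-level flags are unset, and accepted once they are set.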

Clang emits the fast-math flags as function attributes, and that makes the function return `true` for `SelectInst` as well.

```llvm
define dso_local noundef i32 @s314() local_unnamed_addr #0 {
  :
}
:
attributes #0 = { nounwind uwtable "approx-func-fp-math"="true" "denormal-fp-math"="preserve-sign,preserve-sign" "min-legal-vector-width"="0" "no-infs-fp-math"="true" "no-nans-fp-math"="true" "no-signed-zeros-fp-math"="true" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" "unsafe-fp-math"="true" }
```
`"no-nans-fp-math"="true"` and `"no-signed-zeros-fp-math"="true"` are necessary for vectorization.
</pre>