<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/80494>80494</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Possible Wrong Code Generation With AVX512 + LTO + O3
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          Disservin
      </td>
    </tr>
</table>

<pre>
    Hi,

I'm sorry in advance that I can't provide a smaller reproduction at the moment. It is hard for us to diagnose why this behaves the way it does.

Below I explain the issue that we ran into here: https://github.com/official-stockfish/Stockfish/issues/4450.  
We later merged a temporary workaround [here](https://github.com/official-stockfish/Stockfish/pull/4830) but ultimately we believe that we've come across a compiler bug.

Prerequisites:

- AVX512 CPU or [Intel SDE](https://www.intel.com/content/www/us/en/developer/articles/tool/software-development-emulator.html)

- Clone `https://github.com/Disservin/Stockfish.git` and check out the `minimal-repo` branch

Reproduction: 
1. `cd src && make -j build ARCH=x86-64-avx512 COMP=clang CXX=clang++-18 EXTRACXXFLAGS="-g3 -fno-omit-frame-pointer"` (or clang 17)

2. Run 
`./stockfish`
or when using SDE
`sde -spr -- ./stockfish` 

```
➜  src git:(minimal-repo) ✗ sde -spr -- ./stockfish 
info string Found 1 tablebases
[1]    708806 segmentation fault (core dumped)  sde -spr -- ./stockfish
```

--- 

We are currently under the impression that this might be a compiler bug in clang.

What we have tested so far:

- does not crash with `-O1`
- does not crash with `debug=yes` or `optimize=no`
- does not crash if LTO (link-time optimization) is disabled.
- does not crash when compiled with gcc 12.2.0 (LTO enabled).
- does not reproduce under most sanitizers (excluding `-fsanitize=nullability-assign`)

Only the architectures below are problematic; others do not crash.
    x86-64-vnni512
    x86-64-avx512

What we have diagnosed so far:

```diff
diff --git a/src/syzygy/tbprobe.cpp b/src/syzygy/tbprobe.cpp
index ad15e751..95aefbfe 100644
--- a/src/syzygy/tbprobe.cpp
+++ b/src/syzygy/tbprobe.cpp
@@ -730,8 +730,8 @@ Ret do_probe_table(const Position& pos, T* entry, WDLScore wdl, ProbeState* resu
 
 
     // THIS EXITS??!!
-    // if (squares[0] < 0)
-    //     _exit(20);
+    if (squares[0] < 0)
+        _exit(20);
 
     d = entry->get(stm, tbFile);
```

- __The exit here is triggered. `squares[0]` seems to hold a garbage value at this point, but we are not sure why that would be the case. The do-while loop should have executed and set it to some non-negative, non-garbage value.__

- Running sde with the alignment checker reported the following, though I'm not sure whether the two things are related.
```
 TID: 0 executed instruction with an unaligned memory reference to address 0x7ffe90e6adad INSTR: 0x7f6ade5923dc: IFORM: VMOVDQU64_YMMu64_MASKmskw_MEMu64_AVX512 :: vmovdqu64 ymm17, ymmword ptr [rdi]
    IMAGE:    /lib/x86_64-linux-gnu/libc.so.6
    FUNCTION: __strrchr_evex
    FUNCTION ADDR: 0x7f6ade5923c0
# $eof
```
- Running `sde -spr -null_check 1 -ptr-check 1 -- ./stockfish` returned:
```
➜  src git:(minimal-repo) ✗ sde -spr -null_check 1 -ptr-check 1 -- ./stockfish 
info string Found 1 tablebases
SDE ERROR: DEREFERENCING BAD MEMORY POINTER PC=0x559415e9f021 MEMEA=0x55931c9ec8f4 mov eax, dword ptr [r8+rax*4]
Image: /home/max/Documents/Github/Stockfish/src/stockfish+0x6021 (in multi-region image, region# 1)
Function: main
```

- The crash happens when using `squares[0]` as an index
```
Program received signal SIGSEGV, Segmentation fault.
0x000055b3c3aca021 in Stockfish::(anonymous namespace)::do_probe_table<Stockfish::(anonymous namespace)::TBTable<(Stockfish::(anonymous namespace)::TBType)0>, Stockfish::Tablebases::WDLScore> (pos=..., 
    wdl=Stockfish::Tablebases::WDLDraw, entry=<optimized out>, result=<optimized out>) at syzygy/tbprobe.cpp:825
825                 idx = (MapA1D1D4[squares[0]] * 63 + (squares[1] - adjust1)) * 62 + squares[2] - adjust2;
```

- `entry->hasPawns` should be `false`, and it is indeed `false` when inspected in the debugger (gdb).
- Making random changes to the body of the `if (entry->hasPawns)` statement (e.g. commenting something out) also fixes it, even though this branch is not taken.
- Initializing `Square squares[TBPIECES]` to some value, e.g. `= {};`, seems to fix it. However, `squares` shouldn't need this initialization since it is never accessed unsafely. This also looks like a random change, similar to the one mentioned above.
- The `max_element` function mentioned in the linked Stockfish issue is probably unrelated because that branch isn't taken, but perhaps the branch is causing weird optimizations.
- The repo mentioned above is a reduced version of the official one. Our master branch currently has a workaround that disables optimizations for that function: https://github.com/official-stockfish/Stockfish/blob/master/src/syzygy/tbprobe.cpp#L711
- We have a somewhat smaller reproduction on godbolt. Unfortunately it still includes a null pointer dereference, which our code does not have, though maybe it is of help: https://godbolt.org/z/MqxcW671j
- With the trimmed-down repo, I was able to reproduce it with clang 16 too after removing `-fno-omit-frame-pointer`. The original issue also mentioned clang 15.
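
For reference, the two changes that hide the crash (value-initializing the array, and disabling optimizations for the function) can be sketched as follows. This is not the Stockfish source. `Square`, `TBPIECES`, the function names and the `NO_OPT_SKETCH` macro are simplified stand-ins, and the use of clang's `optnone` attribute is an assumption about how such a per-function workaround might be written.

```cpp
#include <cassert>

// Hypothetical stand-ins; the real Stockfish definitions differ.
using Square = int;
constexpr int TBPIECES = 7;

// Assumption: optimizations are disabled per function via clang's `optnone`
// attribute. On other compilers the macro expands to nothing.
#if defined(__clang__)
    #define NO_OPT_SKETCH __attribute__((optnone))
#else
    #define NO_OPT_SKETCH
#endif

// Variant A: value-initialize the array so every element starts at 0.
// Requires 1 <= count <= TBPIECES.
int first_square_value_init(const int* pieces, int count) {
    Square squares[TBPIECES] = {};
    int size = 0;
    do {
        squares[size] = pieces[size];
    } while (++size < count);
    assert(squares[0] >= 0);  // same condition as the `_exit(20)` probe
    return squares[0];
}

// Variant B: leave the array uninitialized, but keep the optimizer away
// from this one function. Requires 1 <= count <= TBPIECES.
NO_OPT_SKETCH int first_square_no_opt(const int* pieces, int count) {
    Square squares[TBPIECES];
    int size = 0;
    do {
        squares[size] = pieces[size];
    } while (++size < count);
    assert(squares[0] >= 0);
    return squares[0];
}
```

In both variants the do-while loop writes `squares[0]` before it is read, so neither change should be needed for correctness. That is why both look like they merely perturb the optimizer.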

I will continue trying to come up with a smaller reproduction in case this is too vague for you. 
Any help is much appreciated :D

<details>
<summary>
  You can also reproduce this on the master repository if you want, by doing the following:</summary>

- Get syzygy tablebases
  ```
  wget -r -nH --cut-dirs=2 --no-parent --reject="index.html*" -e robots=off https://tablebase.lichess.ovh/tables/standard/3-4-5/
  ``` 
- clone https://github.com/official-stockfish/Stockfish

Reproduction: 
1. Remove current workaround `CLANG_AVX512_BUG_FIX` from `do_probe_table` inside `src/syzygy/tbprobe.cpp` L711.
2. `cd src && make -j build ARCH=x86-64-avx512 COMP=clang CXX=clang++-17 EXTRACXXFLAGS="-g3"` (or clang 15-18)

3. Create a text file named `input` with this content, replacing `PATH` with the syzygy tablebases directory that you downloaded earlier
```
setoption name SyzygyPath value PATH
position fen 8/8/3K4/1r6/8/8/4k3/2R5 b - - 0 18
go
ucinewgame
```
4. Run 
`./stockfish < input`
or when using SDE
`sde -spr -- ./stockfish < input` 


```
➜  src git:(master) ✗ ./stockfish < input 
Stockfish dev-20240126-fcbb02ff by the Stockfish developers (see AUTHORS file)
info string Found 145 tablebases
info string NNUE evaluation using nn-baff1ede1f90.nnue
info string NNUE evaluation using nn-baff1edbea57.nnue
[1]    291189 segmentation fault (core dumped) ./stockfish < input
```

</details>
</pre>