<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=http://email.email.llvm.org/c/eJzFWtty28gR_RrqZYosALw_6EGSFa-qvF5HdGInL6wBMCCxAjDMYCCS_vqc7gFAgJJ3HYtJWJREEDN979PdA4U6Pl5_UWIQLD6khTR6ECzFVj4rUSpVCLtNS2FVaUUi00zoKJJlqguZZUehC6ErI0JtsWQrrZBGCVMVRVps6KYU5RZfxSKX0TYt1GjgvRt4N-73gxWgHFXGqMKCWJyWMsywONFG3NyYaDubkFBba3flYHwzCP6C9ya12yocRTrHRZY9N3-GO6N_V5HFJe7lKX1QyST2fX8ZRd54Fs5lssBff7EYz5VaLOQC3_gL6QeksCziyzCdBdF4nshARv4yVpFM5v5kOYkXyvPU2PfVNFbTYLmYEtO4UsJqNnHfNINgnmWitNJYsQdzsSfrPtDK4gm_lQBnGCuHCQfBnYiVhXNgO6Nkqdn8mbTKjMQHLWO6pD2RNirBMhDaaEvaOZ4zr37z5SejN0bm8LnJEQ8WVFmEMt3A62L18H51__7vxHWlNjl8Jy3iAdFRZbbRIhh7QngHr31N_OV8MhFpIdbrg5FHYj6-ua2SRJm_VqpS7gujMiig3Pdwwx8ublbNyJRYjD81--ntnQsrqA2LxBRoPi357C69Q4LX0p8mCWTjyP_ySUyDxWw2Jip4T9-9ahz3-3PPAS5FIh0r0o_zIKmKiKzyPRsLeCx6VMmdrgoEzQKqjO4PFhKXtzJ6gsNWFs4iIz_V36_Sb3xNS-tty8H4tqEnrc7TaF26bYvb4WB8TwZqyNLWek2GkOjzpJu5yrU5rrWJlVnL6F9VSoRgi7uGRe_VW167rSvR63bTIj-KqsAexHbBgdkxBGxYZTG-AbjoCJHHUVvLKEJnGMEqilDR56pEdIZHNor4TG7IlSxqMNpvAV9HXQlrjpRm4KoQMKqIlOgoL1IrdjqlT1hUFbnc7QiyWMFeWn7ZHp2vY63KAjlqgZNYDKBDtpKsNs3ZSQ8ikm6BOuwymRY9Op-Uwb6yzkmETbllxSk2vx8xZx7muH_Nz8uXrq7XVuer_hc-X-naG4qrQ7mHxRpAepYZEFAnPT-zLyiv70ShLd_iaEA6scmObgWcNRK_6L16Vgb2jrWz9ym4eGfIGAHmcJAqdLXZkpNLeXSISkuIG8htERniDIfvlAGuFtnx5CpZUsZnWj_hd_qkOMBIMcSxUQSHjXIcbBRrEcc2VbVS54ogfMO79pK1EDKKVFlSvaRthTpYUEaxFF-2abSlKpyDaQH0Nqg10DCj5bYWmIk6cQFJO0h8FFQ_cpGnZUm8eOGeaXWyA-LVku1TGKdgO4ZKfFMGIlkX6GzpnkmGw2H38lfNqlP1gYZI233fgo8KQsVVxEaB0dASQJKMHGNdGcI3VJcEF6bE6JzyEcIUtqIeg9IaUWOVK8d1ebkRK66NtLcgF0H2phmp0ixGR9L0HKctt3SHiupGoRog6qiEc-FHyyIQtqibv4PiVkVPQ647bSRj93sjQ5fj1Avt6iJJuyO9O5LQ5Ig9G5sThZqZ0_YvJgWeoSGKTLqzLnSImjMazAPtqBAh5JVzkD_zXHFBTFR5SBombKo2VBoNkSmldnCXQQ5ow6AGXIwVl2WKoZIJ7rV5Qn6cSuVZygaofz66mDAt6Lcst11vlsqKoep-Q1HNALMQgAH34woo3451CySdldFg_M6jpfhwh88QzF0Ft3j3CGBjhwZekPDr480_1r99-vzw28cVdg-CAKEb6jK1RyKGazGC8Els1mmR2jU5bKQFgWCH0J8tAaCo9mIv0ds1F2niYmCYYEMQMRwHpAB8UvSAU0VbIhqsEBluT2RgUgVM9tE9qVPkj4hEb-sBHvRP1kvbjw1Rtt1Pv-8v9m4EPxnsRVDdiA8puuMaReuE5XI7qSMaYcGAk2Nd5IIeoUoGQ41B0igMJqnmIJIFwDBOn9MYAEGzxymYweixcrlR5x
lvILwG_kjOCiufFLNmVxBjB7CckYP5vR8IVGNIFytU9rieZIikJsh1DOclUNtCBgJb15WfpKhbRKQwwVuueghHgPEAsaq6QLX5j8WuchhXpXCPJqKyVHlIMPggKFJTnrvsltZwTd0T3lA5IghhzjSGwURlidLKjTuRcv1AybArMchF1pVSqlxNlXVbCMBGZBZXROgmyKGNQWVxlgCDsgrd3HOmc6ckQVxUsZNiNA6S7kz1pDXboO6WSzcMqdo1L4vPi8Ci4Uw86zR-2U_3Z4U7RJ3R2S06iScod3PHoImWem1F01nXl01zjWC47fFB3oxGIwr57reMB2BZd1yJstF2Dfu4tuuO-rOTYHdI6Vf6LmqmmCPQbPyO1rTwl8XyYMDksKCNmGwOgYfhRFz21XCD3KVoudU8Mcwd_MuxsBkrtF92WFxKrZ5nHlxf6eaFPbKfYrmsqNtKKrQ8T0rtaDagMEypr7JoYlI6leiR6R8FxGhNMmqzRtLk9YlArKOqHYTpukq9hUdldIufm9lk-E5aOfxs0HwhLIcPRWlN5UIeC1afP3x9bMwThcU30ZrHO2BwnimPYm_9z48zNw_7fie-_XFvaL5__Ljy1rP6AuV0PA3OQ7Y2zGkacn1gShba46dwTSCVs05fjbmoDtweMaCnyGSoMlpq17yuUSYcFYqUYTXm48kb1JiMZ1CD6j9zDUey6MvxG6HzPi1Vp7OtNSJt3ClJO1h-b-Dumck1lFyBmqOXPaFyXFuDezC2CPdffMtZi4qM66BDOD3ajtohLjY7ssiBGi327zimI4iLxD2hhREd8pxX4zqJx8vF5I0p9oKN1-Hy46TPg3FHHTmhr3MYFf2Im6Q7sUHHmTJw6ta_G011gjatGbLd6tF5BjnhXOh50VsyaOlCr2uBMHPXIO9Fsk1Q31-vS0l1GjHn---V_QQxqcDcP7dZSBj4Jvv9uW3PjdMZndjoq3YQpXMR7foEQ00Jn6q40GaP6ChCM0R171QinOyuRHj_hRJh0fcxC8-ViNfX1ap8ZFDnCbeZWul4hjsdSkndy0hVJyQR5mFOGg4xrFdHXcQ90nzOymcNbqwDOR6nVPxGBfu4OHtLcE6WwXdwEWlauiitXRX4ta9mPyn-K5oUauNYuGr-ZiA7Iy9jZ-puP3IRJj0__6okN6oHz80hLao354F1b9aeG4dzCdBuhMz1sxPShWtwISP8kQ044SbnfnxdKb9RivP5XKWJt5w1C1ocpps_LW5Ri-t3XPZD5Hri_62usO6hkvxB5l1YDqX3OixP-XyxkNk6Z1Pcf3rOW4D_QwadDucMUB-K9gmLiKt8R1gBsaOtO32TCTjyUIUptKxytnCEboThllWth0w6-RMy1M_qRbFtW2V_edFiwRGc0vWzNwriNr4uEMfnKbLnFNl_M28n3SXfmmjZ1tPgZ_uRX2qkf-WhgWzPprsbDosX2LDoD6j9W2J4hiP1dtwp3bNAKkdpcf4ookfyIg_p3FT7_3pW96M-bcN-cam5t6c_PzWK87A5azWqpBNTncDsZtMcUdCDlPoQxc1Fbqqsv6FMLvtmpeeurhGRTN0hAIGj28m3dEEPshJct-0VraVXWm4vomtH4U7f2dr0lQedV_H1OF6Ol_JKVnarzfU7-ZzGq10aPSlrryqTXf_HT8f5aIlm3el4MZ1dba-X8yAKlPSnE288j4NYxrP5chmEiZxF3nTsX_FIWV5DxkEQcGBxSl-9nXl6HXgB3v7UnwZzbzryvdlYqkUwmeC2mvmDiadymWYjojPSZnNlrplkWG1K3MzS0panmwB0JJ1SLCsktKnN1PXXR3kUSWyGRtG53Sja7U7_RVG--DeKG0OCN_-AQP9QccViX7PM_wYu6owD>53856</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            XRay fdr-reinit.cpp test fails occasionally on Arm/AArch64 bots
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            xray
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          DavidSpickett
      </td>
    </tr>
</table>

<pre>
    We (Linaro) have seen this test fail occasionally on our bots that are running on a shared machine.

It is currently disabled for both Arm and AArch64 (https://github.com/llvm/llvm-project/commit/ef4d1119cc036b7af803618837ee88a87af18a12, https://github.com/llvm/llvm-project/commit/62c37fa2ac19decaf71494d8e00e311e5de52985) due to this.

I'll start with what I think the problem is; the detailed reasoning comes later. Loading the core file, I got:
```
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000419744 in __xray::BufferQueue::releaseBuffer(__xray::BufferQueue::Buffer&) ()
[Current thread is 1 (Thread 0xffff915ff000 (LWP 528663))]
```

The problem is this code in that function:
```
  decRefCount(Buf.ExtentsBackingStore, kExtentsSize, Buf.Count);
  atomic_store(B->Buff.Extents, atomic_load(Buf.Extents, memory_order_acquire),
               memory_order_release);
```

To my understanding, decRefCount could deallocate the Extents backing store being used by Buf. If it does, Buf.Extents points to unmapped memory by the time the next line dereferences it.

Why this doesn't happen all the time, I can't explain.

Perhaps the code should read:
```
  atomic_store(&(B->Buff.Extents), atomic_load(&(Buf.Extents), memory_order_acquire),
               memory_order_release);
```

That way we would be copying the value of the Extents pointer itself, not the contents of the location it points to. However, I don't understand the buffer queue well enough to say what the intent is here.

Certainly the code as written looks like it decrements the reference count for something it then accesses on the next line. That seems incorrect, unless there is some property I'm missing which means the ref count can never reach zero at this point.
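
To illustrate the ordering concern, here is a minimal, self-contained sketch. It is not the XRay code: `ControlBlock`, `decRef` and `readThenRelease` are hypothetical names, and the block is heap-allocated rather than mmap'd. It only shows the general rule that a value living in ref-counted storage has to be copied out before dropping the reference that keeps the storage alive.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for a ref-counted backing store (not the XRay type).
struct ControlBlock {
  std::atomic<std::uint64_t> RefCount;
  std::atomic<std::uint64_t> Extents; // value that lives inside the block

  ControlBlock(std::uint64_t Refs, std::uint64_t E)
      : RefCount(Refs), Extents(E) {}
};

// Drops one reference. Frees the block if that was the last reference.
// Returns true if the block was freed.
inline bool decRef(ControlBlock *C) {
  if (C->RefCount.fetch_sub(1, std::memory_order_acq_rel) == 1) {
    delete C;
    return true;
  }
  return false;
}

// Safe ordering: copy the value out while we still hold a reference,
// and only then drop the reference. Swapping these two statements would
// read from freed memory whenever decRef() frees the block.
inline std::uint64_t readThenRelease(ControlBlock *C, bool *Freed) {
  std::uint64_t Value = C->Extents.load(std::memory_order_acquire);
  *Freed = decRef(C);
  return Value;
}
```

Whether releaseBuffer can be reordered like this depends on what B->Buff.Extents is meant to hold afterwards, which I can't judge from the outside.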

---

More detail follows.

Reproducing took a while but I got a core file from it eventually. The steps:
* Start a container on our buildbot machine
* Build stage 1 of llvm and run `ninja check-xray`
* Grab the test program and copy it somewhere memorable
* Write a script like the following: (note that 160 is the number of cores on the machine, so when lit runs it by default sees 160 workers)
```
#!/bin/bash

set -e

for (( ; ; ))
do
   for (( c=0; c<=160; c++ ))
   do
      #XRAY_OPTIONS="verbosity=1" ./fdr_init_test.o &
      ./fdr_init_test.o &
   done
   wait
   if test -f "core"; then
       echo "Some test crashed! See core file."
       exit 1
   fi
   echo "<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>"
done
```
* Limit the containers to 4 cores (this mimics the worst case scenario for an individual bot)
* Run the script for as long as it takes to crash (this seems to take around 12 hours, depending on the other bots' activity I think)

That got me a core file, and I found the following by going through the disassembly. I initially thought we were looking at an issue with the atomics, but as far as I can tell they are not the issue here. (I'm not an expert on that subject.)

There is some inlining going on and I'm following the functions to the crash point.

```
// void decRefCount(BufferQueue::ControlBlock *C, size_t Size, size_t Count) {
// <...>
//   if (atomic_fetch_sub(&C->RefCount, 1, memory_order_acq_rel) == 1)
  ldaxr   x8, [x20]                                                                               
  subs    x8, x8, #0x1                                                                 
  stlxr   w9, x8, [x20]                                                                
// If the store was not successful, keep trying until it is
// https://developer.arm.com/documentation/dui0801/h/A64-Data-Transfer-Instructions/STLXR
  cbnz    w9, 0x4196e0 <_ZN6__xray11BufferQueue13releaseBufferERNS0_6BufferE+352>
// If the reference count is now non-zero then don't unmap memory
// see label dont_unmap
  b.ne    0x419734 <_ZN6__xray11BufferQueue13releaseBufferERNS0_6BufferE+436>  // b.any
// Otherwise ref count is now zero, deallocate Buf.ExtentsBackingStore
// In our case I think we did unmap the memory so we didn't take this branch.
  adrp    x23, 0x43d000                                                                
  ldr     x23, [x23, #3984]                                                            
  ldr     x0, [x23]                                        
// If page size is not cached, get it, otherwise go to page_size_cached.
  cbnz    x0, 0x41970c <_ZN6__xray11BufferQueue13releaseBufferERNS0_6BufferE+396>      
  bl      0x40cae0 <_ZN11__sanitizer11GetPageSizeEv>
  str     x0, [x23]                                                                    
page_size_cached:
// Something to do with rounding up the size occurs...
  sub     x8, x0, #0x1                                                                 
  tst     x0, x8                
// tst x0, x8 with x8 = x0 - 1 checks whether the page size is a power of two.
// We don't take the branch; its target is beyond the point where we faulted.
  b.ne    0x41976c <_ZN6__xray11BufferQueue13releaseBufferERNS0_6BufferE+492>  // b.any
  lsl     x8, x21, #6                                                                  
  neg     x9, x0                                                                       
  add     x8, x8, x0                                                                   
// Meaning x0 (the memory to unmap) = 0xffff91b7a000
  mov     x0, x20                                                                      
  add     x8, x8, #0x46               
// Meaning x1 (the size to unmap) = 4096 (the page size)                                                 
  and     x1, x8, x9                                    
// Unmap that area                               
  bl      0x40ba00 <_ZN11__sanitizer15internal_munmapEPvm>                             
dont_unmap:
// In the core dump we reach here after (I assume) calling unmap on the line above
  ldr     x8, [x19]                                                                    
  movi    v0.2d, #0x0                                                                  
  mov     w0, wzr                                                                      
  ldr     x9, [x22]                                        
// Here we try to dereference a pointer
// x8 = 0xffff91b7a080
// 0xffff91b7a080 - 0xffff91b7a000 = 0x80 so this is in unmapped memory
// Program terminated with signal SIGSEGV, Segmentation fault.
// #0  0x0000000000419744 in __xray::BufferQueue::releaseBuffer(__xray::BufferQueue::Buffer&) ()                       
  ldr     x8, [x8]                                                                     
// This dmb is the result of merging the load atomic then store atomic calls.
// Load does a dmb after and store does one before.
  dmb     ish                                                                          
  str     x8, [x9]
```
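
The address arithmetic in the annotations above can be checked with a small sketch. The three addresses and the 4096 byte length are the values quoted from the core dump; `roundUpToPage` and `inRange` are hypothetical helper names, and the 4096 byte page size is an assumption based on the unmap size. Reading the add / and sequence, the size being rounded appears to be (x21 << 6) + 0x47; the value of x21 is not shown in the dump.

```cpp
#include <cstdint>

// Rounds Size up to a multiple of PageSize (PageSize must be a power of two).
// ~(PageSize - 1) is the same bit pattern as the negated page size that the
// disassembly computes with "neg x9, x0".
constexpr std::uint64_t roundUpToPage(std::uint64_t Size,
                                      std::uint64_t PageSize) {
  return (Size + PageSize - 1) & ~(PageSize - 1);
}

// True if Addr lies inside the half-open range [Base, Base + Len).
constexpr bool inRange(std::uint64_t Addr, std::uint64_t Base,
                       std::uint64_t Len) {
  return Addr >= Base && Addr - Base < Len;
}

// Values quoted in the disassembly annotations.
constexpr std::uint64_t kUnmapBase = 0xffff91b7a000ULL; // x0 passed to munmap
constexpr std::uint64_t kUnmapLen = 4096;               // x1 passed to munmap
constexpr std::uint64_t kFaultAddr = 0xffff91b7a080ULL; // x8 at the faulting ldr

// The faulting load is 0x80 bytes into the region that was just unmapped.
static_assert(kFaultAddr - kUnmapBase == 0x80, "offset into the region");
static_assert(inRange(kFaultAddr, kUnmapBase, kUnmapLen),
              "fault address is inside the unmapped range");
```

With a 4096 byte page, any x21 up to 62 gives a rounded size of exactly one page, which is consistent with the 4096 byte unmap seen here.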
</pre>