[libc-commits] [libc] [libc] Implement simple lock-free stack data structure (PR #83026)

Jon Chesterfield via libc-commits libc-commits at lists.llvm.org
Mon Feb 26 10:23:16 PST 2024


JonChesterfield wrote:

I am concerned about shipping code with a known data-corruption race condition that hits with a period on the order of days. The initial use may be light enough that the failure modes stay latent, but it doesn't follow that the struct won't be used for other things in the future, especially given the application domain of HPC / AI, where jobs are at credible risk of running for multiple hours at a time.

I think there's a known fetch_add stack in the literature. There's a queue https://dl.acm.org/doi/10.1145/2851141.2851168 that might be adaptable.

I'd suggest a structure more like

    template <typename T, size_t N>
    struct stack
    {
       T value[N]; 
       bitmap<N> in_use;
    };

Let value[0] be the first value written.

Push is find-first-set to choose a candidate slot, then fetch_or to set that bit. It succeeds when fetch_or returned a zero in the corresponding position; failure means try push again. The stack is out of space when all bits are in use.

Pop is fetch_and to clear the corresponding bit.
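
A rough sketch of the push/pop scheme described above, under several assumptions that are not in the suggestion itself: N is at most 64 so the bitmap fits in one atomic word, find-first-set is applied to the complement of the bitmap to locate a free slot, and pop takes the highest occupied slot. The names BitmapStack, claimed and ready are hypothetical. The second bitmap (ready) is an addition, so that a pop cannot read a slot whose bit is set but whose value has not been written yet. The __builtin_ctzll / __builtin_clzll calls are GCC/Clang builtins.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical sketch; not the code from the PR.
template <typename T, std::size_t N>
struct BitmapStack {
  static_assert(N >= 1 && N <= 64, "sketch assumes a single 64-bit bitmap");
  // Mask with the low N bits set.
  static constexpr std::uint64_t kAll = ~std::uint64_t{0} >> (64 - N);

  T value[N];
  std::atomic<std::uint64_t> claimed{0}; // slot is owned by some push/pop
  std::atomic<std::uint64_t> ready{0};   // slot holds a fully written value

  // Returns false when every slot is in use.
  bool push(const T &v) {
    for (;;) {
      std::uint64_t free_bits = ~claimed.load() & kAll;
      if (free_bits == 0)
        return false; // out of space
      int slot = __builtin_ctzll(free_bits); // lowest free slot
      assert(static_cast<std::size_t>(slot) < N);
      std::uint64_t bit = std::uint64_t{1} << slot;
      if (claimed.fetch_or(bit) & bit)
        continue; // bit was already set: another push won, try again
      value[slot] = v;
      ready.fetch_or(bit); // publish only after the value is written
      return true;
    }
  }

  // Returns std::nullopt when no value is available.
  std::optional<T> pop() {
    for (;;) {
      std::uint64_t r = ready.load();
      if (r == 0)
        return std::nullopt;
      int slot = 63 - __builtin_clzll(r); // highest ready slot
      std::uint64_t bit = std::uint64_t{1} << slot;
      if (!(ready.fetch_and(~bit) & bit))
        continue; // another pop took this slot first, try again
      T out = value[slot];
      claimed.fetch_and(~bit); // hand the slot back to push
      return out;
    }
  }
};
```

Used from a single thread this behaves as a LIFO, because occupied slots always form a prefix starting at value[0]. Under concurrent use the ordering is only approximately LIFO, since a push may fill a lower slot while a higher one is still occupied.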

For atexit you don't need a stack. You need push() and iterate(), which is simpler than pop(); a pointer and fetch_add are sufficient.
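
A minimal sketch of the push()/iterate() idea, assuming a fixed capacity and an index advanced with fetch_add in place of a raw pointer. The names AppendOnlyList and iterate_reverse are hypothetical. It also assumes iteration only starts once all pushes have finished, which is not stated above. Iteration runs newest-first because atexit handlers are called in reverse order of registration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical sketch; not the code from the PR.
template <typename T, std::size_t N>
struct AppendOnlyList {
  T value[N];
  std::atomic<std::size_t> next{0}; // index of the next unclaimed slot

  // Each caller gets a unique index from fetch_add, so no retry loop is needed.
  bool push(const T &v) {
    std::size_t slot = next.fetch_add(1);
    if (slot >= N)
      return false; // full; the counter stays past the end
    value[slot] = v;
    return true;
  }

  // Assumption: called only after every push has completed.
  template <typename F>
  void iterate_reverse(F &&f) {
    std::size_t n = next.load();
    if (n > N)
      n = N; // failed pushes may have moved the counter past N
    while (n > 0)
      f(value[--n]);
  }
};
```

Nothing is ever removed, so there is no slot reuse and the ABA-style hazards of a lock-free pop do not arise.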

https://github.com/llvm/llvm-project/pull/83026

