[libcxx-dev] A proposed approach to std::atomic_wait for libcxx

Mon Jun 3 12:31:42 PDT 2019

Seems fine to me.

Dumb question: What (if anything) should be done to support single-threaded platforms?  Assert if not immediately available?  Optimistically spinlock in case they are using this on “elsewhere” memory of some sort?

Pedantic: I probably wouldn’t call the Windows implementation futex based ( https://devblogs.microsoft.com/oldnewthing/20170601-00/?p=96265 ).  I wouldn’t go and call the Linux implementation WaitOnAddress based either.

From: libcxx-dev <libcxx-dev-bounces at lists.llvm.org> On Behalf Of Olivier Giroux via libcxx-dev
Sent: Tuesday, May 28, 2019 5:55 PM
To: libcxx-dev at lists.llvm.org
Subject: [EXTERNAL] [libcxx-dev] A proposed approach to std::atomic_wait for libcxx

Hi everyone,

To start the discussion of how libcxx should go about implementing this feature, I’ve prepared a cross-platform implementation of atomic_wait/atomic_notify_* for Mac/Linux/Windows/CUDA/unidentified-platform. I don’t claim that it’s fully tuned for your platform, nor do I claim that it’s perfect for every possible use, but it should not be terribly bad for any use either. It has various knobs to turn paths On/Off, so you can choose a different path on each platform, so long as it’s supported at all on that platform.

You can find the implementation here: https://github.com/ogiroux/atomic_wait/

It has these strategies implemented:
* Contention table. Used to optimize futex notify, or to hold CVs. Disable with __NO_TABLE.
* Futex. Supported on Linux and Windows. For performance requires a table on Linux. Disable with __NO_FUTEX.
* Condition variables. Supported on Linux and Mac. Requires table to function. Disable with __NO_CONDVAR.
* Timed back-off. Supported on everything. Disable with __NO_SLEEP.
* Spinlock. Supported on everything. Force with __NO_IDENT. Note: performance is too terrible to use.

The strategy is chosen this way, by platform:
* Linux: default to futex (with table), fallback to futex (no table) -> CVs -> timed backoff -> spin.
* Mac: default to CVs (table), fallback to timed backoff -> spin.
* Windows: default to futex (no table), fallback to timed backoff -> spin.
* CUDA: default to timed backoff, fallback to spin. (Not all of this is checked in to this tree.)
* Unidentified platform: default to spin.
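The selection order above can be sketched as a preprocessor cascade. This is only an illustration: the enum, the function names, and the use of __linux__, __APPLE__, _WIN32 and __CUDA_ARCH__ to identify platforms are assumptions, not taken from the repository. Only the __NO_* knobs come from the list above.

```cpp
// Hypothetical names; only the __NO_* knobs are from the proposal.
enum class wait_strategy {
    futex_with_table, futex_no_table, condvar_with_table, timed_backoff, spin
};

// Returns the first strategy in the platform's fallback chain that the
// knobs have not disabled. Exactly one return survives preprocessing.
constexpr wait_strategy select_wait_strategy() {
#if defined(__NO_IDENT)
    return wait_strategy::spin;                  // forced spin
#elif defined(__CUDA_ARCH__)                     // assumed test for CUDA device code
  #if !defined(__NO_SLEEP)
    return wait_strategy::timed_backoff;
  #else
    return wait_strategy::spin;
  #endif
#elif defined(__linux__)
  #if !defined(__NO_FUTEX) && !defined(__NO_TABLE)
    return wait_strategy::futex_with_table;
  #elif !defined(__NO_FUTEX)
    return wait_strategy::futex_no_table;
  #elif !defined(__NO_CONDVAR) && !defined(__NO_TABLE)
    return wait_strategy::condvar_with_table;    // CVs require the table
  #elif !defined(__NO_SLEEP)
    return wait_strategy::timed_backoff;
  #else
    return wait_strategy::spin;
  #endif
#elif defined(__APPLE__)
  #if !defined(__NO_CONDVAR) && !defined(__NO_TABLE)
    return wait_strategy::condvar_with_table;
  #elif !defined(__NO_SLEEP)
    return wait_strategy::timed_backoff;
  #else
    return wait_strategy::spin;
  #endif
#elif defined(_WIN32)
  #if !defined(__NO_FUTEX)
    return wait_strategy::futex_no_table;
  #elif !defined(__NO_SLEEP)
    return wait_strategy::timed_backoff;
  #else
    return wait_strategy::spin;
  #endif
#else
    return wait_strategy::spin;                  // unidentified platform
#endif
}

// Human-readable name, handy for logging which path was compiled in.
inline const char* strategy_name(wait_strategy s) {
    switch (s) {
        case wait_strategy::futex_with_table:   return "futex (table)";
        case wait_strategy::futex_no_table:     return "futex (no table)";
        case wait_strategy::condvar_with_table: return "condvar (table)";
        case wait_strategy::timed_backoff:      return "timed backoff";
        case wait_strategy::spin:               return "spin";
    }
    return "unknown";
}
```

Building with, say, -D__NO_FUTEX on Linux would make this sketch report the condvar path, which matches the fallback chain listed above.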

The unidentified-platform support could be better. For instance, we should probably assume that <thread> is implemented and use the sleeping/yielding facilities there. Rather than falling all the way back to __NO_IDENT, it should fall back to about where CUDA is expected to be.
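As a rough idea of what a <thread>-based fallback could look like, here is a minimal sketch that yields a few times and then sleeps with a growing, capped delay. The function name, the iteration count, and the delay bounds are all made up for illustration and are not tuned or taken from the repository.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical helper: blocks until `a` no longer holds `old`, using only
// the yielding and sleeping facilities of <thread>.
template <class T>
void backoff_wait(const std::atomic<T>& a, T old,
                  std::memory_order order = std::memory_order_seq_cst) {
    // Phase 1: poll a few times, yielding between polls.
    for (int i = 0; i < 16; ++i) {
        if (a.load(order) != old) return;
        std::this_thread::yield();
    }
    // Phase 2: poll with a sleep that doubles up to an arbitrary cap.
    auto delay = std::chrono::microseconds(1);
    const auto max_delay = std::chrono::microseconds(1024);
    while (a.load(order) == old) {
        std::this_thread::sleep_for(delay);
        if (delay < max_delay) delay *= 2;
    }
}
```

With this strategy the notify side has nothing to do, since the waiter simply observes the new value on its next poll. The cost is wake-up latency of up to the capped delay.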

One of the main discussion points I’d like to drive with this is the design of the contention management table, to go along with the sharded lock table that backs the __atomic_* and __c11_atomic_* built-ins. Ideally this would be handled the same way, meaning that it should live in libatomic.a or your substitute, and be shared with other C++ standard libraries on your platform.
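To make the table discussion concrete, here is a minimal sketch of an address-hashed table holding CVs, in the spirit of the condition-variable strategy. The names, the table size, and the hash are assumptions for illustration only. The actual design, and where the table should live, is exactly what is up for discussion.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>

// Hypothetical table entry: one mutex/CV pair shared by every atomic whose
// address hashes to the same slot.
struct contended_slot {
    std::mutex m;
    std::condition_variable cv;
};

constexpr std::size_t table_size = 64;  // arbitrary; must be a power of two

inline contended_slot& slot_for(const void* addr) {
    static contended_slot table[table_size];
    auto bits = reinterpret_cast<std::uintptr_t>(addr);
    // Drop the low bits so neighbouring small objects spread across slots.
    return table[(bits >> 4) & (table_size - 1)];
}

// Blocks until `a` no longer holds `old`.
template <class T>
void table_wait(const std::atomic<T>& a, T old) {
    contended_slot& s = slot_for(&a);
    std::unique_lock<std::mutex> lock(s.m);
    s.cv.wait(lock, [&] { return a.load() != old; });
}

// Call after storing the new value into `a`.
template <class T>
void table_notify_all(const std::atomic<T>& a) {
    contended_slot& s = slot_for(&a);
    // Taking and releasing the lock ensures a waiter that has already checked
    // the old value is inside cv.wait() before the notification is sent.
    { std::lock_guard<std::mutex> lock(s.m); }
    s.cv.notify_all();
}
```

Because unrelated atomics can share a slot, a notify may wake waiters on a different address. The predicate re-check in table_wait makes those wake-ups harmless, at some cost in extra wake-ups.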

Please discuss!

Sincerely,

Olivier
