Important To Learn std::memory_order In C++ Atomic Operations

Multi-threaded programming lets an application make full use of the CPU, GPU, and memory, but it also raises problems: threads must be synchronized, and shared data must be read and written safely. The concurrency support library in modern C++ is designed to solve these problems. Since the C++11 standard, this library has included built-in support for threads (std::thread) and atomic operations (std::atomic). The std::memory_order type specifies how memory accesses, including regular, non-atomic accesses, are ordered around an atomic operation. In this post, we explain how to use std::memory_order in modern multi-threaded code.

Concurrent programming is a mature and diverse field. It includes high-performance multi-threading and parallel programming, asynchronous task processing, message-based and event-based systems, non-blocking, lock-free, and optimistic data structures, transactional approaches, and many other techniques for building multi-threaded applications. Memory ordering is one of the most important parts of the modern C++ standards. For example, a search of the C++20 standard finds about 653 occurrences of “memory_order”. That frequency reflects its central role in atomic types and atomic operations in multi-threaded applications. Before going further, let’s remind ourselves about std::atomic in C++.

What is atomic (std::atomic) in C++?

C++11 added atomic types and operations to the standard. They provide a way of writing multi-threaded applications without using locks. In modern C++, the std::atomic<> class template is defined in the <atomic> header and can wrap another type so that operations on that type are atomic. If one thread writes to an atomic object while another thread reads from it, the behavior is well defined. Every instantiation and full specialization of the std::atomic template defines an atomic type.

Atomic types guarantee that each read or write of the object is indivisible, so concurrent access from several threads (for example, threads created with std::thread) is not a data race. They work well with primitive types (int, float, double, etc.) and with any trivially copyable type, that is, a type that has at least one eligible copy constructor, move constructor, copy assignment operator, or move assignment operator, where each eligible one is trivial, and that has a non-deleted trivial destructor.

Here is a simple syntax for the atomic declaration:
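A minimal sketch of such declarations might look like the following. The variable names and the check_initial_values helper are hypothetical and chosen only for illustration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical names: std::atomic<T> wraps a trivially copyable type T.
std::atomic<int>    counter{0};      // atomic integer, starts at 0
std::atomic<bool>   ready{false};    // atomic boolean, starts at false
std::atomic<double> ratio{0.5};      // atomic floating-point value

// Reads each object back with load() to confirm its initial value.
void check_initial_values() {
    assert(counter.load() == 0);
    assert(ready.load() == false);
    assert(ratio.load() == 0.5);
}
```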

Here is a simple std::atomic example:
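As an illustrative sketch (not necessarily the listing the post originally used), several threads can increment one shared std::atomic<int> without a lock. The function name count_with_threads and its parameters are assumptions made for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: starts n_threads threads, each of which increments
// a shared atomic counter per_thread times, and returns the final value.
int count_with_threads(int n_threads, int per_thread) {
    assert(n_threads > 0 && per_thread >= 0);
    std::atomic<int> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < n_threads; ++t) {
        workers.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                ++counter;  // atomic read-modify-write, seq_cst by default
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter.load();
}
```

Calling count_with_threads(4, 10000) from main should return 40000 on every run, because no increment can be lost. With a plain int instead of std::atomic<int>, the same loop would be a data race.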

Atomic operations are operations on values of atomic types (std::atomic objects); the atomic library provides them to allow lockless concurrent programming. Because each operation is indivisible, concurrent atomic operations on the same object do not cause a data race. Different threads can perform different atomic operations on the same atomic object, and the object stays consistent.

std::atomic provides many member functions for atomic operations, e.g. load, store, operator=, wait, exchange, and is_lock_free. The load and store operations are the ones we focus on in this post.
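A short, single-threaded sketch of a few of these member functions is shown below. The function name basic_operations is hypothetical, and every call uses the default memory order.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical walk-through of store, load, exchange, and operator=.
int basic_operations() {
    std::atomic<int> value{0};

    value.store(10);                     // atomic write
    int first = value.load();            // atomic read
    assert(first == 10);

    int previous = value.exchange(20);   // writes 20, returns the old value
    assert(previous == 10);

    value = 30;                          // operator= is also an atomic store

    // Whether the type is lock-free depends on the platform, so the result
    // is only queried here, not relied upon.
    bool lock_free = value.is_lock_free();
    (void)lock_free;

    return first + previous + value.load();  // 10 + 10 + 30
}
```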

What is std::memory_order in C++ atomic operations?

The std::memory_order type specifies how memory accesses, including regular, non-atomic accesses, are ordered around an atomic operation. It matters when multiple threads simultaneously read and write different variables in memory: without ordering constraints, one thread may observe value changes in a different order from the order in which another thread wrote them. The memory order passed to an atomic operation determines which orderings other threads are guaranteed to observe.

Here is the definition syntax from C++11 to C++17,
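For reference, the shape of that declaration is reproduced below as a comment, since user code must not redeclare a standard type. The load_with helper underneath is a hypothetical addition that only shows the enumerators being passed around as ordinary values.

```cpp
#include <atomic>
#include <cassert>

// Shape of the declaration in <atomic> from C++11 to C++17
// (comment only, do not redeclare it in user code):
//
//   typedef enum memory_order {
//       memory_order_relaxed,
//       memory_order_consume,
//       memory_order_acquire,
//       memory_order_release,
//       memory_order_acq_rel,
//       memory_order_seq_cst
//   } memory_order;

// Hypothetical helper: loads from an atomic with a caller-chosen order.
int load_with(const std::atomic<int>& a, std::memory_order order) {
    // release and acq_rel are not valid orders for a load.
    assert(order != std::memory_order_release &&
           order != std::memory_order_acq_rel);
    return a.load(order);
}
```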

Since C++20, memory_order is a scoped enumeration, and the old names remain available as constants. Here is the syntax:
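The C++20 form is again reproduced as a comment, followed by a small check that the old and new spellings name the same values. The spellings_agree function is a hypothetical helper, and the __cplusplus guard is an assumption that lets the sketch compile with pre-C++20 compilers as well.

```cpp
#include <atomic>
#include <cassert>

// Shape of the declaration since C++20 (comment only, do not redeclare):
//
//   enum class memory_order : /* unspecified */ {
//       relaxed, consume, acquire, release, acq_rel, seq_cst
//   };
//   inline constexpr memory_order memory_order_relaxed = memory_order::relaxed;
//   inline constexpr memory_order memory_order_consume = memory_order::consume;
//   inline constexpr memory_order memory_order_acquire = memory_order::acquire;
//   inline constexpr memory_order memory_order_release = memory_order::release;
//   inline constexpr memory_order memory_order_acq_rel = memory_order::acq_rel;
//   inline constexpr memory_order memory_order_seq_cst = memory_order::seq_cst;

// Hypothetical check: in C++20 mode, both spellings compare equal.
bool spellings_agree() {
    bool same = true;
#if __cplusplus >= 202002L
    same = (std::memory_order_seq_cst == std::memory_order::seq_cst) &&
           (std::memory_order_relaxed == std::memory_order::relaxed);
#endif
    assert(same);
    return same;
}
```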

The default memory order of all atomic operations in the concurrency library is sequentially consistent ordering (memory_order_seq_cst).
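In other words, omitting the order argument and passing std::memory_order_seq_cst explicitly request the same ordering. A minimal sketch, with a hypothetical function name:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical demo: implicit and explicit seq_cst calls side by side.
int default_order_demo() {
    std::atomic<int> x{0};

    x.store(1);                                  // implicit seq_cst
    x.store(2, std::memory_order_seq_cst);       // same ordering, spelled out

    int a = x.load();                            // implicit seq_cst
    int b = x.load(std::memory_order_seq_cst);   // same ordering, spelled out
    assert(a == 2 && b == 2);

    x.fetch_add(1);                              // read-modify-write, seq_cst
    return a + b + x.load();                     // 2 + 2 + 3
}
```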

What are std::memory_order models in C++ atomic operations?

Here are the features of each memory_order model,

memory_order_seq_cst Sequentially Consistent Ordering
Sequentially consistent operations order memory in the same way as acquire/release ordering (comparable to locking and unlocking data) and also establish a single total order of all atomic operations that are so tagged
– this is the default for all atomic operations
– read operations perform an acquire operation
– write operations perform a release operation
– read-modify-write operations perform both an acquire operation and a release operation
– a single total order exists where all the threads observe all modifications in the same order
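The single total order can be illustrated with the classic "store buffering" test. In this sketch (the function name and variables are hypothetical), each thread stores to one variable and then loads the other. With seq_cst on every operation, at least one thread must see the other's store; with weaker orders, both loads could return 0.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical test: returns true if at least one thread observed the
// other thread's store, which seq_cst guarantees.
bool store_buffering_once() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;  // each is written by exactly one thread

    std::thread t1([&] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread t2([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });
    t1.join();
    t2.join();

    assert(r1 != -1 && r2 != -1);  // both threads ran to completion
    return r1 == 1 || r2 == 1;     // r1 == 0 && r2 == 0 cannot happen
}
```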
memory_order_relaxed Relaxed Ordering
Relaxed operations are not synchronization operations
– doesn’t impose an order among concurrent memory accesses
– removes the happens-before constraints, so far less synchronization is required
– mostly used when only atomicity needs to be guaranteed, for example for counters
– there is no synchronization or ordering constraints imposed on reading or writing operations
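A typical use of relaxed ordering is a statistics counter, where only the final total matters. The sketch below uses a hypothetical relaxed_count function; the total is exact because the increments are atomic and because join() orders the final read after every worker has finished.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: counts events from several threads using
// relaxed increments, then returns the total.
long relaxed_count(int n_threads, int per_thread) {
    assert(n_threads > 0 && per_thread >= 0);
    std::atomic<long> hits{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < n_threads; ++t) {
        workers.emplace_back([&hits, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                // Atomic, but imposes no ordering on other memory accesses.
                hits.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();  // join() provides the synchronization
    return hits.load(std::memory_order_relaxed);
}
```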
memory_order_consume Consume Ordering
A consume operation is a load whose result is used by later operations in the same thread, which creates a data dependency on the loaded value.
– read operations perform a consume operation
– no reads or writes in the current thread dependent on the value currently loaded can be reordered before this load
– writes to data-dependent variables in other threads that release the same atomic variable are visible in the current thread
memory_order_acquire Acquire Ordering
Acquire semantics is a property that can only apply to operations that read from shared memory, whether they are read-modify-write operations or plain loads. For example, mutex::lock() is an acquire operation.
– read operations perform the acquire operation on the affected memory location
– all writes in other threads that release the same atomic variable are visible in the current thread
– no reads or writes in the current thread can be reordered before this read
memory_order_release Release Ordering
Release semantics is a property that can only apply to operations that write to shared memory, whether they are read-modify-write operations or plain stores. For example, mutex::unlock() is a release operation.
– no reads or writes in the current thread can be reordered after this store
– write operations perform the release operation
– all writes in the current thread are visible in other threads that acquire the same atomic variable
– writes that carry a dependency into the atomic variable become visible in other threads that consume the same atomic
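Acquire and release are normally used as a pair. The sketch below shows the common "message passing" pattern under that assumption: a producer writes plain data and then sets a flag with a release store, and a consumer waits for the flag with acquire loads before reading the data. The names message_passing, payload, and ready are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical demo: publishes a plain int through an atomic flag.
int message_passing() {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                   // written first
        ready.store(true, std::memory_order_release);   // then published
    });
    std::thread consumer([&] {
        // Spin until the release store becomes visible.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        // The acquire load synchronizes with the release store, so the
        // earlier write to payload is guaranteed to be visible here.
        assert(payload == 42);
        seen = payload;
    });

    producer.join();
    consumer.join();
    return seen;
}
```

If both operations on ready used memory_order_relaxed instead, the read of payload would be a data race.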
memory_order_acq_rel Acquire Release Ordering
Acquire-release operations combine both orderings in a single read-modify-write operation, comparable to locking, modifying, and then unlocking data
– read-modify-write operations are both an acquire operation and a release operation
– no memory reads or writes in the current thread can be reordered before the load, nor after the store
– all writes in other threads that release the same atomic variable are visible before the modification
– the data change is visible in other threads that acquire the same atomic variable
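One place where acq_rel is commonly chosen is a "last one finished" countdown, similar to reference counting. In this sketch (all names are hypothetical), each worker fills its own slot and then decrements a shared counter. The release half publishes the worker's slot, and the acquire half lets the worker that reaches zero see every other slot.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical demo: the last worker to finish sums all slots.
int last_one_sums(int n) {
    assert(n > 0);
    std::vector<int> slots(n, 0);     // one element per worker
    std::atomic<int> remaining{n};
    int total = -1;                   // written only by the last worker

    std::vector<std::thread> workers;
    for (int i = 0; i < n; ++i) {
        workers.emplace_back([&, i] {
            slots[i] = i + 1;         // plain write to this worker's slot
            // fetch_sub returns the previous value; 1 means we are last.
            if (remaining.fetch_sub(1, std::memory_order_acq_rel) == 1) {
                int sum = 0;
                for (int v : slots) sum += v;  // all slots are visible
                total = sum;
            }
        });
    }
    for (auto& w : workers) w.join();
    return total;
}
```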

More information about the std::memory_order types is available at https://en.cppreference.com/w/cpp/atomic/memory_order

How to use std::memory_order in C++ atomic operations?

Passing a memory order to load or std::atomic_load_explicit specifies how memory accesses are ordered around the atomic load operation. The memory order values are defined in the <atomic> header. Here are the permitted memory orders for an atomic load:

  • memory_order_seq_cst
  • memory_order_relaxed
  • memory_order_consume
  • memory_order_acquire    
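The following sketch passes the permitted orders to both the member function and the free function. The function name load_variants is hypothetical, and memory_order_consume is left out because acquire is the more common choice in practice.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical demo: the same value read with different load orders.
int load_variants() {
    std::atomic<int> a{5};

    int r1 = a.load(std::memory_order_relaxed);
    int r2 = a.load(std::memory_order_acquire);
    int r3 = a.load(std::memory_order_seq_cst);

    // Free-function form: takes a pointer to the atomic object.
    int r4 = std::atomic_load_explicit(&a, std::memory_order_acquire);

    // In a single thread every order returns the same value; the orders
    // only differ in what they guarantee about other threads' writes.
    assert(r1 == 5 && r2 == 5 && r3 == 5 && r4 == 5);
    return r1 + r2 + r3 + r4;
}
```

Passing memory_order_release or memory_order_acq_rel to a load is not permitted.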

We can also pass a memory order to store or std::atomic_store_explicit. The permitted memory orders for an atomic store are:

  • std::memory_order_relaxed 
  • std::memory_order_release
  • std::memory_order_seq_cst
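A matching sketch for stores, again with a hypothetical function name, uses each permitted order once and then the free-function form:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical demo: stores with each permitted order.
int store_variants() {
    std::atomic<int> a{0};

    a.store(1, std::memory_order_relaxed);
    a.store(2, std::memory_order_release);
    a.store(3, std::memory_order_seq_cst);
    assert(a.load() == 3);

    // Free-function form: pointer to the atomic, new value, then order.
    std::atomic_store_explicit(&a, 4, std::memory_order_release);
    return a.load();
}
```

Passing memory_order_consume, memory_order_acquire, or memory_order_acq_rel to a store is not permitted.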

Is there a full example of how to use std::memory_order in C++ atomic operations?

Here is a full example that uses different memory orders in atomic load and store operations in modern C++.
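The sketch below is one possible version of such an example, not necessarily the original listing. The Results struct, the run_all_orders function, and the variable names are assumptions made for illustration, and memory_order_consume is deliberately left out. A writer thread stores with relaxed, release, and seq_cst orders, and a reader thread loads with the matching orders after waiting on a flag.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical container for what the reader thread observed.
struct Results {
    int relaxed_value;
    int acquired_value;
    int seq_cst_value;
    int exchanged_old;
};

Results run_all_orders() {
    std::atomic<int> a{0}, b{0}, c{0};
    std::atomic<bool> flag{false};
    Results r{};

    std::thread writer([&] {
        a.store(1, std::memory_order_relaxed);
        b.store(2, std::memory_order_release);
        c.store(3, std::memory_order_seq_cst);
        flag.store(true, std::memory_order_release);  // publish everything
    });
    std::thread reader([&] {
        // Wait until the writer's release store to flag is visible.
        while (!flag.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        r.relaxed_value  = a.load(std::memory_order_relaxed);
        r.acquired_value = b.load(std::memory_order_acquire);
        r.seq_cst_value  = c.load(std::memory_order_seq_cst);
        // A read-modify-write may use acq_rel; it returns the old value.
        r.exchanged_old  = c.exchange(30, std::memory_order_acq_rel);
        assert(r.relaxed_value == 1);
    });

    writer.join();
    reader.join();
    return r;
}
```

Even the relaxed load of a returns 1 here, because the acquire load on flag already ordered the reader after all of the writer's earlier stores.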

Note that this is just a simple example that shows the memory order models in use; it is not meant as a guide to when each model should be chosen.

For more details about std::memory_order, please read https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1680.pdf
