How Does x86 Achieve Release-Acquire Semantics Without Explicit Fences?

Linda Hamilton

Released: 2024-12-07 14:39:13

x86 Memory Model and Release-Acquire Semantics

This is a follow-up to an earlier question about why a memory_order_release atomic store compiles to a plain x86 MOV instruction. The question here is how release-acquire ordering can hold on x86 when the generated code contains no additional synchronization such as fences or locked instructions.
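To make the question concrete, here is a minimal sketch of the release-acquire pattern under discussion. The names payload, ready, and message_passing_demo are hypothetical and chosen only for illustration. On x86, mainstream compilers typically emit a plain MOV for both the release store and the acquire load; that is an observation about common code generation, not something the C++ standard requires.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names for illustration only.
int payload = 0;                   // ordinary, non-atomic data
std::atomic<bool> ready{false};    // flag that publishes the payload

// Runs one producer and one consumer and returns the payload the consumer saw.
int message_passing_demo() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;

    std::thread producer([] {
        payload = 42;                                    // plain store
        ready.store(true, std::memory_order_release);    // typically a plain MOV on x86
    });
    std::thread consumer([&seen] {
        // Typically a plain MOV load on x86; spin until the flag is set.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        // The acquire load synchronizes with the release store,
        // so the earlier write to payload is visible here.
        seen = payload;
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Calling message_passing_demo() from main should return 42 on any conforming implementation. The point of the rest of this article is why x86 hardware needs no fence to deliver that result.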

Single-Processor Memory Ordering

Volume 3A, Chapter 8 of Intel's Software Developer's Manual states that, in a single-processor system, loads and stores to write-back cacheable memory obey the following ordering guarantees:

Reads are not reordered with other reads.
Writes are not reordered with older reads.
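Writes to memory are not reordered with other writes, except for non-temporal (streaming) stores, CLFLUSH, and string operations.
Reads may be reordered with older writes to different locations, but not with older writes to the same location.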

Multi-Processor Memory Ordering

The multi-processor section of the manual, however, does not spell out separately how loads are ordered. It lists the following principles:

Individual processors follow the same ordering rules as in a single-processor system.
Writes by a single processor are observed in the same order by all processors.
Writes from one processor are not ordered with respect to writes from other processors.
Memory ordering obeys causality (i.e., it respects transitive visibility).
Any two stores are seen in a consistent order by processors other than those performing the stores.
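The causality rule can be illustrated with a three-thread sketch. The function name causality_demo and the variables x, y, and seen_x are hypothetical. The code relies only on the C++ guarantee that happens-before is transitive across release-acquire pairs, which is the language-level counterpart of the transitive visibility the manual describes.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Thread a writes x; thread b sees x and then writes y;
// thread c sees y and then reads x. Returns the value c read from x.
int causality_demo() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int seen_x = -1;

    std::thread a([&] {
        x.store(1, std::memory_order_release);
    });
    std::thread b([&] {
        while (x.load(std::memory_order_acquire) != 1) {
            std::this_thread::yield();
        }
        y.store(1, std::memory_order_release);   // published only after b saw x == 1
    });
    std::thread c([&] {
        while (y.load(std::memory_order_acquire) != 1) {
            std::this_thread::yield();
        }
        // a's store happens-before this load (via b), so it must read 1.
        seen_x = x.load(std::memory_order_relaxed);
    });

    a.join();
    b.join();
    c.join();
    return seen_x;
}
```

Thread c never communicates with thread a directly, yet it cannot observe y == 1 and then a stale x == 0.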

Acquisition and Release Using MOV

Understanding the single-processor memory ordering principles is crucial. According to the manual, the same principles apply to multi-processor systems when accessing cache-coherent shared memory. This means that reordering only occurs locally within each CPU core.

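Once a store becomes globally visible (that is, once it leaves the storing core's store buffer and commits to cache), it becomes visible to all other cores at the same time, so all cores observe writes in a consistent order. The only reordering a core performs on ordinary memory is letting a load complete ahead of an older store to a different location. Release and acquire semantics never forbid that particular reordering, so every plain MOV store already behaves as a release and every plain MOV load as an acquire. Only sequential consistency needs more, and a local barrier such as mfence (or a locked instruction such as xchg) is enough to provide it.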
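The one case where x86 does need a barrier is the classic store-buffer litmus test, sketched below. The function name both_zero_observed is hypothetical. With memory_order_seq_cst the outcome r1 == 0 and r2 == 0 is forbidden by the C++ memory model; compilers typically implement the seq_cst store on x86 as xchg, or as MOV followed by mfence. If the stores were release and the loads acquire instead, both threads reading 0 would be a permitted outcome, because each load may complete while the older store is still in the store buffer.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs the store-buffer litmus test `trials` times with seq_cst operations.
// Returns true if any trial had both threads read 0 (which must not happen).
bool both_zero_observed(int trials) {
    for (int i = 0; i < trials; ++i) {
        std::atomic<int> x{0};
        std::atomic<int> y{0};
        int r1 = -1;
        int r2 = -1;

        std::thread t1([&] {
            x.store(1, std::memory_order_seq_cst);   // needs a full barrier on x86
            r1 = y.load(std::memory_order_seq_cst);
        });
        std::thread t2([&] {
            y.store(1, std::memory_order_seq_cst);   // needs a full barrier on x86
            r2 = x.load(std::memory_order_seq_cst);
        });

        t1.join();
        t2.join();

        if (r1 == 0 && r2 == 0) {
            return true;   // would violate sequential consistency
        }
    }
    return false;
}
```

A run of this function demonstrates only that the forbidden outcome did not occur in those trials; the guarantee itself comes from the memory model, not from the test.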

Coherent Shared View of Memory

The x86 architecture maintains a coherent shared view of memory through its cache-coherence hardware. Intel's description of its multiprocessing mechanisms lists memory coherency and cache consistency among their goals, so that every processor reading a location sees the same, current data rather than a stale private copy.
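One observable consequence of coherence is that a single location has one agreed order of writes, so a reader can never see its value go backwards. The sketch below checks this with relaxed atomics; the function name reads_are_monotonic and its parameter last_value are hypothetical. It exercises the per-location coherence guarantee of C++ atomics, which coherent caches provide in hardware.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// A writer stores 1, 2, ..., last_value into one atomic counter.
// A reader polls the counter and checks that observed values never decrease.
bool reads_are_monotonic(int last_value) {
    std::atomic<int> counter{0};
    bool monotonic = true;

    std::thread writer([&] {
        for (int v = 1; v <= last_value; ++v) {
            counter.store(v, std::memory_order_relaxed);
        }
    });
    std::thread reader([&] {
        int prev = 0;
        while (prev != last_value) {
            int cur = counter.load(std::memory_order_relaxed);
            if (cur < prev) {
                monotonic = false;   // would mean the reader saw an older write again
            }
            prev = cur;
        }
    });

    writer.join();
    reader.join();
    return monotonic;
}
```

Even with memory_order_relaxed, which imposes no ordering between different locations, the reader is still guaranteed a non-decreasing sequence for this one location.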

Additional Resources

https://preshing.com/20120913/acquire-and-release-semantics/
https://preshing.com/20120930/weak-vs-strong-memory-models/
https://www.cl.cam.ac.uk/~pes20/papers/x86-tso-memory-model.pdf
