How Does x86 Achieve Release-Acquire Semantics Without Explicit Fences?

Linda Hamilton
Release: 2024-12-07 14:39:13

x86 Memory Model and Release-Acquire Semantics

This follows up on an earlier question about implementing memory_order_release atomic stores with a plain x86 MOV instruction. The question here is how x86 provides release-acquire semantics without extra synchronization such as fences or locked instructions.

Single-Processor Memory Ordering

Chapter 8 of Volume 3A of Intel's Software Developer's Manual states that, in a single-processor system using write-back cacheable memory, reads and writes obey the following ordering guarantees:

  • Reads are not reordered with other reads.
  • Writes are not reordered with older reads.
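  • Writes to memory are not reordered with other writes, with a few exceptions such as non-temporal (streaming) stores and string operations.
  • Reads may be reordered with older writes to different locations, but not with older writes to the same location.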

Multi-Processor Memory Ordering

The multi-processor section of the manual does not restate how loads are ordered. Instead, it lists the following principles:

  • Each processor follows the same ordering rules as in a single-processor system.
  • Writes by a single processor are observed in the same order by all processors.
  • Writes from one processor are not ordered with respect to writes from other processors.
  • Memory ordering obeys causality (that is, it respects transitive visibility).
  • Any two stores are seen in a consistent order by processors other than those performing the stores.
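The causality rule in the list above can be illustrated with a small three-thread sketch. This is only an illustration written against the C++ memory model, not code from the manual; the names wrc_x, wrc_y and run_causality_check are made up for the example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical variables for illustration only.
std::atomic<int> wrc_x{0};
std::atomic<int> wrc_y{0};

// Thread 1 writes wrc_x. Thread 2 waits until it sees that write, then
// writes wrc_y. Thread 3 waits until it sees wrc_y, then reads wrc_x.
// Returns the value of wrc_x that thread 3 observed.
int run_causality_check() {
    wrc_x.store(0, std::memory_order_relaxed);
    wrc_y.store(0, std::memory_order_relaxed);
    int seen_x = -1;

    std::thread t1([] {
        wrc_x.store(1, std::memory_order_release);
    });
    std::thread t2([] {
        while (wrc_x.load(std::memory_order_acquire) != 1) {
            std::this_thread::yield();
        }
        wrc_y.store(1, std::memory_order_release);
    });
    std::thread t3([&seen_x] {
        while (wrc_y.load(std::memory_order_acquire) != 1) {
            std::this_thread::yield();
        }
        // Visibility is transitive: thread 3 saw thread 2's write, and
        // thread 2 had already seen thread 1's write.
        seen_x = wrc_x.load(std::memory_order_relaxed);
    });

    t1.join();
    t2.join();
    t3.join();
    assert(seen_x == 1);  // causality: the write to wrc_x must be visible
    return seen_x;
}
```

Calling run_causality_check() from main returns 1: once the third thread has observed wrc_y == 1, it cannot still read the old value of wrc_x.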

Acquisition and Release Using MOV

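The single-processor ordering rules are the key. According to the manual, each core follows the same rules in a multi-processor system when accessing cache-coherent shared memory, so reordering happens only locally within a core. For ordinary write-back memory, x86 never reorders a load with a later load or store, so a plain MOV load already behaves as an acquire. Likewise, it never reorders a store with an earlier load or store, so a plain MOV store already behaves as a release.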
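A minimal message-passing sketch shows what this means at the C++ level. The names payload, ready, producer, consumer and run_message_passing are hypothetical and chosen only for this example. On x86, compilers typically compile both the release store and the acquire load below to ordinary MOV instructions, although the exact code depends on the compiler and its flags.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names for illustration only.
int payload = 0;                  // plain, non-atomic data
std::atomic<bool> ready{false};   // flag that publishes the payload

void producer() {
    payload = 42;                                  // ordinary store
    ready.store(true, std::memory_order_release);  // release store
}

int consumer() {
    // Acquire load: spin until the flag is observed as set.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;  // the write to payload is visible here
}

// Runs one producer and one consumer and returns what the consumer read.
int run_message_passing() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;

    std::thread t1(producer);
    std::thread t2([&seen] { seen = consumer(); });
    t1.join();
    t2.join();

    assert(seen == 42);
    return seen;
}
```

Calling run_message_passing() from main returns 42. The memory_order arguments still matter even where the hardware needs no fence, because they also stop the compiler from reordering the accesses.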

Once a store leaves a core's store buffer and becomes globally visible, it is visible to all other cores at the same time, so all cores observe writes in a consistent order. This is why a local barrier such as mfence, which affects only the core that executes it, is enough to establish sequential consistency.
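The case where a barrier is still needed is a store followed by a load of a different location. The sketch below is a store-buffering test with hypothetical names (sb_x, sb_y, both_read_zero_once, count_forbidden_outcomes). With memory_order_seq_cst, the C++ memory model forbids the outcome in which both threads read 0. On x86, compilers typically implement the seq_cst store with XCHG or with MOV followed by MFENCE, which is an assumption about common toolchains rather than a guarantee.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical variables for illustration only.
std::atomic<int> sb_x{0};
std::atomic<int> sb_y{0};

// Each thread stores to one variable, then loads the other.
// Returns true if both threads read 0.
bool both_read_zero_once() {
    sb_x.store(0, std::memory_order_seq_cst);
    sb_y.store(0, std::memory_order_seq_cst);
    int r1 = -1;
    int r2 = -1;

    std::thread a([&r1] {
        sb_x.store(1, std::memory_order_seq_cst);
        r1 = sb_y.load(std::memory_order_seq_cst);
    });
    std::thread b([&r2] {
        sb_y.store(1, std::memory_order_seq_cst);
        r2 = sb_x.load(std::memory_order_seq_cst);
    });
    a.join();
    b.join();

    return r1 == 0 && r2 == 0;
}

// Repeats the test and counts how often the forbidden outcome appears.
int count_forbidden_outcomes(int iterations) {
    assert(iterations >= 0);
    int count = 0;
    for (int i = 0; i < iterations; ++i) {
        if (both_read_zero_once()) {
            ++count;
        }
    }
    return count;
}
```

With seq_cst the count is always 0. If the four accesses were weakened to release stores and acquire loads, both threads reading 0 would become a permitted outcome, because each store can wait in its core's store buffer while the following load executes.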

Coherent Shared View of Memory

The x86 architecture maintains a coherent shared view of memory through coherent caches. Intel's description of its multiprocessing mechanisms names memory coherency and cache consistency as the means by which all processors see the same data.

Additional Resources

  • https://preshing.com/20120913/acquire-and-release-semantics/
  • https://preshing.com/20120930/weak-vs-strong-memory-models/
  • https://www.cl.cam.ac.uk/~pes20/papers/x86-tso-memory-model.pdf
