Table of Contents
What Is WiredTiger and Why It Matters
How WiredTiger Stores Data: B-Trees and Checkpoints
In-Memory vs. On-Disk Structure
Checkpointing Explained
Concurrency and Isolation: MVCC in Action
Storage Efficiency: Compression and Indexing
Data Compression Options
Index Behavior
WiredTiger Cache: Managing Memory Usage
Logging and Durability: Write-Ahead Log (WAL)
When to Consider Alternatives?
Final Thoughts

Understanding MongoDB Storage Engines: WiredTiger Deep Dive

Aug 04, 2025
mongodb

WiredTiger has been MongoDB’s default storage engine since version 3.2, providing high performance, scalability, and modern features. 1. It uses document-level locking and MVCC for high concurrency, allowing reads and writes to proceed without blocking each other. 2. Data is stored in B-trees, with in-memory modifications flushed to disk during periodic checkpoints (every 60 seconds by default), enabling fast crash recovery by replaying only post-checkpoint log entries. 3. MVCC provides consistent snapshots, so long-running queries aren’t blocked by writes and vice versa. 4. Built-in compression (Snappy, Zlib, Zstd) reduces storage and I/O for both collections and indexes, improving cache efficiency. 5. WiredTiger manages its own cache (default: the larger of 50% of RAM minus 1 GB, or 256 MB), which stores data, indexes, and metadata; proper sizing is critical to avoid excessive disk I/O. 6. The write-ahead log (WAL, the journal) guarantees durability by logging writes before applying them and syncing to disk every 100 ms by default; j: true ensures journal sync for safety, while j: false trades durability for speed. 7. Alternatives such as the Enterprise-only in-memory engine may be considered for very low-memory systems or specialized latency requirements.

MongoDB’s performance and scalability heavily depend on its storage engine, and WiredTiger is the default since MongoDB 3.2. If you're using a modern version of MongoDB, chances are you're already using WiredTiger—so understanding how it works under the hood can help you optimize performance, manage resources, and troubleshoot issues.

Let’s break down what makes WiredTiger powerful and how it handles data behind the scenes.


What Is WiredTiger and Why It Matters

WiredTiger is a high-performance, scalable storage engine designed for workloads with high concurrency and large datasets. Unlike the older MMAPv1 engine, WiredTiger brings modern database features like document-level concurrency control, compression, and crash-safe writes.

Key advantages:

  • Document-level locking (vs. collection-level in MMAPv1) → higher concurrency
  • Snapshots and MVCC (Multi-Version Concurrency Control) → consistent reads without blocking writes
  • Configurable compression → reduced storage footprint
  • Write-ahead logging (WAL) → durability and crash recovery

It's built to scale efficiently on modern hardware, especially systems with multiple cores and SSDs.
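
Not sure which engine your deployment is actually running? Here is a quick, illustrative mongosh check using only the standard serverStatus command:

// Confirm the storage engine of the connected mongod
const engine = db.serverStatus().storageEngine;
print("storage engine:", engine.name);      // "wiredTiger" on default deployments since MongoDB 3.2
print("persists to disk:", engine.persistent); // false only for the Enterprise in-memory engine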

How WiredTiger Stores Data: B-Trees and Checkpoints

At its core, WiredTiger uses a B-tree (balanced tree) structure to organize data on disk and in memory.

In-Memory vs. On-Disk Structure

  • When you insert or update a document, it first goes into the in-memory B-tree.
  • Changes are tracked in the WiredTiger log (WAL) for durability.
  • Periodically, WiredTiger creates a checkpoint—a consistent snapshot of the data written to disk.

This means:

  • Reads can access either the in-memory tree (fast) or the last checkpoint (on disk).
  • Writes are non-blocking thanks to MVCC and checkpoints.

Checkpointing Explained

Checkpoints are crucial for recovery and performance:

  • They occur every 60 seconds by default (configurable).
  • During a checkpoint, dirty pages (modified in memory) are flushed to disk in a consistent state.
  • After a crash, MongoDB replays only the log entries after the last checkpoint.

This reduces recovery time significantly compared to replaying the entire oplog or journal.
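
As a rough, version-dependent sketch, you can watch checkpoint activity through serverStatus(); the exact counter names under wiredTiger.transaction vary between MongoDB releases, so treat these as examples rather than a stable API:

// Peek at checkpoint counters reported by serverStatus() (field names vary by version)
const wt = db.serverStatus().wiredTiger;
const txn = wt.transaction || {};  // checkpoint counters live here in most 4.x/5.x builds
print("checkpoints completed:      ", txn["transaction checkpoints"]);
print("most recent checkpoint (ms):", txn["transaction checkpoint most recent time (msecs)"]);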


Concurrency and Isolation: MVCC in Action

WiredTiger uses MVCC (Multi-Version Concurrency Control) to allow concurrent reads and writes without locks blocking each other.

Here’s how it works:

  • Each operation runs in a snapshot of the database at a point in time.
  • Readers see a consistent view, even if writes are happening.
  • Writers don’t block readers, and vice versa (as long as they’re not modifying the same document).

For example:

  • A long-running analytics query won’t be blocked by incoming inserts.
  • Two updates to different documents can proceed in parallel.

This is a big win for applications requiring high throughput and low latency.
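
To make the snapshot behavior concrete, here is a minimal mongosh sketch (assuming a replica set, MongoDB 4.0+, and a hypothetical shop.orders collection): reads inside a snapshot-isolated transaction keep seeing the same data even while other clients insert concurrently.

// Snapshot-isolated reads: both finds return the same point-in-time result set
const session = db.getMongo().startSession();
session.startTransaction({ readConcern: { level: "snapshot" } });

const orders = session.getDatabase("shop").orders;            // hypothetical database/collection
const firstRead = orders.find({ status: "open" }).toArray();  // point-in-time view
// ...inserts from other connections happen here but stay invisible to this snapshot...
const secondRead = orders.find({ status: "open" }).toArray(); // same result set as firstRead

session.commitTransaction();
session.endSession();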


Storage Efficiency: Compression and Indexing

One of WiredTiger’s standout features is built-in compression, which saves disk space and reduces I/O.

Data Compression Options

WiredTiger supports:

  • Snappy (default): Fast, moderate compression
  • Zlib: Higher compression, slower
  • Zstd (from MongoDB 4.2+): Balanced speed and compression ratio

Compression is applied to:

  • Collections (data)
  • Indexes

You can configure compression per collection:

db.createCollection("logs", {
  storageEngine: {
    wiredTiger: {
      configString: "block_compressor=zstd"
    }
  }
})

Index Behavior

  • All indexes are stored by WiredTiger alongside collection data.
  • Index keys use prefix compression by default, shrinking index size on disk and in cache.
  • Secondary indexes are updated incrementally and benefit from the same concurrency model.

💡 Tip: Compressed indexes reduce memory pressure in the WiredTiger cache, allowing more hot data to stay in RAM.
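
One rough way to see what compression is buying you (assuming the logs collection created above) is to compare the logical data size with the on-disk size via collection stats:

// Compare uncompressed data size with compressed on-disk size
const s = db.logs.stats();
print("logical data size (bytes):", s.size);           // uncompressed BSON size
print("on-disk storage (bytes):  ", s.storageSize);    // compressed size on disk
print("total index size (bytes): ", s.totalIndexSize);

If storageSize is much smaller than size, block compression is pulling its weight; the gap depends on how compressible your documents are.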


WiredTiger Cache: Managing Memory Usage

WiredTiger manages its own in-memory cache, separate from the filesystem cache.

By default:

  • The cache size defaults to the larger of 50% of (RAM - 1 GB) or 256 MB.
  • It holds:
    • Most recently used data (working set)
    • Indexes
    • Internal metadata

You can adjust it in mongod.conf with the cacheSizeGB setting:

storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 4

⚠️ Warning: Don’t set it too high. Leave room for the OS filesystem cache and other processes.

If your working set exceeds the cache size, you’ll see more disk I/O and slower queries.
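
A quick, illustrative way to check cache pressure from mongosh (stat names may differ slightly between versions):

// Inspect WiredTiger cache usage from serverStatus()
const cache = db.serverStatus().wiredTiger.cache;
const used = cache["bytes currently in the cache"];
const max  = cache["maximum bytes configured"];
print("cache usage:", ((used / max) * 100).toFixed(1) + "% of", max, "bytes");
print("pages read into cache:", cache["pages read into cache"]); // climbs quickly when the working set spills to disk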


Logging and Durability: Write-Ahead Log (WAL)

WiredTiger uses a Write-Ahead Log (WAL) to ensure durability:

  • Every write is recorded in the journal before it is applied to the data files.
  • The journal is synced to disk every 100ms by default (storage.journal.commitIntervalMs).
  • As a result, journaled writes survive a crash; at most, unjournaled writes from the last commit interval can be lost.

The log is stored in dbPath under journal/.

While checkpoints write data to disk periodically, the WAL ensures granular durability between checkpoints.

You can tune durability vs. performance:

  • Set j: true on writes for guaranteed journal sync.
  • Or use j: false for higher throughput (but risk of minor data loss on crash).
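
For example (the events and metrics collections here are just placeholders), the same kind of insert can be issued with either setting through the write concern option:

// Durable insert: acknowledged only after the journal entry is on disk
db.events.insertOne(
  { type: "audit", at: new Date() },
  { writeConcern: { w: 1, j: true } }
);

// Faster insert: a crash within the ~100 ms sync window can lose it
db.metrics.insertOne(
  { cpu: 0.42, at: new Date() },
  { writeConcern: { w: 1, j: false } }
);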

When to Consider Alternatives?

While WiredTiger is excellent for most use cases, it’s not always the best fit.

Avoid or reconsider if:

  • You're on very low-memory systems, where even a minimal WiredTiger cache leaves little headroom for the OS and other processes.
  • You need encryption at rest with older MongoDB versions: only the Enterprise edition offers native encryption (though filesystem-level encryption works).
  • You’re migrating from MMAPv1 and have specific performance expectations (benchmark first).

There’s also an in-memory storage engine (Enterprise only), which keeps all data in RAM for ultra-low latency, useful for caching or real-time analytics.


Final Thoughts

WiredTiger is a robust, modern storage engine that powers most MongoDB deployments today. Its combination of concurrency, compression, and crash safety makes it ideal for everything from small apps to large-scale distributed systems.

To get the most out of it:

  • Monitor cache usage (db.serverStatus().wiredTiger.cache)
  • Tune compression based on your I/O vs. CPU trade-off
  • Understand checkpoint and journal behavior for disaster recovery
  • Size your RAM appropriately for cache and system needs

It’s not magic—but with the right understanding, it’s a powerful tool in your MongoDB toolkit.

Basically, if you're using MongoDB post-3.2, you're likely already benefiting from WiredTiger. Now you know how.
