Understanding MongoDB Storage Engines: WiredTiger Deep Dive
WiredTiger has been MongoDB’s default storage engine since version 3.2, providing high performance, scalability, and modern features. In brief:
1. It uses document-level locking and MVCC for high concurrency, allowing reads and writes to proceed without blocking each other.
2. Data is stored in B-trees, with in-memory modifications flushed to disk during periodic checkpoints (every 60 seconds by default), enabling fast crash recovery by replaying only post-checkpoint log entries.
3. MVCC provides consistent snapshots, so long-running queries aren’t blocked by writes and vice versa.
4. Built-in compression (Snappy, Zlib, Zstd) reduces storage and I/O for both collections and indexes, improving cache efficiency.
5. WiredTiger manages its own cache (by default, 50% of RAM minus 1 GB), which stores data, indexes, and metadata—proper sizing is critical to avoid excessive disk I/O.
6. The write-ahead log (WAL) guarantees durability by logging writes before applying them, with fsyncs every 100 ms by default; j: true forces a journal sync for safety, while j: false trades durability for speed.
7. Alternatives (such as the Enterprise-only in-memory engine) may be worth considering for very low-memory systems or specialized workloads.
MongoDB’s performance and scalability heavily depend on its storage engine, and WiredTiger is the default since MongoDB 3.2. If you're using a modern version of MongoDB, chances are you're already using WiredTiger—so understanding how it works under the hood can help you optimize performance, manage resources, and troubleshoot issues.

Let’s break down what makes WiredTiger powerful and how it handles data behind the scenes.
What Is WiredTiger and Why It Matters
WiredTiger is a high-performance, scalable storage engine designed for workloads with high concurrency and large datasets. Unlike the older MMAPv1 engine, WiredTiger brings modern database features like document-level concurrency control, compression, and crash-safe writes.

Key advantages:
- Document-level locking (vs. collection-level in MMAPv1) → higher concurrency
- Snapshots and MVCC (Multi-Version Concurrency Control) → consistent reads without blocking writes
- Configurable compression → reduced storage footprint
- Write-ahead logging (WAL) → durability and crash recovery
It's built to scale efficiently on modern hardware, especially systems with multiple cores and SSDs.

How WiredTiger Stores Data: B-Trees and Checkpoints
At its core, WiredTiger uses a B-tree (balanced tree) structure to organize data on disk and in memory.
In-Memory vs. On-Disk Structure
- When you insert or update a document, it first goes into the in-memory B-tree.
- Changes are tracked in the WiredTiger log (WAL) for durability.
- Periodically, WiredTiger creates a checkpoint—a consistent snapshot of the data written to disk.
This means:
- Reads can access either the in-memory tree (fast) or the last checkpoint (on disk).
- Writes are non-blocking thanks to MVCC and checkpoints.
Checkpointing Explained
Checkpoints are crucial for recovery and performance:
- They occur every 60 seconds by default (configurable).
- During a checkpoint, dirty pages (modified in memory) are flushed to disk in a consistent state.
- After a crash, MongoDB replays only the log entries after the last checkpoint.
This reduces recovery time significantly compared to replaying the entire oplog or journal.
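The checkpoint-plus-log recovery model above can be sketched as a toy simulation in plain Python. This is purely illustrative — real WiredTiger uses B-trees, binary log records, and actual fsyncs — but the shape of recovery is the same: load the last checkpoint, then replay only the log entries written after it.

```python
# Toy model of checkpoint + write-ahead-log recovery (illustrative only;
# not WiredTiger's actual on-disk format).

class ToyEngine:
    def __init__(self):
        self.memory = {}           # in-memory "B-tree" (lost on crash)
        self.log = []              # write-ahead log (survives crashes)
        self.checkpoint = ({}, 0)  # (snapshot of data, log position)

    def write(self, key, value):
        self.log.append((key, value))   # log first, for durability
        self.memory[key] = value        # then apply in memory

    def take_checkpoint(self):
        # Flush a consistent snapshot and remember the log position.
        self.checkpoint = (dict(self.memory), len(self.log))

    def recover(self):
        # After a crash: load the last checkpoint, then replay only
        # the log entries written after it.
        data, pos = self.checkpoint
        recovered = dict(data)
        for key, value in self.log[pos:]:
            recovered[key] = value
        return recovered

engine = ToyEngine()
engine.write("a", 1)
engine.take_checkpoint()     # "a" is now durable in the checkpoint
engine.write("b", 2)         # only in memory and the log
engine.memory.clear()        # simulate a crash losing the in-memory state
print(engine.recover())      # {'a': 1, 'b': 2} -- replayed 1 entry, not 2
```

Note that recovery replayed only the single post-checkpoint entry; everything before the checkpoint came straight from the snapshot.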
Concurrency and Isolation: MVCC in Action
WiredTiger uses MVCC (Multi-Version Concurrency Control) to allow concurrent reads and writes without locks blocking each other.
Here’s how it works:
- Each operation runs in a snapshot of the database at a point in time.
- Readers see a consistent view, even if writes are happening.
- Writers don’t block readers, and vice versa (as long as they’re not modifying the same document).
For example:
- A long-running analytics query won’t be blocked by incoming inserts.
- Two updates to different documents can proceed in parallel.
This is a big win for applications requiring high throughput and low latency.
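The snapshot behavior can be illustrated with a minimal multi-version store in Python — a conceptual sketch of MVCC, not WiredTiger's implementation: every write creates a new timestamped version, and a reader pinned to a snapshot timestamp keeps seeing the values that were current when it started.

```python
# Minimal multi-version store: each key maps to a list of
# (timestamp, value) versions; a reader pinned to a snapshot
# timestamp only sees versions committed at or before it.

class MVCCStore:
    def __init__(self):
        self.versions = {}   # key -> list of (ts, value), ts ascending
        self.clock = 0

    def write(self, key, value):
        self.clock += 1      # each write gets a new timestamp
        self.versions.setdefault(key, []).append((self.clock, value))

    def snapshot(self):
        return self.clock    # a reader "pins" the current timestamp

    def read(self, key, snapshot_ts):
        # Newest version visible at this snapshot: ts <= snapshot_ts.
        for ts, value in reversed(self.versions.get(key, [])):
            if ts <= snapshot_ts:
                return value
        return None

store = MVCCStore()
store.write("counter", 1)
snap = store.snapshot()      # a long-running reader starts here
store.write("counter", 2)    # a concurrent writer is not blocked
print(store.read("counter", snap))              # 1 (consistent view)
print(store.read("counter", store.snapshot()))  # 2 (new reader sees write)
```

The writer never waited on the reader, and the reader's view never changed mid-query — which is exactly the property that keeps analytics queries and inserts out of each other's way.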
Storage Efficiency: Compression and Indexing
One of WiredTiger’s standout features is built-in compression, which saves disk space and reduces I/O.
Data Compression Options
WiredTiger supports:
- Snappy (default): Fast, moderate compression
- Zlib: Higher compression, slower
- Zstd (available from MongoDB 4.2): Balanced speed and compression ratio
Compression is applied to:
- Collections (data)
- Indexes
You can configure compression per collection:
db.createCollection("logs", {
  storageEngine: {
    wiredTiger: { configString: "block_compressor=zstd" }
  }
})
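To get a feel for the speed-vs-ratio trade-off that the block_compressor choice controls, here is a rough stdlib-only Python comparison. Snappy and Zstd themselves require third-party packages, so zlib at two compression levels stands in for "fast" vs. "tight"; the exact numbers depend entirely on your data.

```python
# Illustrates the speed-vs-ratio trade-off using stdlib zlib
# (a stand-in: Snappy/Zstd need third-party Python packages).
import json
import time
import zlib

# Repetitive JSON documents compress very well, like many real collections.
docs = json.dumps([{"level": "info", "msg": "request ok", "code": 200}
                   for _ in range(5000)]).encode()

for level in (1, 9):   # 1 = fastest, 9 = best ratio
    start = time.perf_counter()
    compressed = zlib.compress(docs, level)
    elapsed = time.perf_counter() - start
    print(f"level {level}: {len(docs)} -> {len(compressed)} bytes "
          f"in {elapsed * 1000:.2f} ms")
```

Run it on a sample of your own documents before committing to Zlib or Zstd for a hot collection — CPU spent compressing is CPU not spent serving queries.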
Index Behavior
- All indexes use the same storage engine.
- Index keys are also compressed.
- Secondary indexes are updated incrementally and benefit from the same concurrency model.
💡 Tip: Compressed indexes reduce memory pressure in the WiredTiger cache, allowing more hot data to stay in RAM.
WiredTiger Cache: Managing Memory Usage
WiredTiger manages its own in-memory cache, separate from the filesystem cache.
By default:
- The cache size defaults to the larger of 50% of (RAM minus 1 GB) or 256 MB.
- It holds:
- Most recently used data (working set)
- Indexes
- Internal metadata
You can adjust it in mongod.conf:

storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 4
⚠️ Warning: Don’t set it too high. Leave room for the OS filesystem cache and other processes.
If your working set exceeds the cache size, you’ll see more disk I/O and slower queries.
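Why an undersized cache hurts can be seen with a toy LRU cache in Python: hit rate stays high while the working set fits in the cache, then collapses once it doesn't — and in WiredTiger terms, every one of those misses becomes a disk read.

```python
# Toy LRU cache: hit rate stays high while the working set fits,
# and collapses once it exceeds capacity (forcing "disk reads").
from collections import OrderedDict

def hit_rate(cache_size, working_set, passes=3):
    cache, hits, total = OrderedDict(), 0, 0
    for _ in range(passes):
        for page in range(working_set):   # cyclic scan over the working set
            total += 1
            if page in cache:
                hits += 1
                cache.move_to_end(page)   # mark as most recently used
            else:
                cache[page] = True
                if len(cache) > cache_size:
                    cache.popitem(last=False)   # evict least recently used
    return hits / total

print(hit_rate(cache_size=100, working_set=80))   # fits: ~0.67 (only cold misses)
print(hit_rate(cache_size=100, working_set=150))  # too big: 0.0 under cyclic access
```

The cliff is sharp: going from "working set fits" to "working set is 1.5x the cache" takes this access pattern from mostly hits to zero hits, which is why cache sizing matters more than almost any other WiredTiger knob.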
Logging and Durability: Write-Ahead Log (WAL)
WiredTiger uses a Write-Ahead Log (WAL) to ensure durability:
- Every write is first written to the log.
- Log records are fsynced to disk every 100 ms by default (configurable via storage.journal.commitIntervalMs).
- This guarantees that writes acknowledged with j: true survive a crash; otherwise, at most the last unsynced ~100 ms of writes can be lost.
The log is stored under journal/ in the dbPath.
While checkpoints write data to disk periodically, the WAL ensures granular durability between checkpoints.
You can tune durability vs. performance:
- Set j: true on writes for guaranteed journal sync.
- Or use j: false for higher throughput (but risk minor data loss on crash).
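The j: true vs. j: false trade-off boils down to how often the journal is fsynced. A toy Python model (not MongoDB's actual journaling code) makes both the cost and the risk visible: per-write syncs are durable but expensive, while batched syncs are cheap but leave an unsynced tail a crash can lose.

```python
# Toy journal: j:true fsyncs on every write; j:false lets writes
# accumulate and fsyncs on a timer, so a crash can lose the tail.

class ToyJournal:
    def __init__(self, sync_every_write):
        self.sync_every_write = sync_every_write
        self.pending = []      # buffered, not yet durable
        self.durable = []      # "on disk"
        self.fsync_calls = 0

    def write(self, entry):
        self.pending.append(entry)
        if self.sync_every_write:      # j: true -> sync immediately
            self.fsync()

    def fsync(self):                   # j: false -> called every ~100 ms
        self.durable.extend(self.pending)
        self.pending.clear()
        self.fsync_calls += 1

safe = ToyJournal(sync_every_write=True)
fast = ToyJournal(sync_every_write=False)
for i in range(10):
    safe.write(i)
    fast.write(i)
fast.fsync()   # one timer-driven flush after the burst

print(safe.fsync_calls, len(safe.durable))  # 10 10: every write durable
print(fast.fsync_calls, len(fast.durable))  # 1 10: one fsync for the batch...
# ...but a crash before the timer fired would have lost all 10 writes.
```

Since fsyncs are among the most expensive operations a database performs, batching them is where the j: false throughput win comes from — you are trading that 100 ms window of risk for fewer disk round-trips.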
When to Consider Alternatives?
While WiredTiger is excellent for most use cases, it’s not always the best fit.
Avoid or reconsider if:
- You're on very low-memory systems.
- You need encryption at rest with older MongoDB versions: only the Enterprise edition offers native encryption (though filesystem-level encryption works).
- You’re migrating from MMAPv1 and have specific performance expectations (benchmark first).
There’s also an in-memory storage engine (Enterprise only), which keeps all data in RAM for ultra-low latency—useful for caching or real-time analytics.
Final Thoughts
WiredTiger is a robust, modern storage engine that powers most MongoDB deployments today. Its combination of concurrency, compression, and crash safety makes it ideal for everything from small apps to large-scale distributed systems.
To get the most out of it:
- Monitor cache usage (db.serverStatus().wiredTiger.cache)
- Tune compression based on your I/O vs. CPU trade-off
- Understand checkpoint and journal behavior for disaster recovery
- Size your RAM appropriately for cache and system needs
It’s not magic—but with the right understanding, it’s a powerful tool in your MongoDB toolkit.
Basically, if you're using MongoDB post-3.2, you're likely already benefiting from WiredTiger. Now you know how.