MongoDB – WiredTiger Storage Engine

Overview

The storage engine is an integral part of MongoDB, as it defines how data is stored in memory and on disk. The storage engines available in MongoDB are:

In-memory

  • Keeps all data in memory instead of on disk.
  • Good for testing purposes, as all data is erased when MongoDB stops.

MMAPv1 (default up to and including v3.0)

  • MongoDB’s original storage engine, based on memory-mapped files.
  • Excels at workloads with high volume inserts, reads, and in-place updates.

WiredTiger (default from v3.2)

  • Uses document-level concurrency control for write operations.
  • Allows multiple clients to modify different documents of a collection at the same time.
  • Provides three compression options for collections:
    • none – no compression
    • Snappy – higher performance and lower compression than zlib (the default)
    • zlib – lower performance and higher compression than Snappy

One of the most compelling features of the WiredTiger storage engine is compression. In this blog, we discuss the compression techniques available in MongoDB v3.x.

Use cases

This article covers the following use cases:

  • Initialize MongoDB v3.x with WiredTiger (zlib).
  • Upgrade the database’s storage engine (MMAPv1 to WiredTiger) on existing data.

Prerequisites

  • Install MongoDB v3.0 or later.
  • Install Robomongo.
  • Stop MongoDB if it is running.

Use Case 1: Initializing MongoDB v3.x with WiredTiger (zlib)

Configuration Settings: Start MongoDB using the “mongod_v3.conf” configuration file. The link to the file is in the References section.

Configuration parameters for enabling WiredTiger with the “zlib” block compressor:

Note:

  • dbPath should be a new directory, not an existing one.
  • To use the Snappy block compressor instead, set blockCompressor: snappy.

storage.engine (“wiredTiger” or “mmapv1”)

  • MMAPv1 is the default engine in MongoDB 3.0.
  • WiredTiger is the default engine from MongoDB 3.2 onward.

storage.wiredTiger.engineConfig.cacheSizeGB

  • Sets the maximum size, in GB, of the WiredTiger cache, which holds frequently used data and index blocks. In v3.0 the default is about 50% of physical RAM; later releases use a different formula.

storage.wiredTiger.engineConfig.directoryForIndexes

  • Stores indexes and collections in separate subdirectories under dbPath, so that indexes can be placed on a separate block device. This helps DBAs size storage, plan capacity, and tune performance as needed.

storage.wiredTiger.collectionConfig.blockCompressor (zlib or Snappy)

  • zlib – lower performance and higher compression than Snappy
  • Snappy – higher performance and lower compression than zlib

storage.wiredTiger.indexConfig.prefixCompression (true or false)

  • Enables prefix compression for indexes.
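
As a rough illustration of how the parameters above fit together, the following Python sketch renders a mongod YAML configuration and prints it. The dbPath value, the cache size, and the helper name render_mongod_conf are assumptions made for this example; the actual “mongod_v3.conf” file may differ.

```python
# Template for a mongod YAML config built from the storage options above.
# The concrete values (path, cache size) are illustrative assumptions.
MONGOD_CONF_TEMPLATE = """\
storage:
  dbPath: {db_path}
  engine: wiredTiger
  wiredTiger:
    engineConfig:
      cacheSizeGB: {cache_size_gb}
      directoryForIndexes: true
    collectionConfig:
      blockCompressor: {block_compressor}
    indexConfig:
      prefixCompression: true
"""


def render_mongod_conf(db_path, block_compressor="zlib", cache_size_gb=1):
    """Return the text of a mongod config that enables WiredTiger."""
    if block_compressor not in ("none", "snappy", "zlib"):
        raise ValueError(f"unsupported block compressor: {block_compressor}")
    return MONGOD_CONF_TEMPLATE.format(
        db_path=db_path,
        cache_size_gb=cache_size_gb,
        block_compressor=block_compressor,
    )


# Hypothetical new, empty data directory.
conf_text = render_mongod_conf("/data/mongodb_v3")
print(conf_text)
```

The printed text can be saved to a file and passed to the server with mongod --config. Passing block_compressor="snappy" produces the Snappy variant mentioned in the note.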

Steps to Test MongoDB v3.x with WiredTiger Enabled

  • Start MongoDB with the new configuration.
  • Connect Robomongo to MongoDB.
  • Create a collection.
  • Check the storage engine and other configuration of the collection (collection1) using the stats command.

In the stats output, the wiredTiger section (including its block_compressor setting) confirms that WiredTiger is enabled.
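
The same check can be scripted. The sketch below assumes the stats result is available as a Python dictionary (for example, from PyMongo's db.command("collstats", "collection1")) and reads the block compressor from the wiredTiger.creationString field. The sample document is abbreviated and hypothetical, and the helper name is made up for this example.

```python
def block_compressor_from_stats(stats):
    """Extract the block compressor from a collStats-style result document.

    Returns None when the document has no wiredTiger section, which is
    what a collection stored by another engine would look like.
    """
    creation_string = stats.get("wiredTiger", {}).get("creationString", "")
    for setting in creation_string.split(","):
        key, _, value = setting.partition("=")
        if key == "block_compressor":
            return value
    return None


# Abbreviated, hypothetical stats document for collection1.
sample_stats = {
    "ns": "test.collection1",
    "wiredTiger": {
        "creationString": "allocation_size=4KB,block_compressor=zlib,checksum=on",
    },
}

compressor = block_compressor_from_stats(sample_stats)
print(compressor)
```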

Note: If collections are not created with the new storage engine, point dbPath to a new, empty directory. MongoDB keeps using the old storage engine if the dbPath directory already contains a database created by it. For example, suppose /data/mongodb_v2.9/ holds a database created with the mmapv1 storage engine. Even after changing the storage engine setting and restarting MongoDB, the change will not take effect.
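
Before pointing dbPath at a directory, it can help to check what is already in it. The following sketch is only a heuristic, based on the file names each engine typically leaves behind (WiredTiger.wt for WiredTiger, <database>.ns files for MMAPv1); the function name is hypothetical, and it does not look into subdirectories.

```python
import os
import tempfile


def guess_engine(db_path):
    """Guess which storage engine created the files in db_path.

    Heuristic only: WiredTiger keeps a WiredTiger.wt metadata file, while
    MMAPv1 writes one <database>.ns namespace file per database.
    """
    names = os.listdir(db_path)
    if not names:
        return None  # empty directory: nothing from an older engine
    if "WiredTiger.wt" in names:
        return "wiredTiger"
    if any(name.endswith(".ns") for name in names):
        return "mmapv1"
    return "unknown"


# Demonstrate on a temporary directory instead of a real dbPath.
with tempfile.TemporaryDirectory() as db_path:
    empty_result = guess_engine(db_path)
    open(os.path.join(db_path, "test.ns"), "w").close()
    mmap_result = guess_engine(db_path)

print(empty_result, mmap_result)
```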

Use Case 2: Upgrading the Database’s Storage Engine on Existing Data

To upgrade the database’s storage engine from mmapv1 to WiredTiger with any block compressor, perform the following steps:

  • Export all collections from the older server using the mongodump command.
  • Restart MongoDB with the new configuration (and a new, empty dbPath).
  • Restore all collections into the newly configured server using the mongorestore command.
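
The steps above can be sketched as a small Python script that builds the three command lines. The host, port, dump directory, and function names are assumptions for illustration; the script only prints the commands, and each one could be passed to subprocess.run once the values match your environment.

```python
import shlex


def build_dump_command(host, port, out_dir):
    """mongodump invocation that exports every database to out_dir."""
    return ["mongodump", "--host", host, "--port", str(port), "--out", out_dir]


def build_restore_command(host, port, dump_dir):
    """mongorestore invocation that loads the dump directory."""
    return ["mongorestore", "--host", host, "--port", str(port), dump_dir]


def migration_plan(host, port, dump_dir, new_conf):
    """Ordered commands: dump, start with the new config, restore.

    The old mongod must be stopped between the first and second step.
    """
    return [
        build_dump_command(host, port, dump_dir),
        ["mongod", "--config", new_conf],
        build_restore_command(host, port, dump_dir),
    ]


# Hypothetical host, port, and dump directory.
plan = migration_plan("localhost", 27017, "/backup/dump", "mongod_v3.conf")
for step in plan:
    print(" ".join(shlex.quote(arg) for arg in step))
```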

Performance/Storage Comparison

[Charts: Storage Size (before compression vs. after compression); Collection Size; Insertion Time Taken; Query Performance on Index Field]

Note: The above stats are based on our hands-on experience implementing the compression techniques.

MongoDump/Restore Compression

MongoDB v3.2 introduced the --gzip option to compress dump output. When mongodump writes to a dump directory, --gzip compresses the individual files, which then carry the .gz suffix.

Compression Results

Note: A compressed dump takes longer (about 1.5x) than a normal dump operation.
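
To get a feel for what per-file gzip compression does, the sketch below writes the same data once as a plain file and once through Python's gzip module, then compares sizes. It uses made-up JSON records as a stand-in for a dumped collection and does not call mongodump itself, so the sizes are only indicative.

```python
import gzip
import json
import os
import tempfile

# Hypothetical stand-in for a dumped collection: repetitive JSON records.
records = [{"_id": i, "status": "active", "city": "Chennai"} for i in range(5000)]
payload = "\n".join(json.dumps(r) for r in records).encode("utf-8")

with tempfile.TemporaryDirectory() as dump_dir:
    plain_path = os.path.join(dump_dir, "collection1.json")
    gz_path = plain_path + ".gz"  # compressed files carry the .gz suffix

    with open(plain_path, "wb") as f:
        f.write(payload)
    with gzip.open(gz_path, "wb") as f:
        f.write(payload)

    plain_size = os.path.getsize(plain_path)
    gz_size = os.path.getsize(gz_path)

    # Reading the .gz file back yields the original bytes.
    with gzip.open(gz_path, "rb") as f:
        restored = f.read()

print(f"plain: {plain_size} bytes, gzip: {gz_size} bytes")
```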

Conclusion

With the WiredTiger integration, MongoDB has moved to a higher level by:

  • Reducing storage issues.
  • Providing higher compression.
  • Reducing the impact on query performance through options such as index creation and a few configuration parameters (directoryPerDB, directoryForIndexes, cacheSizeGB).

Note: The more aggressive the compression, the longer it takes to decompress the data for reads/updates.

References

mongod_v3.conf file GitHub location:

Performance Comparison:

WiredTiger:
