DB Migration

Migrating to Azure Cosmos DB means choosing tools based on the target API, the data volume, and how much downtime is acceptable.

1. Migration Types

Type | Description | Tools
Offline | Stop the application, migrate all data, then restart the application pointing to Cosmos DB. Highest downtime. | ADF, Spark, mongoexport/mongoimport, AzCopy
Online | Initial snapshot migration followed by incremental sync of changes. Minimal downtime. | DMS, ADF with CDC, Kafka Connector
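The online pattern (snapshot first, then incremental sync) can be illustrated with a small in-memory sketch. Nothing here calls a real migration service: the `Change` record, the sequence numbers, and the dict-based "databases" are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class Change:
    """One captured source change (hypothetical shape, for illustration only)."""
    seq: int                               # position in the source change log
    op: str                                # "upsert" or "delete"
    doc_id: str
    doc: Optional[Dict[str, Any]] = None   # full document for upserts


def take_snapshot(source: Dict[str, Dict[str, Any]],
                  last_seq: int) -> Tuple[Dict[str, Dict[str, Any]], int]:
    """Copy every document and remember the log position the copy reflects."""
    return {k: dict(v) for k, v in source.items()}, last_seq


def apply_changes(target: Dict[str, Dict[str, Any]],
                  changes: List[Change], after_seq: int) -> int:
    """Replay changes newer than after_seq onto target; return the new checkpoint."""
    checkpoint = after_seq
    for change in sorted(changes, key=lambda c: c.seq):
        if change.seq <= after_seq:
            continue  # already contained in the snapshot
        if change.op == "upsert":
            target[change.doc_id] = dict(change.doc or {})
        elif change.op == "delete":
            target.pop(change.doc_id, None)
        checkpoint = change.seq
    return checkpoint


# Phase 1: snapshot taken while the application keeps running.
source = {"1": {"id": "1", "qty": 1}, "2": {"id": "2", "qty": 5}}
target, checkpoint = take_snapshot(source, last_seq=10)

# Phase 2: writes that arrived after the snapshot are replayed until cutover.
log = [
    Change(11, "upsert", "1", {"id": "1", "qty": 2}),
    Change(12, "delete", "2"),
    Change(13, "upsert", "3", {"id": "3", "qty": 7}),
]
checkpoint = apply_changes(target, log, checkpoint)
```

Downtime in this model is limited to the final cutover: the application pauses only long enough for the last few changes to be replayed, whereas an offline migration keeps it stopped for the whole copy.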

2. Core Migration Tools

Azure Data Factory (ADF)

  • Best for: Scale-out migrations, multi-source ingestion, and low-code ETL.
  • Performance: Integrates with the Bulk Executor Library for high-throughput writes.
  • Incremental Load: Supports Change Data Capture (CDC) for SQL and NoSQL sources.

Azure Database Migration Service (DMS)

  • Best for: Online migration with minimal downtime, specifically for Cosmos DB for MongoDB.
  • Process: Fully managed service; handles initial load and ongoing replication until cutover.

Azure Cosmos DB Live Data Migrator

  • Best for: NoSQL API migrations.
  • Capability: Supports live migration of data between containers or accounts.

Spark Connector (v3)

  • Best for: Very large migrations (terabytes of data) where the copy can be parallelized across a Spark cluster.
  • Mechanism: Uses the bulk ingestion support of the Azure Cosmos DB Java SDK (v4) internally to make full use of the provisioned RUs.
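As a rough sketch of what a Spark-based write might look like, the helper below builds the options for the Spark 3 OLTP connector (format name `cosmos.oltp`) with bulk writes enabled. The account, database, and container values are placeholders, the helper names are hypothetical, and `write_to_cosmos` assumes it is handed a PySpark DataFrame on a cluster where the connector library is installed.

```python
from typing import Dict


def cosmos_write_options(endpoint: str, key: str,
                         database: str, container: str) -> Dict[str, str]:
    """Build write options for the Azure Cosmos DB Spark 3 OLTP connector."""
    return {
        "spark.cosmos.accountEndpoint": endpoint,
        "spark.cosmos.accountKey": key,
        "spark.cosmos.database": database,
        "spark.cosmos.container": container,
        # Overwrite items with the same id so a failed run can be retried safely.
        "spark.cosmos.write.strategy": "ItemOverwrite",
        "spark.cosmos.write.bulk.enabled": "true",
    }


def write_to_cosmos(df, options: Dict[str, str]) -> None:
    """Append a PySpark DataFrame (which needs an `id` column) to the container."""
    df.write.format("cosmos.oltp").options(**options).mode("APPEND").save()


# Placeholder values; a real job would read the key from a secret store.
options = cosmos_write_options(
    endpoint="https://<account>.documents.azure.com:443/",
    key="<account-key>",
    database="shop",
    container="orders",
)
```

On the cluster, the call would then be `write_to_cosmos(source_df, options)`, where `source_df` is whatever DataFrame was read from the source system.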

Azure Cosmos DB Data Migration Tool

  • Best for: Smaller, one-time offline migrations from JSON, CSV, SQL Server, or MongoDB.
  • Type: Open-source desktop/command-line tool.

3. Migration Strategy (DP-420 Key Concepts)

  1. Preparation:

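    • Partition Key: Choose a partition key for the target container that has high cardinality and spreads storage and request volume evenly for the new workload's read and write patterns.
    • Data Modeling: Denormalize data (embed related entities in one document) if moving from Relational to NoSQL.
    • RU Provisioning: Temporarily scale up the RU/s on the target container during the migration to avoid HTTP 429 (request rate too large) throttling errors.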
  2. Validation:

    • Start with a subset of data to validate partition key performance and query patterns.
    • Verify document integrity and count after migration.
  3. Post-Migration:

    • Scale RU/s back down to normal operational levels.
    • Configure TTL, the indexing policy, and the consistency level as required.
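The RU provisioning and validation steps above lend themselves to a little arithmetic and a simple comparison. The sketch below is one possible approach, not an official sizing formula: the 80% utilization factor, the per-write RU cost, and all function names are assumptions. The validation helper ignores properties starting with `_` because Cosmos DB adds system properties such as `_ts` and `_etag` to stored documents; adjust that if the source data uses such names for real fields.

```python
import hashlib
import json
import math
from typing import Any, Dict, Iterable


def estimate_migration_hours(doc_count: int, ru_per_write: float,
                             provisioned_ru_per_sec: int,
                             utilization: float = 0.8) -> float:
    """Hours needed to write doc_count documents at the given throughput."""
    total_ru = doc_count * ru_per_write
    return total_ru / (provisioned_ru_per_sec * utilization) / 3600


def required_ru_per_sec(doc_count: int, ru_per_write: float,
                        target_hours: float, utilization: float = 0.8) -> int:
    """RU/s to provision to finish within target_hours, rounded up to a multiple of 100."""
    needed = doc_count * ru_per_write / (target_hours * 3600) / utilization
    return math.ceil(needed / 100) * 100


def fingerprint(doc: Dict[str, Any]) -> str:
    """Stable hash of a document, ignoring properties that start with '_'."""
    user_fields = {k: v for k, v in doc.items() if not k.startswith("_")}
    canonical = json.dumps(user_fields, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_migration(source_docs: Iterable[Dict[str, Any]],
                     target_docs: Iterable[Dict[str, Any]],
                     id_field: str = "id") -> Dict[str, Any]:
    """Compare document counts and content between source and target."""
    src = {d[id_field]: fingerprint(d) for d in source_docs}
    tgt = {d[id_field]: fingerprint(d) for d in target_docs}
    return {
        "source_count": len(src),
        "target_count": len(tgt),
        "missing": sorted(set(src) - set(tgt)),
        "mismatched": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }


# 100 million documents at an assumed 10 RU per write.
hours = round(estimate_migration_hours(100_000_000, 10, 50_000), 2)  # 6.94
ru_needed = required_ru_per_sec(100_000_000, 10, target_hours=4)     # 86900

source_docs = [{"id": "a", "v": 1}, {"id": "b", "v": 2}, {"id": "c", "v": 3}]
target_docs = [{"id": "a", "v": 1, "_ts": 1700000000}, {"id": "b", "v": 20}]
report = verify_migration(source_docs, target_docs)
```

In practice the per-write RU cost is best measured on a representative subset first, since it depends on document size and the indexing policy.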

4. Specific Scenarios

  • Migrating from MongoDB: Use DMS (Online) or native tools like mongodump/mongorestore (Offline).
  • Migrating from SQL Server: Use ADF or the Data Migration Tool.
  • Migrating Large Blobs/JSON: Use ADF, and split the source files into partitions so the copy can run in parallel.
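For the SQL Server scenario, the denormalization step usually happens somewhere in the pipeline before the write. A minimal sketch of that transformation, assuming hypothetical `customers`, `orders`, and `order_lines` tables and `customerId` as the target partition key:

```python
from collections import defaultdict
from typing import Any, Dict, List

Row = Dict[str, Any]


def build_order_documents(customers: List[Row], orders: List[Row],
                          order_lines: List[Row]) -> List[Row]:
    """Join relational rows into one self-contained document per order."""
    names = {c["customer_id"]: c["name"] for c in customers}

    # Group child rows by their foreign key so each order can embed its lines.
    lines_by_order: Dict[Any, List[Row]] = defaultdict(list)
    for line in order_lines:
        lines_by_order[line["order_id"]].append(
            {"sku": line["sku"], "qty": line["qty"]}
        )

    docs = []
    for order in orders:
        docs.append({
            "id": str(order["order_id"]),              # Cosmos DB ids are strings
            "customerId": str(order["customer_id"]),   # assumed partition key
            "customerName": names[order["customer_id"]],  # duplicated on purpose
            "orderDate": order["order_date"],
            "lines": lines_by_order.get(order["order_id"], []),
        })
    return docs


customers = [{"customer_id": 10, "name": "Contoso"}]
orders = [
    {"order_id": 1, "customer_id": 10, "order_date": "2024-01-05"},
    {"order_id": 2, "customer_id": 10, "order_date": "2024-02-11"},
]
order_lines = [
    {"order_id": 1, "sku": "A-100", "qty": 2},
    {"order_id": 1, "sku": "B-200", "qty": 1},
]
documents = build_order_documents(customers, orders, order_lines)
```

Embedding the lines and copying the customer name means an order can be read with a single point read, at the cost of having to update the copies if the customer name changes.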