Overview
Azure Cosmos DB was originally called DocumentDB and was renamed for marketing reasons. Some roles and APIs still use the DocumentDB name.
Database Types
There are 5 database APIs in Azure Cosmos DB:
- SQL (Core) API, now called API for NoSQL (the exam primarily focuses on this)
- MongoDB API
- Cassandra API
- Gremlin API
- Table API
Hierarchy
An "Account" is the primary unit of global distribution and availability.
Account (replication, consistency)
-> Database (throughput)
-> Container (all throughput)
-> Item
Scope of Control
| Scope of Control | Replication (Regions) | Consistency Level (Default) | Consistency Level (Override) |
|---|---|---|---|
| Account Level | YES (Add/Remove Regions, Single/Multi-write setting) | YES (Sets the default for everything) | NO (Only for individual requests) |
| DB / Container Level | NO | NO | NO |
| Request Level | NO | NO | YES (Can relax the account default for reads) |
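
The request-level override rule in the table can be sketched as a small check. This is only an illustration in plain Python, not SDK code: the function name `can_override` and the `LEVELS` list are hypothetical, and the sketch assumes a request may keep or weaken the account default but never strengthen it.

```python
# Consistency levels ordered from strongest to weakest.
LEVELS = ["Strong", "BoundedStaleness", "Session", "ConsistentPrefix", "Eventual"]


def can_override(account_default: str, requested: str) -> bool:
    """Return True if a request may use `requested` under `account_default`.

    Assumption: a request can only keep or relax the account default.
    """
    return LEVELS.index(requested) >= LEVELS.index(account_default)


print(can_override("Session", "Eventual"))  # True: weaker than the default
print(can_override("Session", "Strong"))    # False: stronger than the default
```
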
Capacity Mode
Serverless
- Serverless compute integration
- Max 1 TB of storage per container.
- No geo-replication (single region only).
- Writes are slower: about 30 ms versus 10 ms on Provisioned. Reads are about 10 ms on both Provisioned and Serverless.
- Billing is pay-per-use (RUs consumed) rather than per hour.
- Periodic backup is not available, only continuous backup.
- Max 5,000 RU/s per physical partition in a container.
Provisioned throughput
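- Unlimited storage.
- Split into:
  - Manual
    - much cheaper: about $0.008 per hour per 100 RU/s (single-region list price)
  - Autoscale
    - more expensive: about $0.012 per hour per 100 RU/s (1.5x the manual rate)
    - each hour is billed at the highest RU/s reached in that hour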
Switching Capacity Mode
You can switch a Cosmos DB account from Serverless to Provisioned Throughput (either Manual or Autoscale). You cannot switch a Cosmos DB account back from Provisioned Throughput to Serverless. This is an irreversible operation.
Differences between vCore/RU and Managed Instance
Just skim this section; the exam focuses more on RU. The main difference between RU and Managed Instance/vCore is replication:
| Feature | vCore-based Model (Dedicated Instances) | RU-based Model (Shared Throughput) |
|---|---|---|
| In-Region HA | Manual/Synchronous. You must explicitly enable High Availability (HA) for production clusters. When enabled, each shard maintains a hot-standby replica in a different Availability Zone (AZ). Replication between primary and standby is synchronous, guaranteeing zero data loss on failover. | Automatic/Synchronous (Behind the scenes). Data is automatically replicated across fault domains and upgrade domains within a region. The service manages this process transparently to provide high availability. |
| Cross-Region Replication | Active-Passive (Asynchronous). You create a read-only replica cluster in a different region. Data is replicated asynchronously from the primary cluster to the replica. You must manually promote the replica to become the new primary in the event of a regional disaster (Disaster Recovery). | Active-Active (Synchronous/Asynchronous). You can configure multi-region writes (active-active) where every region can accept writes. The internal replication is handled by the platform, offering a choice of five consistency levels (Strong, Bounded Staleness, Session, Consistent Prefix, Eventual). |
| Sharding | Supports sharding to scale the cluster horizontally with a vCore-based architecture familiar to native MongoDB users. | Supports automatic, server-side sharding for "limitless" horizontal scalability, with the service managing shard creation and balancing. |
| Disaster Recovery | Requires manual promotion of a cross-region replica. As replication is asynchronous, there is a possibility of minimal data loss if the primary region fails before the last few writes are replicated. | Offers automatic regional failover and recovery with no downtime for multi-region write accounts, depending on the chosen consistency level. |
| Consistency Control | Adheres to the consistency models of the underlying MongoDB architecture (typically Strong Consistency within a replica set/shard, but Eventual Consistency for cross-region reads). | Offers five tunable consistency levels (Strong, Bounded Staleness, Session, Consistent Prefix, and Eventual) to balance performance and data fidelity. |
VCore
vCore is offered only for MongoDB and PostgreSQL. PostgreSQL is vCore only (no RU), with a fixed vCPU/RAM allocation.
| Feature | vCore-based Model | RU-based Model (Provisioned/Autoscale/Serverless) |
|---|---|---|
| Primary Resource | Dedicated compute resources (vCPUs, RAM, and Storage). | Throughput expressed in Request Units (RUs) per second. |
| Billing | Consistent flat fee based on provisioned vCores/nodes and storage. | Based on RUs: either a fixed amount (Provisioned) or RUs consumed (Serverless). |
| Scaling | Vertical and horizontal scaling of cluster tiers (vCPUs/RAM/Storage). | Limitless horizontal scaling based on RUs and data size. |
| Workload Focus | Predictable performance, complex queries, and lift-and-shift migrations. | Cloud-native apps, high-volume point reads, and instantaneous scaling. |
| Availability | Typically offers 99.995% SLA. | Offers an industry-leading 99.999% SLA for mission-critical apps. |
Managed Instance
Cassandra has only two options: Managed Instance (runs on virtual machines, so you need to choose a SKU) and RU (you choose throughput).
| Feature | Azure Managed Instance for Apache Cassandra | Azure Cosmos DB for Apache Cassandra (RU/s) |
|---|---|---|
| Underlying Platform | Pure open-source Apache Cassandra. You can use native features. | Azure Cosmos DB's cloud-native engine, with a Cassandra-compatible API layer. |
| Pricing Model | Instance-based (billed by the number and size of Virtual Machines/nodes). | Throughput-based (billed by Request Units per second or RU/s). |
| Compatibility | 100% feature-compatible with native Apache Cassandra. Best for lift-and-shift. | High wire-protocol compatibility, but behaves differently (e.g., no compaction). |
| Scaling | Scale by adding or removing VM nodes; scaling is managed but involves VM-level operations. | Scales elastically and instantly by increasing or decreasing the provisioned RU/s. |
Concept of VCore / Managed Instance
| Concept | Explanation |
|---|---|
| A Managed Instance is defined by vCores. | When you provision a Managed Instance, you must select the vCore purchasing model. You then choose the number of vCores (e.g., 8 vCores, 16 vCores) to power that instance. |
| vCores are the engine; the Managed Instance is the car. | The Managed Instance provides the feature set (the "SQL Server experience"), and the vCores provide the performance (the CPU and RAM). |
Shared Throughput
After the first 25 containers, you can add more containers to the database only if they're provisioned with dedicated throughput, which is separate from the shared throughput of the database.

| Parameter | Standard (manual) throughput on a database | Standard (manual) throughput on a container | Autoscale throughput on a database | Autoscale throughput on a container |
|---|---|---|---|---|
| Entry point (minimum RU/s) | 400 RU/s. Can have up to 25 containers with no RU/s minimum per container. | 400 | Autoscale between 100 - 1000 RU/s. Can have up to 25 containers with no RU/s minimum per container. | Autoscale between 100 - 1000 RU/s. |
| Minimum RU/s per container | -- | 400 | -- | Autoscale between 100 - 1000 RU/s |
| Maximum RUs | Unlimited, on the database. | Unlimited, on the container. | Unlimited, on the database. | Unlimited, on the container. |
| RUs assigned or available to a specific container | No guarantees. RUs assigned to a given container depend on the properties. Properties can be the choice of partition keys of containers that share the throughput, the distribution of the workload, and the number of containers. | All the RUs configured on the container are exclusively reserved for the container. | No guarantees. RUs assigned to a given container depend on the properties. Properties can be the choice of partition keys of containers that share the throughput, the distribution of the workload, and the number of containers. | All the RUs configured on the container are exclusively reserved for the container. |
| Maximum storage for a container | Unlimited | Unlimited | Unlimited | Unlimited |
| Maximum throughput per logical partition of a container | 10K RU/s | 10K RU/s | 10K RU/s | 10K RU/s |
| Maximum storage (data + index) per logical partition of a container | 20 GB | 20 GB | 20 GB | 20 GB |
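
The 25-container rule for shared throughput can be written as a quick check. This is a minimal sketch in plain Python. The function name `can_add_container` is hypothetical, and the sketch ignores the separate account-wide limit on databases plus containers.

```python
MAX_SHARED_CONTAINERS = 25  # shared-throughput limit per database, from the notes above


def can_add_container(existing_shared: int, dedicated: bool) -> bool:
    """Return True if one more container can be added to the database.

    A container with dedicated throughput does not count against the
    shared-throughput limit; a shared one needs a free slot.
    """
    if dedicated:
        return True
    return existing_shared < MAX_SHARED_CONTAINERS


print(can_add_container(24, dedicated=False))  # True: this becomes the 25th shared container
print(can_add_container(25, dedicated=False))  # False: must use dedicated throughput
print(can_add_container(25, dedicated=True))   # True
```
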
FREE Tier
Yes, there is a lifetime Free Tier.
- You must opt in: You must explicitly choose the Free Tier discount when you create the Azure Cosmos DB account.
- One per subscription: You are limited to one Free Tier account per Azure subscription.
- Duration: The free tier lasts for the lifetime of that specific Cosmos DB account. It does not expire after 12 months like the general Azure Free Account.
| Resource | Free Tier Limit | Cost Implication |
|---|---|---|
| Throughput (RU/s) | The first 1,000 RU/s provisioned. | FREE |
| Storage | The first 25 GB of data storage. | FREE |
Only the first 1,000 RU/s and 25 GB are free; anything beyond that is charged at the normal rate.
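
As a rough illustration of how the discount applies, the sketch below subtracts the free allowance from what an account provisions and stores. The function name `billable_usage` is hypothetical, and no prices are included, only the billable quantities.

```python
from typing import Tuple

FREE_RUS = 1000  # free RU/s per Free Tier account
FREE_GB = 25     # free storage in GB per Free Tier account


def billable_usage(provisioned_rus: int, stored_gb: float) -> Tuple[int, float]:
    """Return (billable RU/s, billable GB) after the Free Tier discount."""
    billable_rus = max(0, provisioned_rus - FREE_RUS)
    billable_gb = max(0.0, float(stored_gb) - FREE_GB)
    return billable_rus, billable_gb


print(billable_usage(1400, 30))  # (400, 5.0): only the excess is charged
print(billable_usage(400, 10))   # (0, 0.0): fully covered by the Free Tier
```
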
Autoscale provisioning
Why not always use autoscale if it is so convenient? 1. It is more expensive than manual (about $0.012 per hour per 100 RU/s). 2. Each hour is billed at the highest RU/s reached in that hour. E.g. if you normally use only 4k RU/s but peak at 6k RU/s for one second, the whole hour is charged at 6k RU/s.
How to set the maximum autoscale
- The minimum is always 10% of the max, so a 1,000 RU/s max gives a 100 RU/s minimum.
- Setting the max very high (e.g. 99999) has cost implications. E.g. with a 10k RU/s max and an average usage of only 4k RU/s, a spike to 10k RU/s that lasts even 1 second bills that entire hour at 10k RU/s, because the large headroom lets spikes scale all the way up. Setting the max closer to real usage, about 4.5k or 5k RU/s, caps the bill. The tradeoff is that requests beyond the max are rate-limited (HTTP 429).
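
The effect of the autoscale max on the bill can be sketched with a small function. This is an illustration under the rules stated above (minimum is 10% of the max, each hour is billed at the highest RU/s reached), not an official billing formula. The name `billed_rus_for_hour` is hypothetical.

```python
def billed_rus_for_hour(max_rus: int, peak_demand_rus: int) -> int:
    """Return the RU/s that one hour is billed at under autoscale.

    The bill never drops below 10% of the max and never exceeds the max;
    demand above the max is rate-limited rather than billed.
    """
    floor = max_rus // 10
    return min(max(peak_demand_rus, floor), max_rus)


print(billed_rus_for_hour(10_000, 4_000))   # 4000: a quiet hour
print(billed_rus_for_hour(10_000, 10_000))  # 10000: one spike sets the whole hour
print(billed_rus_for_hour(5_000, 10_000))   # 5000: a lower max caps the bill
print(billed_rus_for_hour(10_000, 200))     # 1000: the 10% minimum still applies
```

The third call shows the design choice behind a lower max: the same spike costs half as much for that hour, at the price of throttled requests during the spike.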
Limit
This is as of 2026
| Resource | Your Value | Current 2026 Limit | Note |
|---|---|---|---|
| Account (per Sub) | 50 | 250 | The default is now 250, but it can be increased via support ticket to 1,000. |
| Databases + Containers | 500 / 25 | 500 Total | This is a combined limit per account (e.g., 10 DBs and 490 Containers). |
| Containers (Shared DB) | 25 | 25 | Only applies to databases using Shared Throughput. Dedicated containers are unlimited (up to the 500 account limit). |
| Max RU per Partition | 10,000 | 10,000 RU/s | This applies to both Logical and Physical partitions. |
| Logical Partition Size | 20GB | 20GB | This is a hard limit. Exceeding this triggers a "Partition Key reached maximum size" error. |
| Physical Partition Size | 50GB | 50GB | When a physical partition hits 50GB, Cosmos DB automatically performs a "Partition Split." |
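
The partition limits in the table can be turned into a rough sizing estimate. The sketch below is a back-of-the-envelope calculation only: the service decides the actual number of physical partitions, and the function name `estimate_physical_partitions` is hypothetical.

```python
import math

MAX_GB_PER_PHYSICAL_PARTITION = 50       # from the table above
MAX_RUS_PER_PHYSICAL_PARTITION = 10_000  # from the table above


def estimate_physical_partitions(storage_gb: float, provisioned_rus: int) -> int:
    """Estimate the minimum physical partitions needed for storage and throughput."""
    by_storage = math.ceil(storage_gb / MAX_GB_PER_PHYSICAL_PARTITION)
    by_throughput = math.ceil(provisioned_rus / MAX_RUS_PER_PHYSICAL_PARTITION)
    return max(1, by_storage, by_throughput)


print(estimate_physical_partitions(120, 25_000))  # 3
print(estimate_physical_partitions(30, 4_000))    # 1
```
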