Bulk Execution

Similar to Transactional Batch, except that:
- A partition key is mandatory for every operation, but operations can be grouped or separated by key, meaning a single request can contain different partition keys. (MS Learn describes it as optional, but in practice none of the SDKs treat it as optional.)
- There is no atomicity. If one operation fails, the others continue.
- The 2MB limit applies per request, not to the entire batch.
- AllowBulkExecution must be enabled in CosmosClientOptions for Java and C#.
- In C#, it is a combination of ordinary CreateItemAsync/ReplaceItemAsync/DeleteItemAsync calls whose tasks are then awaited together with Task.WhenAll(tasks).

// Pass these options when constructing the CosmosClient to turn on bulk mode.
CosmosClientOptions options = new()
{
    AllowBulkExecution = true
};

List<Product> productsToInsert = GetOurProductsFromSomeWhere();

List<Task> concurrentTasks = new List<Task>();

foreach(Product product in productsToInsert)
{
    // Look carefully: this is just a normal create call.
    concurrentTasks.Add(
        container.CreateItemAsync<Product>(
            product, 
            new PartitionKey(product.partitionKeyValue))
    );
}

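// Await the combined task so that completion and failures are actually observed.
await Task.WhenAll(concurrentTasks);

In the JavaScript SDK, the same idea is expressed with executeBulkOperations:
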
    const saddle = { id: "0120", name: "Worn Saddler", categoryId: "9603ca6c-9e28-4a02-9194-51cdb7fea816" };
    const handlebar = { id: "012A", name: "Rusty Handlebarr", categoryId: "9603ca6c-9e28-4a02-9194-51cdb7fea816" };
    const helmet = { id: "012C", name: "New Helmet", categoryId: "2202234-9e28-4a02-9194-51cdb7fea816" };

    const partitionKey = saddle.categoryId;

    // Create items
    const batchOperations = [
        { operationType: "Create", resourceBody: saddle, partitionKey: saddle.categoryId },
        { operationType: "Create", resourceBody: helmet, partitionKey: helmet.categoryId },
        { operationType: "Create", resourceBody: handlebar, partitionKey: handlebar.categoryId }
    ];

    const bulkResponse = await container.items.executeBulkOperations(batchOperations.map(op => ({
        ...op,
        // partitionKey
        // IMPORTANT: the line above is optional because each operation already sets its own partitionKey.
        // If it is uncommented, ALL operations use this single partition key instead.
        // Either way, every operation must end up with a partition key.
    })));
    console.log("Bulk create response:", bulkResponse);
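
Since every operation has to carry a partition key, one option is to normalise the array before handing it to the SDK. The following is a minimal sketch in plain JavaScript with no SDK dependency; `fillPartitionKeys` and its `partitionKeyField` parameter are hypothetical names, and it assumes the partition key is a top-level property of the item (categoryId in the example above).

```javascript
// Hypothetical helper (not part of any SDK): copy the partition key value
// out of resourceBody when an operation does not set partitionKey itself.
function fillPartitionKeys(operations, partitionKeyField) {
  return operations.map((op) => {
    if (op.partitionKey !== undefined) {
      return op; // an explicit value always wins
    }
    const value = op.resourceBody ? op.resourceBody[partitionKeyField] : undefined;
    if (value === undefined) {
      // e.g. a Delete without partitionKey: there is no body to infer it from
      throw new Error(`Cannot infer partition key for ${op.operationType} operation`);
    }
    return { ...op, partitionKey: value };
  });
}

const normalized = fillPartitionKeys(
  [
    { operationType: "Create", resourceBody: { id: "0120", categoryId: "A" } },
    { operationType: "Upsert", resourceBody: { id: "012C", categoryId: "B" }, partitionKey: "B" },
  ],
  "categoryId"
);
console.log(normalized.map((op) => op.partitionKey)); // [ 'A', 'B' ]
```

Failing fast on an operation whose key cannot be inferred is a design choice here; it surfaces the problem before any request is sent.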

Drawbacks

| Drawback | Description |
| --- | --- |
| Latency | The SDK batches operations internally and waits to dispatch them until the batch is full or a timeout (typically 100ms) is reached. |
| Small batches only | Each batch is limited to 2MB or 100 operations, with a maximum execution time of 5 seconds. |
| Higher throughput consumption | Consumes more RU/s than running the statements one by one. |
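
To make the latency drawback concrete, here is an illustrative sketch of a "dispatch when full or when the timeout expires" queue. It is not the SDK's implementation; `MicroBatcher` and its options are hypothetical names, and the current time is passed in explicitly so the behaviour is deterministic.

```javascript
// Illustrative micro-batcher, NOT the SDK's implementation.
// A batch is dispatched when it is full, or when the oldest queued
// operation has waited at least maxWaitMs.
class MicroBatcher {
  constructor({ maxOperations, maxWaitMs, dispatch }) {
    this.maxOperations = maxOperations;
    this.maxWaitMs = maxWaitMs;
    this.dispatch = dispatch; // callback that receives an array of operations
    this.pending = [];
    this.oldestQueuedAt = null;
  }

  // nowMs is supplied by the caller instead of reading a real clock.
  add(operation, nowMs) {
    if (this.pending.length === 0) {
      this.oldestQueuedAt = nowMs;
    }
    this.pending.push(operation);
    if (this.pending.length >= this.maxOperations) {
      this.flush();
    }
  }

  // In real code a timer would call this periodically to enforce the timeout.
  tick(nowMs) {
    if (this.pending.length > 0 && nowMs - this.oldestQueuedAt >= this.maxWaitMs) {
      this.flush();
    }
  }

  flush() {
    const batch = this.pending;
    this.pending = [];
    this.oldestQueuedAt = null;
    this.dispatch(batch);
  }
}

const dispatched = [];
const batcher = new MicroBatcher({
  maxOperations: 3,
  maxWaitMs: 100,
  dispatch: (batch) => dispatched.push(batch.length),
});

batcher.add("op1", 0);
batcher.add("op2", 10);
batcher.tick(50);        // nothing sent: batch not full, timeout not reached
batcher.tick(100);       // timeout reached: the two queued operations go out
batcher.add("op3", 110);
batcher.add("op4", 111);
batcher.add("op5", 112); // batch full: sent immediately
console.log(dispatched); // [ 2, 3 ]
```

The first two operations wait the full 100ms before leaving, which is the extra latency a small workload pays in bulk mode.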

Best practices

https://learn.microsoft.com/en-sg/training/modules/process-bulk-data-azure-cosmos-db-sql-api/4-implement-bulk-best-practices

  1. It is not always required to pass the partition key explicitly. This statement is misleading, though: it means that if your class already contains the partition key, you do not need to supply it separately because the code can read it from the object. That is slower, since it has to iterate through the object. I also could not find any SDK that supports this.
  2. Use worker tasks (for the API) to configure parallelism based on partition key. A worker is a thread.
  3. Use streams when possible, that is, avoid deserializing the entire batch into memory.
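
As a rough illustration of the second point, operations can be grouped by partition key so that each group can be handed to its own worker. This is a sketch in plain JavaScript; `groupByPartitionKey` is a hypothetical helper, not an SDK function, and how the groups are then scheduled is left open.

```javascript
// Hypothetical helper: bucket operations by their partitionKey value.
function groupByPartitionKey(operations) {
  const groups = new Map();
  for (const op of operations) {
    if (!groups.has(op.partitionKey)) {
      groups.set(op.partitionKey, []);
    }
    groups.get(op.partitionKey).push(op);
  }
  return groups;
}

const pendingOperations = [
  { operationType: "Create", resourceBody: { id: "item1" }, partitionKey: "Electronics" },
  { operationType: "Create", resourceBody: { id: "item2" }, partitionKey: "Clothing" },
  { operationType: "Upsert", resourceBody: { id: "item2" }, partitionKey: "Clothing" },
  { operationType: "Delete", id: "item3", partitionKey: "Books" },
];

const groups = groupByPartitionKey(pendingOperations);
for (const [key, group] of groups) {
  // Each group could be processed by a separate worker.
  console.log(key, group.length);
}
```

A Map keeps the groups in first-seen order, so the loop visits Electronics, Clothing, then Books.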

Results

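A failed operation does not abort the rest. The response contains one entry per operation, holding the original operationInput plus either a response (success) or an error (failure).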

const operations = [
    {
      operationType: 'Create',
      resourceBody: { id: 'item1', name: 'Product A', categoryId: 'Electronics' },
      partitionKey: 'Electronics',
    },
    {
      operationType: 'Create',
      resourceBody: { id: 'item2', name: 'Product B', categoryId: 'Clothing' },
      partitionKey: 'Clothing',
    },

    {
      operationType: 'Create',
      resourceBody: { id: 'item3', name: 'Product C', categoryId: 'Books' },
      partitionKey: 'Books',
    },
    {
      operationType: 'Upsert',
      resourceBody: { id: 'item1', name: 'Product B', price: 100 },
      partitionKey: 'Clothing',
    },
    {
      operationType: 'Upsert',
      resourceBody: { id: 'item2', name: 'Product B', price: 100, categoryId: 'Clothing' },
      partitionKey: 'Clothing',
    },
    {
      operationType: 'Upsert',
      resourceBody: { id: 'item2', name: 'Product B', price: 100, categoryId: 'Clothing' },
    },
    {
      operationType: 'Delete',
      id: 'item3',
      partitionKey: 'Books',
    },
  ];

Output:

{ Bulk operations completed: [
  {
    operationInput: {
      operationType: 'Create',
      resourceBody: [Object],
      partitionKey: 'Electronics'
    },
    error: {
      message: undefined,
      code: 409,
      substatus: undefined,
      body: undefined,
      headers: [Object],
      activityId: 'b4c8fa6b-4113-458f-9274-c9b16b083905',
      retryAfterInMs: undefined,
      retryAfterInMilliseconds: undefined,
      diagnostics: [CosmosDiagnostics],
      requestCharge: 1.5714285714285714
    }
  },
  {
    operationInput: {
      operationType: 'Create',
      resourceBody: [Object],
      partitionKey: 'Clothing'
    },
    error: {
      message: undefined,
      code: 409,
      substatus: undefined,
      body: undefined,
      headers: [Object],
      activityId: 'b4c8fa6b-4113-458f-9274-c9b16b083905',
      retryAfterInMs: undefined,
      retryAfterInMilliseconds: undefined,
      diagnostics: [CosmosDiagnostics],
      requestCharge: 1.5714285714285714
    }
  },
  {
    operationInput: {
      operationType: 'Create',
      resourceBody: [Object],
      partitionKey: 'Books'
    },
    response: {
      statusCode: 201,
      eTag: '"00000bb2-0000-5900-0000-68d879a30000"',
      activityId: 'b4c8fa6b-4113-458f-9274-c9b16b083905',
      sessionToken: '0:-1#8',
      requestCharge: 6.285714285714286,
      resourceBody: [Object],
      diagnostics: [CosmosDiagnostics],
      headers: [Object]
    }
  },
  {
    operationInput: {
      operationType: 'Upsert',
      resourceBody: [Object],
      partitionKey: 'Clothing'
    },
    error: {
      message: undefined,
      code: 400,
      substatus: undefined,
      body: undefined,
      headers: [Object],
      activityId: 'b4c8fa6b-4113-458f-9274-c9b16b083905',
      retryAfterInMs: undefined,
      retryAfterInMilliseconds: undefined,
      diagnostics: [CosmosDiagnostics],
      requestCharge: 1.2380952380952381
    }
  },
  {
    operationInput: {
      operationType: 'Upsert',
      resourceBody: [Object],
      partitionKey: 'Clothing'
    },
    response: {
      statusCode: 200,
      eTag: '"00000db2-0000-5900-0000-68d879a30000"',
      activityId: 'b4c8fa6b-4113-458f-9274-c9b16b083905',
      sessionToken: '0:-1#8',
      requestCharge: 10.285714285714286,
      resourceBody: [Object],
      diagnostics: [CosmosDiagnostics],
      headers: [Object]
    }
  },
  {
    operationInput: { operationType: 'Upsert', resourceBody: [Object] },
    error: {
      message: 'PartitionKey is required for Upsert operations.',
      code: 500,
      substatus: undefined,
      body: undefined,
      headers: undefined,
      activityId: undefined,
      retryAfterInMs: undefined,
      retryAfterInMilliseconds: undefined,
      diagnostics: undefined,
      requestCharge: undefined
    }
  },
  {
    operationInput: { operationType: 'Delete', id: 'item3', partitionKey: 'Books' },
    response: {
      statusCode: 204,
      eTag: undefined,
      activityId: 'b4c8fa6b-4113-458f-9274-c9b16b083905',
      sessionToken: '0:-1#8',
      requestCharge: 6.095238095238095,
      resourceBody: undefined,
      diagnostics: [CosmosDiagnostics],
      headers: [Object]
    }
  }
]
}
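
Because nothing is thrown for an individual failure, the caller has to inspect each entry. The sketch below assumes only the result shape visible in the output above (`operationInput` plus either `response` or `error`); `summarizeBulkResults` is a hypothetical helper and the sample data is a trimmed copy of that output.

```javascript
// Hypothetical helper: count successes and failures, and tally failures by code.
function summarizeBulkResults(results) {
  const summary = { succeeded: 0, failed: 0, errorsByCode: {} };
  for (const result of results) {
    if (result.error) {
      summary.failed += 1;
      const code = result.error.code;
      summary.errorsByCode[code] = (summary.errorsByCode[code] || 0) + 1;
    } else {
      summary.succeeded += 1;
    }
  }
  return summary;
}

// Trimmed copy of the output shown above.
const sampleResults = [
  { operationInput: { operationType: "Create" }, error: { code: 409 } },
  { operationInput: { operationType: "Create" }, error: { code: 409 } },
  { operationInput: { operationType: "Create" }, response: { statusCode: 201 } },
  { operationInput: { operationType: "Upsert" }, error: { code: 400 } },
  { operationInput: { operationType: "Upsert" }, response: { statusCode: 200 } },
  { operationInput: { operationType: "Upsert" }, error: { code: 500 } },
  { operationInput: { operationType: "Delete" }, response: { statusCode: 204 } },
];

const summary = summarizeBulkResults(sampleResults);
console.log(summary.succeeded, summary.failed); // 3 4
console.log(summary.errorsByCode[409]);         // 2
```

Failed entries can then be retried or logged individually, since the successful ones have already been applied.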

Transactional Batch vs Bulk Execution

In the Cosmos DB .NET SDK, there is a feature called Bulk Execution (AllowBulkExecution = true).

If you throw 10,000 operations at the SDK in Bulk mode, it automatically groups them under the hood and splits them into compliant physical requests, handling the 2MB / 100 operation limits for you. Stored procedures cannot do this automatically, so Bulk Execution genuinely saves you from worrying about the 2MB limitation when uploading lots of data. Note that Bulk Execution is not a single ACID transaction, unlike Transactional Batch.

Summary:

- Transactional Batch: replaces stored procedures for ACID transactions (multiple changes succeed or fail together on the same partition key) without needing to write server-side JavaScript. It still has a 100 item / 2MB payload limit.
- Bulk Execution: groups multiple independent operations automatically to maximize throughput. It handles the chunking for you, so you do not need to worry about the 2MB physical request limit per batch.
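
The chunking that Bulk Execution does for you can be approximated as follows. This is only a sketch of the idea, not the SDK's actual algorithm (which also takes partitions into account); `chunkOperations` is a hypothetical helper, and the serialized JSON size is used as a stand-in for the real payload size.

```javascript
const MAX_OPERATIONS = 100;        // per-request operation limit from the notes above
const MAX_BYTES = 2 * 1024 * 1024; // 2MB per-request payload limit

// Hypothetical helper: split operations into chunks that respect both limits.
function chunkOperations(operations, maxOperations = MAX_OPERATIONS, maxBytes = MAX_BYTES) {
  const chunks = [];
  let current = [];
  let currentBytes = 0;

  for (const op of operations) {
    // Approximation: size of the JSON-serialized operation in UTF-8 bytes.
    const size = Buffer.byteLength(JSON.stringify(op), "utf8");
    const wouldOverflow =
      current.length >= maxOperations || currentBytes + size > maxBytes;

    // Start a new chunk when adding this operation would break a limit.
    // An operation that is too large on its own still gets its own chunk.
    if (current.length > 0 && wouldOverflow) {
      chunks.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(op);
    currentBytes += size;
  }

  if (current.length > 0) {
    chunks.push(current);
  }
  return chunks;
}

const manyOperations = Array.from({ length: 250 }, (_, i) => ({
  operationType: "Create",
  resourceBody: { id: `item${i}`, categoryId: "Books" },
  partitionKey: "Books",
}));

console.log(chunkOperations(manyOperations).map((chunk) => chunk.length)); // [ 100, 100, 50 ]
```

With Transactional Batch this splitting is the caller's job, and each chunk is its own transaction; in Bulk mode the SDK does the equivalent internally.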