Skip to content

Stored Procedures

Transactional Context with Stored Procedure

https://learn.microsoft.com/en-us/training/modules/build-multi-item-transactions-azure-cosmos-db-sql-api/2-understand-transactions

  1. Can only process in ONE/1 logical partition. So partition key is required. If insert or update has different partition key - possible if the field is a partition key with different value it will fail.
  2. Process can be batch(true/false) or bulk(throw error or all true) like. To ROLLBACK we send throw error. See context section below.
  3. It is bounded by RU, execution of 5 seconds time and 2MB limit.
  4. In exam, search for Stored Procedure, not Store.
  5. The path to store procedure is $"dbs/{databaseName}/colls/{containerName}/sprocs/{sprocName}"

Javascript Only

Stored Proc is written only in javascript. __ (double underscore) shortcut as an equivalent to getContext().getCollection().

string sproc = @"function greet() {
    var context = getContext();
    var response = context.getResponse();
    response.setBody('Hello, Learn!');
}";

StoredProcedureProperties properties = new()
{
    Id = "greet",
    Body = sproc
};

await container.Scripts.CreateStoredProcedureAsync(properties);

Context

To set response

function greet() {
    var context = getContext();
    var response = context.getResponse();
    response.setBody("Hello, Learn!");
}

To get collection

function createProduct(item) {
    var context = getContext();
    var container = context.getCollection(); 
}

To handle responses...e.g. see pointer retry

Image

function createProduct(item) {
    var context = getContext();
    var container = context.getCollection(); 
    var accepted = container.createDocument(
        container.getSelfLink(),
        item,
        (error, newItem) => {
            if (error) throw error;                                    //Creates rollback
            context.getResponse().setBody(newItem) //Updates record
        }
    );
/* This is optional, the reason of having this return is to stop processing any codes after this line and exit */
     // Even if this is the last line, keep this. 
    // It protects you if the system is too busy to queue the write.
    if (!accepted) return; 
}
  • Container.createDocument(...): - This is the line that actually creates the new record (document) in the Cosmos DB container. The creation happens asynchronously within the stored procedure's execution.
  • The callback function (error, newItem) => { ... } runs after the document has been successfully created.
  • newItem: This is the object that was just successfully saved into the database, including the auto-generated properties like the document's _id and metadata.
  • context.getResponse().setBody(newItem): This takes that newly created document (newItem) and places it into the response object. Your Kotlin application, when it receives the response from the stored procedure call, will see this newItem as the result. This is a common pattern for returning the complete, saved document to the client.
  • accepted = true: The operation (the document creation) was successfully queued and the callback function is expected to run when the document is created.
  • accepted = false: The operation could not be queued because the stored procedure is running out of time, has consumed too many Request Units (RUs), or is approaching the server's execution limit.
  • throw exception is Atomicity. The entire transaction is rolled back. Any changes made before the failure (including the successful insertion of the first item) are undone. The stored procedure terminates immediately.
  • With the if statement: If accepted is false, the procedure gracefully exits. It returns the number of successfully processed items to the client, which allows the client to resume the batch from the next item.
  • Without the if statement: The procedure runs until the server terminates it by throwing an error. Because an error occurred, the entire transaction is rolled back, and the client receives a failure status. The client has no idea how many items (if any) were processed, and it often means the client must retry the entire batch, leading to wasted RUs and potential performance issues.

Partition Key based

Stored procedures are scoped to a single logical partition. You cannot execute a stored procedure that performs operations across logical partition key values.

When you execute store procedure, you must set "PARTITION KEY"

CosmosStoredProcedureResponse response = container.getScripts()
    .getStoredProcedure("createProduct")
    .execute(storedProcedureArgs, new CosmosStoredProcedureRequestOptions(new PartitionKey("YourPartitionKeyValue")));

Cost RU deduction in advance

Azure Cosmos DB has a different charging policy for stored procedures. Because stored procedures can execute code and consume any number of request units (RUs), each execution requires an upfront charge. This ensures that stored procedure scripts don't affect backend services. The amount charged upfront equals the average charge consumed by the script in previous invocations. The average RUs per operation is reserved before execution. If the invocations have much variance in RUs, your budget utilization might be affected. As an alternative, you should use batch or bulk requests instead of stored procedures to avoid variance around RU charges.

5 seconds rule

All Azure Cosmos DB operations must finish within the specified timeout duration. Stored procedures have a timeout limit of 5 seconds. This constraint applies to JavaScript functions – stored procedures, triggers, and user-defined functions. If an operation isn't completed within that time limit, the transaction is rolled back.

If want to overcome, use a continuation logic method.

Notes

Atomic transactions: Azure Cosmos DB database operations that are performed within a single stored procedure or a trigger are atomic. This atomic functionality lets an application combine related operations into a single batch, so that either all of the operations succeed or none of them succeed

Partition based: For partitioned containers, when executing a stored procedure, a partition key value must be provided in the request options. Stored procedures are always scoped to a partition key. Items that have a different partition key value aren't visible to the stored procedure. This also applies to triggers. Way to overcome this, use change feed or run 2 diff procedures.

Server-side JavaScript features, including stored procedures, triggers, and UDFs, don't support importing modules.

Logging: console.log(responseHeaders[Constants.HttpHeaders.ScriptLogResults]);

Vector

Vector Policy

Once the Vector Search feature is enabled on your Azure Cosmos DB for NoSQL account, you must define a vector embedding policy for the containers where you want to store vectors. This policy informs the Azure Cosmos DB query engine how to handle vector properties in the VectorDistance system function. The following information is included in the container vector policy:

path: The path of the property containing the vector embeddings. datatype: The type of the elements in the vector. The default is Float32. dimensions: This property is the number of dimensions in or length of each vector and will be driven by the model used to create embeddings. distanceFunction: The technique used to compute distance or similarity between vectors. The available options are Euclidean (default), cosine, and dot product.

"vectorEmbeddingPolicy": {
    "vectorEmbeddings": [
        {
            "path": "/vectorField",
            "dataType": "float32",
            "distanceFunction": "cosine",
            "dimensions": 1536
        }
    ]
}

Vector indexing

Not compulsory but best created.

Type Description Max dimensions
flat Stores vectors on the same index as other indexed properties. 505
quantizedFlat Quantizes (compresses) vectors before storing on the index. This policy can improve latency and throughput at the cost of a small amount of accuracy. 4096
diskANN Creates an index based on DiskANN for fast and efficient approximate search. 4096
Feature Flat QuantizedFlat DiskANN
Search Type Exact (Brute-force) Exact (on compressed data) Approximate (Graph-based)
Accuracy (Recall) 100% (Perfect) ~99% (High) ~95% - 99% (Tunable)
Speed Slow (Linear) Fast Fastest (Logarithmic)
RAM Usage Very High Low Lowest (Offloads to SSD)
Best For... < 10k vectors 10k – 50k vectors 50k – Billions of vectors
"indexingPolicy": {
    "vectorIndexes": [
        {
            "path": "/vectorField",
            "type": "diskANN"
        }
    ]
}

Perform vector search using the VectorDistance function

Azure Cosmos DB for NoSQL supports the creation of vector search indexes on top of stored embeddings.

The VectorDistance function in Azure Cosmos DB for NoSQL measures the similarity between two vectors by calculating their distance using metrics like cosine similarity, dot product, or Euclidean distance. This function is vital for applications that require quick and accurate similarity searches, such as those involving natural language processing or recommendation systems. By utilizing VectorDistance, you can efficiently handle high-dimensional vector queries, significantly improving the relevance and performance of your AI-driven applications.

Notes

  1. MUST always create new container after feature is enabled.
  2. The vector search feature is currently not supported on the existing containers, so you need to create a new container and specify the container-level vector embedding policy and the vector indexing policy at the time of container creation.