Integration

Using Azure Cosmos DB as a source or sink with other tools. The exam tests your knowledge of these integrations from the linked docs (unless you are really familiar with the tools, the chances of remembering the exact keys are low, even if you have integrated them before). The questions I encountered were along these lines: given Data Factory sink dropdowns, fill in the type and the write behavior.

Data Factory

https://learn.microsoft.com/en-us/azure/data-factory/connector-azure-cosmos-db?context=%2Fazure%2Fcosmos-db%2Fcontext%2Fcontext&tabs=data-factory

What's required is to set up a linked service (from Data Factory) pointing at the Cosmos DB account.

"sink": {
    "type": "CosmosDbSqlApiSink",
    "writeBehavior": "upsert"
}
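For the read direction, the connector uses the matching source type. A minimal sketch, per the connector doc linked above; the query shown is illustrative:

```json
"source": {
    "type": "CosmosDbSqlApiSource",
    "query": "SELECT * FROM c"
}
```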

Kafka

https://learn.microsoft.com/en-us/azure/cosmos-db/kafka-connector-v2

Spark

The Spark connector can both read and write: https://learn.microsoft.com/en-us/azure/cosmos-db/tutorial-spark-connector?pivots=programming-language-python

Just take note of OLTP vs. OLAP:
- OLTP - transactional: reads and writes go directly against the Cosmos DB container.
- OLAP - read-only: queries the analytical store, so Spark analytics run without touching the transactional store.
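As a sketch of the OLTP path, the Spark connector is configured through options on the read/write call. The endpoint, key, database, and container values below are placeholders, not real credentials:

```python
# Connection options for the Cosmos DB Spark OLTP connector.
# All values below are placeholders for a real account.
oltp_config = {
    "spark.cosmos.accountEndpoint": "https://<account>.documents.azure.com:443/",
    "spark.cosmos.accountKey": "<account-key>",
    "spark.cosmos.database": "cosmicworks",
    "spark.cosmos.container": "products",
}

# With a live account, a DataFrame `df` would be written transactionally:
# (df.write.format("cosmos.oltp")
#     .options(**oltp_config)
#     .mode("APPEND")
#     .save())
```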

There are two options for querying Azure Cosmos DB for NoSQL data from Spark.

# connects via a Synapse linked service; without one, configure
# the account endpoint and key options directly instead
productsDataFrame = spark.read.format("cosmos.olap") \
    .option("spark.synapse.linkedService", "cosmicworks_serv") \
    .option("spark.cosmos.container", "products") \
    .load()

Or

create table products_qry using cosmos.olap options (
    spark.synapse.linkedService 'cosmicworks_serv',
    spark.cosmos.container 'products'
)
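Once registered, the table can be queried with ordinary Spark SQL. The column names here are assumptions about the products container, for illustration only:

```sql
select id, name from products_qry where categoryName = 'gear'
```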