Cosmos DB continuous replication using CDC

Use Cosmos DB Reader or Mongo Cosmos DB Reader in Incremental mode to replicate Cosmos DB data to a target.

Continuous replication using Cosmos DB Reader

Cosmos DB Reader reads documents from one or more Cosmos DB containers using Cosmos DB's native Core (SQL) API. Its output stream type is JSONNodeEvent, so it requires targets that can read JSONNodeEvents. Alternatively, you may convert the output to a user-defined type (see Converting JSONNodeEvent output to a user-defined type for an example).

This reader sends both inserts and updates as inserts. Consequently, to replicate Cosmos DB documents, the writer must support upsert mode. In upsert mode, a new document (one whose id field does not match that of any existing document) is handled as an insert, and an update to an existing document (matched on its id field) is handled as an update. For replication, this limits the choice of writers to Cosmos DB Writer and Mongo Cosmos DB Writer. Append-only targets such as files, blobs, and Kafka are also supported, as long as they can handle a JSONNodeEvent input stream.
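
The upsert behavior described above can be sketched in a few lines of Python. This is only an illustration of the semantics, not Striim or Cosmos DB code: the function name apply_upsert, the in-memory dictionary standing in for the target container, and the sample documents are all hypothetical.

```python
from typing import Any, Dict

Document = Dict[str, Any]


def apply_upsert(target: Dict[str, Document], event: Document, key: str = "id") -> str:
    """Apply one insert-style event to a target keyed by document id.

    Returns "insert" when the id is new to the target and "update" when
    the event replaces a document that is already there.
    (Hypothetical helper: it only models the upsert semantics.)
    """
    doc_id = event[key]
    action = "update" if doc_id in target else "insert"
    target[doc_id] = event  # the incoming document replaces any existing one
    return action


# Every event arrives as an "insert", whether the source operation was an
# insert or an update. The third event reuses id "1".
events = [
    {"id": "1", "city": "Oslo"},
    {"id": "2", "city": "Lima"},
    {"id": "1", "city": "Bergen"},
]

target: Dict[str, Document] = {}
actions = [apply_upsert(target, event) for event in events]
```

After the three events are applied, the target holds two documents, and the document with id "1" reflects the most recent event. A writer without upsert support would instead have to reject or duplicate the third event.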

Be sure to provision sufficient Request Units (see Request Units in Azure Cosmos DB) to handle the volume of data you expect to read. If you do not, the reader may be unable to keep up with the source data.

Continuous replication using Mongo Cosmos DB Reader

Mongo Cosmos DB Reader reads documents from one or more Cosmos DB containers using the Mongo Java driver (bundled with Striim). Its output stream type is JSONNodeEvent, so it requires targets that can read JSONNodeEvents. Alternatively, you may convert the output to a user-defined type (see Converting JSONNodeEvent output to a user-defined type for an example).

Only request unit (RU) database accounts are supported; vCore clusters are not supported. Azure DocumentDB is not supported. Azure Cosmos DB API for MongoDB 3.2 is not supported.

This reader sends both inserts and updates as inserts. Consequently, to replicate Cosmos DB documents, the writer must support upsert mode. In upsert mode, a new document (one whose _id field does not match that of any existing document) is handled as an insert, and an update to an existing document (matched on its _id field) is handled as an update. For replication, this limits the choice of writers to Cosmos DB Writer and Mongo Cosmos DB Writer. Append-only targets such as files, blobs, and Kafka are also supported, as long as they can handle a JSONNodeEvent input stream.
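
To make the difference between the two kinds of target concrete, the following Python sketch replays the same insert-style events into an append-only log and into an upsert target keyed by _id. It is a simplified model under stated assumptions: the replay function, the list standing in for a file or Kafka topic, and the sample documents are hypothetical and do not represent Striim's writers.

```python
import json
from typing import Any, Dict, List, Tuple

Document = Dict[str, Any]


def replay(events: List[Document]) -> Tuple[List[str], Dict[str, Document]]:
    """Replay insert-style events into two hypothetical targets.

    append_log models an append-only target (file, blob, Kafka topic):
    every event is kept as its own record.
    upsert_state models an upsert target: one document per _id, holding
    the most recent version.
    """
    append_log: List[str] = []
    upsert_state: Dict[str, Document] = {}
    for event in events:
        append_log.append(json.dumps(event, sort_keys=True))  # one record per event
        upsert_state[event["_id"]] = event                     # latest version wins
    return append_log, upsert_state


# The second event for _id "a1" was an update at the source, but it
# arrives looking like any other insert.
events = [
    {"_id": "a1", "status": "new"},
    {"_id": "b2", "status": "new"},
    {"_id": "a1", "status": "shipped"},
]

log, state = replay(events)
```

In this model the append-only log ends up with three records, while the upsert target ends up with two documents. A consumer of the append-only output therefore has to resolve the latest version per _id itself if it needs current state.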

Be sure to provision sufficient Request Units (see Request Units in Azure Cosmos DB) to handle the volume of data you expect to read. If you do not, the reader may be unable to keep up with the source data.