Cosmos DB initial setup
If you will use Cosmos DB Reader, see Cosmos DB setup for Cosmos DB Reader.
If you will use Mongo Cosmos DB Reader, see Cosmos DB setup for Mongo Cosmos DB Reader.
The following discussions of networking and security apply to both initial load and continuous replication using either Cosmos DB Reader or Mongo Cosmos DB Reader.
Networking setup
The following applies to both initial load and continuous replication.
You need to establish proper network connectivity between your Striim environment and Cosmos DB. This involves configuring network access, firewall rules, and connection parameters to ensure reliable communication.
Ensure that the Striim server can connect to Cosmos DB. You need to configure the network access rules for your Cosmos DB account (for example, its IP firewall or virtual network settings) to allow access from your Striim instance.
Also consider network latency and bandwidth requirements, especially for high-volume CDC scenarios. For optimal performance, minimize the network latency between Striim and Cosmos DB.
Cosmos DB requires secure SSL/TLS encryption between Striim and Cosmos DB. This requires configuring Striim to use SSL certificates.
Security
Security configuration for Cosmos DB integration involves multiple layers, including authentication, authorization, network security, and data protection measures.
You must implement proper authentication mechanisms between Striim and Cosmos DB. This includes creating dedicated database users with minimal required privileges, following the principle of least privilege. Avoid using administrative accounts. Instead, create specific users for Striim operations with only the permissions needed for the containers (or collections) and operations required.
You should implement access control at multiple levels, including database-level permissions, schema-level access controls, and container-level (or collection-level) privileges. Regularly review and audit the permissions granted to Striim users, and implement proper password policies and rotation procedures for service accounts.
Cosmos DB setup for Cosmos DB Reader
When creating a Striim application with a Cosmos DB Reader source, you will need to provide the Primary Key or Secondary Key from the Keys read-only tab of your Cosmos DB account.
Request Units
Provision sufficient Request Units to handle the volume of data you expect to read. For more information, see Request Units in Azure Cosmos DB.
Capturing deletes
Azure Cosmos DB's change feed does not capture deletes. To work around this limitation:
Set time-to-live (TTL) to -1 on the container(s) to be read, as described in Configure time to live in Azure Cosmos DB. The -1 value means Cosmos DB will not automatically delete any documents.
Set Cosmos DB Reader's Cosmos DB Config property to:
{"Operations": {"SoftDelete": {"FieldName" : "IsDeleted","FieldValue" : "true"}}}
This will enable the IsDeleted field for soft-delete operations, as shown by the example in Cosmos DB Reader example output. To delete a document, perform an UPDATE operation that sets IsDeleted to true and sets a positive value for TTL. When the IsDeleted field value is true, the OperationName value in the metadata of the output event is DELETE even though the operation is actually an UPDATE.
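The soft-delete pattern above can be sketched in JavaScript. This is a minimal illustration, not Striim or Azure SDK code: buildSoftDeleteConfig and markSoftDeleted are hypothetical helper names, and the sketch assumes that the per-document TTL is carried in a ttl property and that your application writes the updated document back with whatever client it already uses.

```javascript
// Hypothetical helpers for illustration; neither is part of Striim or the Azure SDK.

// Builds the Cosmos DB Config value for a given soft-delete field name and value.
function buildSoftDeleteConfig(fieldName = "IsDeleted", fieldValue = "true") {
  return JSON.stringify({
    Operations: { SoftDelete: { FieldName: fieldName, FieldValue: fieldValue } },
  });
}

// Returns a copy of the document marked for soft delete.
// ttlSeconds must be a positive integer (assumption: TTL lives in the "ttl" property).
function markSoftDeleted(doc, ttlSeconds, fieldName = "IsDeleted", fieldValue = "true") {
  if (!Number.isInteger(ttlSeconds) || ttlSeconds <= 0) {
    throw new RangeError("ttlSeconds must be a positive integer");
  }
  // Spread into a new object so the caller's document is not mutated.
  return { ...doc, [fieldName]: fieldValue, ttl: ttlSeconds };
}

const original = { id: "1001", name: "Kim" };
const updated = markSoftDeleted(original, 5);
console.log(buildSoftDeleteConfig());
console.log(JSON.stringify(updated));
```

Keeping the field name and value as parameters means the same values can be used in both the reader configuration and the update, so the two cannot drift apart.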
Cosmos DB setup for Mongo Cosmos DB Reader
When creating a Striim application with a Mongo Cosmos DB Reader source, you will need to provide the Username and Primary Password or Secondary Password from the Connection String read-only tab of your Azure Cosmos DB API for MongoDB account.
SSL
By default, SSL is enabled in Cosmos DB API for MongoDB. Mongo Cosmos DB Reader uses SSL to connect to Cosmos DB. Other encryption methods are not supported.
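As a quick sanity check, the following sketch tests whether a MongoDB-style connection string requests SSL. It is only an illustration: requestsTls is a hypothetical helper, the account name and password are placeholders, and the check assumes the connection string uses an ssl=true or tls=true query option.

```javascript
// Hypothetical helper: reports whether a MongoDB-style connection string asks for SSL/TLS.
function requestsTls(connectionString) {
  // The built-in URL class parses the authority and query string of mongodb:// URLs.
  const params = new URL(connectionString).searchParams;
  return params.get("ssl") === "true" || params.get("tls") === "true";
}

// Placeholder account name and password; substitute the values from your own account.
const example =
  "mongodb://myaccount:secret@myaccount.mongo.cosmos.azure.com:10255/?ssl=true&replicaSet=globaldb";
console.log(requestsTls(example));
```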
Server Side Retry
Server Side Retry is enabled by default for Cosmos DB API for MongoDB 3.6 and later. Disabling it may result in rate-limiting errors. For more information, see Prevent rate-limiting errors for Azure Cosmos DB API for MongoDB operations.
Capturing deletes
Azure Cosmos DB API for MongoDB's change stream does not capture delete operations. To work around this limitation, do not delete documents directly. Instead, use the following process.
Using the MongoDB shell, create a _ts index on the collection with expireAfterSeconds set to -1. For example:
mydb.mycollection.createIndex({"_ts":1}, {expireAfterSeconds: -1})
The -1 value means Cosmos DB will not automatically delete any documents, but a TTL can still be set on individual documents in order to delete them. For more information, see Expire data with Azure Cosmos DB's API for MongoDB.
Set Mongo Cosmos DB Reader's Cosmos DB Config property to:
{"Operations": {"SoftDelete": {"FieldName" : "IsDeleted","FieldValue" : "true"}}}
This will enable the IsDeleted field for soft-delete operations, as shown by the example in Mongo Cosmos DB Reader example output. When the IsDeleted field value is true, the OperationName value in the metadata of the output event is DELETE even though the operation is actually an UPDATE.
Instead of deleting a document, set IsDeleted to true and ttl to the number of seconds after which Cosmos DB will delete the source document. For a TTL of five seconds, the syntax is:
<database name>.<collection name>.updateOne({_id:<document ID>}, {$set : {"IsDeleted":"true", "ttl": 5}})
The source document will be deleted in five seconds and the output will include "OperationName":"DELETE".
For example, to soft-delete the following document from the mydb.employee collection:
{
  "_id": 1001,
  "name": "Kim",
  "lastname": "Taylor",
  "email": "ktaylor@example.com"
}
You would use the command:
mydb.employee.updateOne({_id:1001}, {$set : {"IsDeleted":"true", "ttl": 5}})
Immediately after entering that command, the document would be:
{
  "_id": 1001,
  "name": "Kim",
  "lastname": "Taylor",
  "email": "ktaylor@example.com",
  "IsDeleted": "true",
  "ttl": 5
}
Five seconds after entering the command, the source document would be deleted. The output event would be similar to:
JsonNodeEvent {
  data:{
    "_id":"1001",
    "name": "Kim",
    "lastname": "Taylor",
    "email": "ktaylor@example.com",
    "IsDeleted":"true",
    "ttl":5
  }
  metadata:{
    "CollectionName":"employee",
    "OperationName":"DELETE",
    "DatabaseName":"mydb",
    "DocumentKey":{"id":"1001"},
    "NameSpace":"mydb.employee",
    "TimeStamp":1646819488,
    "Partitioned":false,
    "FullDocumentReceived":true,
    "PartitionKeys":{}
  }
  userdata:null
  removedfields:null
};
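A downstream consumer may want to tell these soft deletes apart from other operations. The following sketch shows one way to do that, assuming the event has already been converted to a plain JavaScript object with the data and metadata fields shown in the sample output. classifyEvent is a hypothetical name, not a Striim function.

```javascript
// Hypothetical downstream check; the event shape mirrors the sample output above.
function classifyEvent(event, softDeleteField = "IsDeleted", softDeleteValue = "true") {
  const op = event.metadata && event.metadata.OperationName;
  const flagged = Boolean(event.data) && event.data[softDeleteField] === softDeleteValue;
  // A DELETE that carries the soft-delete flag originated as an UPDATE on the source.
  if (op === "DELETE" && flagged) return "soft-delete";
  return op || "unknown";
}

const sample = {
  data: { _id: "1001", name: "Kim", IsDeleted: "true", ttl: 5 },
  metadata: { CollectionName: "employee", OperationName: "DELETE", DatabaseName: "mydb" },
};
console.log(classifyEvent(sample));
```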