MongoDB Cosmos DB Writer
Writes to Cosmos DB using the Azure Cosmos DB API for MongoDB version 3.6 or 4.0, allowing you to write to a Cosmos DB target as if it were a MongoDB target. For general information, see Azure Cosmos DB API for MongoDB and Connect a MongoDB application to Azure Cosmos DB.
Azure Cosmos DB API for MongoDB 3.2 is not supported.
Note
If the writer exceeds the number of Request Units per second provisioned for your Cosmos DB instance (see Request Units in Azure Cosmos DB), the application may halt. The Azure Cosmos DB Capacity Calculator can give you an estimate of the appropriate number of RUs to provision:
You may need more RUs during initial load than for continuing replication.
See Optimize your Azure Cosmos DB application using rate limiting and Prevent rate-limiting errors for Azure Cosmos DB API for MongoDB operations for more information.
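The retry behavior described for the Overload Retry Policy property (default retryInterval=1, maxRetries=10) can be pictured with a minimal sketch. This is not the adapter's implementation: `WriteRejected` and `write_with_overload_retry` are hypothetical names, the error codes are the ones listed as the default for Retriable Error Codes, and re-raising once the retries are exhausted is an assumption about what "halt" amounts to.

```python
import time

# Codes taken from the default Retriable Error Codes value in the properties table.
THROTTLING_ERROR_CODES = {16500, 50}


class WriteRejected(Exception):
    """Hypothetical stand-in for a driver error that carries a server error code."""

    def __init__(self, code):
        super().__init__(f"write rejected with code {code}")
        self.code = code


def write_with_overload_retry(write_batch, retry_interval=1.0, max_retries=10,
                              sleep=time.sleep):
    """Call write_batch(); on a throttling error, wait and try again.

    Returns the number of retries that were needed. Re-raises the error
    immediately for non-throttling codes, or once max_retries is exhausted.
    """
    retries = 0
    while True:
        try:
            write_batch()
            return retries
        except WriteRejected as err:
            if err.code not in THROTTLING_ERROR_CODES or retries >= max_retries:
                raise
            retries += 1
            sleep(retry_interval)
```

Passing `sleep` as a parameter keeps the sketch testable without real delays. A fixed interval mirrors the property's two settings; real clients often add backoff, which is outside what this page describes.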
Using the adapter
This adapter may be used in four ways:
With an input stream of a user-defined type, MongoDB Cosmos DB Writer writes events as documents to a single Cosmos DB collection.
Target document field names are taken from the input stream's event type.
The value of the key field of the input event is used as the document key (_id field value). If the input stream's type has no key, the target document's key is generated by concatenating the values of all fields, separated by the Key Separator string. Alternatively, you may specify a subset of fields to be concatenated using the syntax <database name>.<collection name> keycolumns(<field1 name>, <field2 name>, ...) in the Collections property.
With an input stream of type JSONNodeEvent that is the output stream of a source using JSONParser, MongoDB Cosmos DB Writer writes events as documents to a single Cosmos DB collection.
Target document field names are taken from the input events' JSON field names.
When the JSON event contains an _id field, its value is used as the document key. Otherwise, Cosmos DB will generate an ObjectId for the document key.
With an input stream of type JSONNodeEvent that is the output stream of a MongoDB Reader source, MongoDB Cosmos DB Writer writes each MongoDB collection to a separate Cosmos DB collection.
MongoDB collections may be replicated in a Cosmos DB instance by using wildcards in the Collections property. Alternatively, you may manually map source collections to target collections as discussed in the notes for the Collections property.
The source document's primary key and field names are used as the target document's key and field names.
With an input stream of type WAEvent that is the output stream of a SQL CDC reader or Database Reader source, MongoDB Cosmos DB Writer writes data from each source table to a separate collection. The target collections may be in different databases. To process updates and deletes, compression must be disabled in the source adapter: WAEvents for update and delete operations must contain all values, not just primary keys and, for updates, the modified values.
Each row in a source table is written to a document in the target collection mapped to the table. Target document field names are taken from the source event's metadata map and their values from its data array (see WAEvent contents for change data).
Source table data may be replicated to Cosmos DB collections of the same names by using wildcards in the Collections property. Note that data will be read only from tables that exist when the source starts. Additional tables added later will be ignored until the source is restarted. Alternatively, you may manually map source tables to Cosmos DB collections as discussed in the notes for the Collections property. When the source is a CDC reader, updates and deletes in source tables are replicated in the corresponding Cosmos DB target collections.
Each source row's primary key value (which may be a composite) is used as the key (_id field value) for the corresponding Cosmos DB document. If the table has no primary key, the target document's key is generated by concatenating the values of all fields in the row, separated by the Key Separator string. Alternatively, you may select a subset of fields to be concatenated using the KeyColumns option as discussed in the notes for the Collections property.
Cosmos DB limits the number of characters allowed in document IDs (see Per-item limits in Microsoft's documentation). When using wildcards or keycolumns, be sure that the generated document IDs will not exceed that limit.
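The key-generation rule above (concatenate all field values, or a keycolumns subset, with the Key Separator) can be sketched as follows. This is an illustration only: `build_document_key` and `check_key_length` are hypothetical helpers, converting each value with `str()` is an assumption about formatting, and the length limit is left as a parameter to be taken from Microsoft's Per-item limits rather than hard-coded here.

```python
def build_document_key(row, key_fields=None, separator=":"):
    """Concatenate field values into a document key.

    row: mapping of field name to value, in field order.
    key_fields: optional subset of field names (the keycolumns idea);
    when omitted, every field in the row is used.
    """
    names = list(key_fields) if key_fields else list(row)
    return separator.join(str(row[name]) for name in names)


def check_key_length(key, max_length):
    """Return True if the generated key fits within max_length characters."""
    return len(key) <= max_length


row = {"order_id": 1001, "region": "EU", "note": "a:b"}

all_fields_key = build_document_key(row)                       # "1001:EU:a:b"
subset_key = build_document_key(row, ["order_id", "region"])   # "1001:EU"
pipe_key = build_document_key(row, separator="|")              # "1001|EU|a:b"
```

The first key shows why the Key Separator may need changing: with the default colon, a value that itself contains a colon makes the field boundaries ambiguous, while a separator that never occurs in the data keeps them distinct.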
MongoDB Cosmos DB Writer properties
property | type | default value | notes |
---|---|---|---|
Batch Policy | String | EventCount:1000, Interval:30 | The batch policy includes eventCount and interval (see Setting output names and rollover / upload policies for syntax). Events are buffered locally on the Striim server and sent as a batch to the target every time either of the specified values is exceeded. When the app is stopped, any remaining data in the buffer is discarded. To disable batching, set to With the default setting, data will be written every 30 seconds or sooner if the buffer accumulates 1,000 events. |
Collections | String | | The fully-qualified name(s) of the Cosmos DB collection(s) to write to, for example, mydb.mycollection. Separate multiple collections by commas. You may use the % wildcard, for example, mydb.%. Note that data will be written only to collections that exist when the Striim application starts. Additional collections added later will be ignored until the application is restarted. When the input stream of the target is the output of a DatabaseReader, IncrementalBatchReader, or SQL CDC source, it can write to multiple collections. In this case, specify the names of both the source tables and target collections ( |
Connection Retry | String | retryInterval=60, maxRetries=3 | With the default setting, if a connection attempt is unsuccessful, the adapter will try again in 60 seconds ( |
Connection URL | String | | Specify |
Excluded Collections | String | | Any collections to be excluded from the set specified in the Collections property. Specify as for the Collections property. |
Ignorable Exception Code | String | | By default, if the target returns an error, the application will terminate. Specify DUPLICATE_KEY, KEY_NOT_FOUND, or NO_OP_UPDATE to ignore such errors and continue. To specify more than one, separate them with commas. Ignored exceptions will be written to the application's exception store (see CREATE EXCEPTIONSTORE). |
Key Separator | String | : | Inserted between values when generating document keys by concatenating column or field values. If the values might contain a colon, change this to something that will not occur in those values. |
Ordered Writes | Boolean | True | If you do not care that documents may be written out of order (typically the case during initial load), set to False to improve performance. |
Overload Retry Policy | String | retryInterval=1, maxRetries=10 | With the default setting, if Cosmos DB rejects a write because it exceeds the throughput limit, the adapter will try again in one second ( Negative values are not supported. See Prevent rate-limiting errors for Azure Cosmos DB API for MongoDB operations. |
Parallel Threads | Integer | ||
Password | com.webaction.security.Password | | The password for the specified Username. |
Retriable Error Codes | String | {"ThrottlingErrorCodes" : [16500,50]} | Specify any error codes for which you want to trigger a connection retry or overload retry rather than a halt or termination. The default value For information about these and other MongoDB error codes, see Common errors and solutions. |
Upsert Mode | Boolean | False | Set to True to process inserts and updates as upserts. This is required if the input stream of this writer is a Cosmos DB Reader JSONNodeEvent stream. |
Username | String | | A MongoDB user with the readWrite role on the target collection(s). |
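
The Batch Policy behavior (buffer events, then send a batch when either the event count or the interval is reached) can be illustrated with a small sketch. This is not how the adapter is implemented: `BatchBuffer` and its methods are hypothetical, starting the interval timer when the first event enters an empty buffer is an assumption, and "reached" is treated as greater-than-or-equal.

```python
import time


class BatchBuffer:
    """Buffer events and flush when a count or time threshold is reached."""

    def __init__(self, flush, event_count=1000, interval=30.0,
                 clock=time.monotonic):
        self._flush = flush            # callable that receives a list of events
        self._event_count = event_count
        self._interval = interval      # seconds
        self._clock = clock
        self._events = []
        self._first_at = None          # time the oldest buffered event arrived

    def add(self, event):
        if not self._events:
            self._first_at = self._clock()
        self._events.append(event)
        self._maybe_flush()

    def tick(self):
        """Call periodically so the interval can trigger a flush without new events."""
        self._maybe_flush()

    def _maybe_flush(self):
        if not self._events:
            return
        full = len(self._events) >= self._event_count
        expired = self._clock() - self._first_at >= self._interval
        if full or expired:
            batch, self._events = self._events, []
            self._flush(batch)
```

With the defaults, a steady stream flushes every 1,000 events, and a slow trickle flushes once the oldest buffered event is 30 seconds old, provided something calls `tick()`. Anything still in `_events` when the process stops is lost, which matches the note that remaining buffered data is discarded when the app is stopped.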