Striim 4.0.4 documentation

MongoDB Cosmos DB Writer

Writes to Cosmos DB using the Azure Cosmos DB API for MongoDB version 3.6 or 4.0, allowing you to write to a Cosmos DB target as if it were a MongoDB target. For general information, see Azure Cosmos DB API for MongoDB and Connect a MongoDB application to Azure Cosmos DB.

Azure Cosmos DB API for MongoDB 3.2 is not supported.

Note

If the writer exceeds the number of Request Units per second provisioned for your Cosmos DB instance (see Request Units in Azure Cosmos DB), the application may halt. The Azure Cosmos DB Capacity Calculator can give you an estimate of the appropriate number of RUs to provision.

You may need more RUs during initial load than for continuing replication.

See Optimize your Azure Cosmos DB application using rate limiting and Prevent rate-limiting errors for Azure Cosmos DB API for MongoDB operations for more information.

This adapter may be used in four ways:

  • With an input stream of a user-defined type, MongoDB Cosmos DB Writer writes events as documents to a single Cosmos DB collection.

    Target document field names are taken from the input stream's event type.

    The value of the key field of the input event is used as the document key (_id field value). If the input stream's type has no key, the target document's key is generated by concatenating the values of all fields, separated by the Key Separator string. Alternatively, you may specify a subset of fields to be concatenated using the syntax <database name>.<collection name> keycolumns(<field1 name>, <field2 name>, ...) in the Collections property.

  • With an input stream of type JSONNodeEvent that is the output stream of a source using JSONParser, MongoDB Cosmos DB Writer writes events as documents to a single Cosmos DB collection.

    Target document field names are taken from the input events' JSON field names.

    When the JSON event contains an _id field, its value is used as the document key. Otherwise, Cosmos DB will generate an ObjectId for the document key.

  • With an input stream of type JSONNodeEvent that is the output stream of a MongoDB Reader source, MongoDB Cosmos DB Writer writes each MongoDB collection to a separate Cosmos DB collection.

    MongoDB collections may be replicated in a Cosmos DB instance by using wildcards in the Collections property. Alternatively, you may manually map source collections to target collections as discussed in the notes for the Collections property.

    The source document's primary key and field names are used as the target document's key and field names.

  • With an input stream of type WAEvent that is the output stream of a SQL CDC reader or Database Reader source, MongoDB Cosmos DB Writer writes data from each source table to a separate collection. The target collections may be in different databases. To process updates and deletes, compression must be disabled in the source adapter (that is, WAEvents for update and delete operations must contain all values, not just primary keys and, for updates, the modified values).

    Each row in a source table is written to a document in the target collection mapped to the table. Target document field names are taken from the source event's metadata map and their values from its data array (see WAEvent contents for change data).

    Source table data may be replicated to Cosmos DB collections of the same names by using wildcards in the Collections property. Note that data will be read only from tables that exist when the source starts. Additional tables added later will be ignored until the source is restarted. Alternatively, you may manually map source tables to Cosmos DB collections as discussed in the notes for the Collections property. When the source is a CDC reader, updates and deletes in source tables are replicated in the corresponding Cosmos DB target collections.

    Each source row's primary key value (which may be a composite) is used as the key (_id field value) for the corresponding Cosmos DB document. If the table has no primary key, the target document's key is generated by concatenating the values of all fields in the row, separated by the Key Separator string. Alternatively, you may select a subset of fields to be concatenated using the KeyColumns option as discussed in the notes for the Collections property.

    Cosmos DB limits the number of characters allowed in document IDs (see Per-item limits in Microsoft's documentation). When using wildcards or keycolumns, be sure that the generated document IDs will not exceed that limit.
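The key-generation rule described above (concatenate field values with the Key Separator, optionally restricted to a keycolumns subset) can be sketched in Python as follows. This is an illustration only, not Striim's implementation: the function name build_document_key, its parameters, and the 1,023-byte default for max_id_bytes are assumptions. Check Per-item limits in Microsoft's documentation for the limit that applies to your account.

```python
def build_document_key(row, key_columns=None, separator=":", max_id_bytes=1023):
    """Concatenate field values into a document key (_id value).

    row          -- dict mapping field names to values, in field order
    key_columns  -- optional subset of field names to concatenate
    separator    -- string placed between values (the Key Separator)
    max_id_bytes -- assumed ID length limit; verify against the Cosmos DB docs
    """
    names = key_columns if key_columns else list(row)
    key = separator.join(str(row[name]) for name in names)
    if len(key.encode("utf-8")) > max_id_bytes:
        raise ValueError(f"generated key is longer than {max_id_bytes} bytes")
    return key


order = {"order_id": 1001, "region": "EU", "sku": "A-17"}
print(build_document_key(order))                          # 1001:EU:A-17
print(build_document_key(order, ["order_id", "region"]))  # 1001:EU
```

The sketch also shows why the separator matters: with the default colon, the rows {"a": "1:2", "b": "3"} and {"a": "1", "b": "2:3"} both produce the key 1:2:3, so a separator that never occurs in the data avoids such collisions.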

property

type

default value

notes

Batch Policy

String

EventCount:1000, Interval:30

The batch policy includes eventCount and interval (see Setting output names and rollover / upload policies for syntax). Events are buffered locally on the Striim server and sent as a batch to the target every time either of the specified values is exceeded. When the app is stopped, any remaining data in the buffer is discarded. To disable batching, set to EventCount:1,Interval:0.

With the default setting, data will be written every 30 seconds or sooner if the buffer accumulates 1,000 events.
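How the two thresholds interact can be sketched as a small buffer that flushes on whichever limit is reached first. This is a simplified illustration, not Striim's code: the class BatchBuffer, the send_batch callback, and the injectable clock are all hypothetical, and this sketch checks the interval only when a new event arrives, whereas a real writer would presumably also flush on a timer.

```python
import time


class BatchBuffer:
    """Buffer events and flush when either threshold is reached."""

    def __init__(self, send_batch, event_count=1000, interval=30,
                 clock=time.monotonic):
        self.send_batch = send_batch    # hypothetical callback that writes one batch
        self.event_count = event_count  # flush once this many events are buffered
        self.interval = interval        # flush once this many seconds have passed
        self.clock = clock
        self.events = []
        self.last_flush = clock()

    def add(self, event):
        self.events.append(event)
        full = len(self.events) >= self.event_count
        stale = self.clock() - self.last_flush >= self.interval
        if full or stale:
            self.flush()

    def flush(self):
        if self.events:
            self.send_batch(self.events)
            self.events = []
        self.last_flush = self.clock()


batches = []
buffer = BatchBuffer(batches.append, event_count=3, interval=30)
for n in range(7):
    buffer.add(n)
print(batches)        # [[0, 1, 2], [3, 4, 5]]
print(buffer.events)  # [6] is still buffered
```

In this sketch, event_count=1 with interval=0 flushes on every call to add, which mirrors the EventCount:1,Interval:0 setting that disables batching.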

Collections

String

The fully-qualified name(s) of the Cosmos DB collection(s) to write to, for example, mydb.mycollection. Separate multiple collections with commas.

You may use the % wildcard, for example, mydb.%. Note that data will be written only to collections that exist when the Striim application starts. Additional collections added later will be ignored until the application is restarted.

Connection Retry

String

retryInterval=60, maxRetries=3

With the default setting, if a connection attempt is unsuccessful, the adapter will try again in 60 seconds (retryInterval). If the second attempt is unsuccessful, in 60 seconds it will try a third time (maxRetries). If that is unsuccessful, the adapter will fail and log an exception. Negative values are not supported.

Connection URL

String

Specify <host>:<port>, for example, mymongcos.mongo.cosmos.azure.com:10255. Copy the host and port values from the Connection String page under Settings for your Azure Cosmos DB API for MongoDB account.

Excluded Collections

String

Any collections to be excluded from the set specified in the Collections property. Specify as for the Collections property.
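The combined effect of Collections and Excluded Collections can be sketched as a filter over the collections that exist at startup. The sketch is illustrative only: resolve_collections is a hypothetical helper, it assumes that % matches any sequence of characters (as in SQL LIKE), and it handles only plain names and wildcards, not the mapping or keycolumns syntax.

```python
import re


def _to_regex(pattern):
    # Treat % as "any sequence of characters"; everything else is literal.
    parts = [re.escape(part) for part in pattern.strip().split("%")]
    return re.compile(".*".join(parts))


def resolve_collections(existing, collections, excluded=""):
    """Return the existing collections selected by the two property values."""
    include = [_to_regex(p) for p in collections.split(",") if p.strip()]
    exclude = [_to_regex(p) for p in excluded.split(",") if p.strip()]
    return [
        name for name in existing
        if any(rx.fullmatch(name) for rx in include)
        and not any(rx.fullmatch(name) for rx in exclude)
    ]


existing = ["mydb.orders", "mydb.customers", "mydb.audit_log", "otherdb.orders"]
print(resolve_collections(existing, "mydb.%", "mydb.audit_log"))
# ['mydb.orders', 'mydb.customers']
```

Because the filter runs over the list of existing collections, a collection created after startup is never selected, which matches the restart requirement noted for the Collections property.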

Ignorable Exception Code

String

By default, if the target returns an error, the application will halt. Specify DUPLICATE_KEY, KEY_NOT_FOUND, or NO_OP_UPDATE to ignore such errors and continue. To specify more than one, separate them with commas.
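The behavior this property selects can be sketched as a lookup against the configured codes. The helper names parse_ignorable_codes and handle_target_error are hypothetical, and tolerating spaces around the commas is an assumption of this sketch rather than documented behavior.

```python
KNOWN_CODES = {"DUPLICATE_KEY", "KEY_NOT_FOUND", "NO_OP_UPDATE"}


def parse_ignorable_codes(setting):
    """Split a comma-separated property value into a set of codes."""
    codes = {code.strip() for code in setting.split(",") if code.strip()}
    unknown = codes - KNOWN_CODES
    if unknown:
        raise ValueError(f"unsupported exception code(s): {sorted(unknown)}")
    return codes


def handle_target_error(error_code, ignorable):
    """Return True to skip the failed operation and continue; raise to halt."""
    if error_code in ignorable:
        return True
    raise RuntimeError(f"target returned {error_code}; application halts")


ignorable = parse_ignorable_codes("DUPLICATE_KEY, KEY_NOT_FOUND")
print(handle_target_error("DUPLICATE_KEY", ignorable))  # True
```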

Key Separator

String

:

Inserted between values when generating document keys by concatenating column or field values. If the values might contain a colon, change this to something that will not occur in those values.

Overload Retry Policy

String

retryInterval=1, maxRetries=10

With the default setting, if Cosmos DB rejects a write because it exceeds the throughput limit, the adapter will try again in one second (retryInterval). If the second attempt is unsuccessful, in one second it will try a third time, and so on through ten attempts (maxRetries). If the tenth attempt is unsuccessful, the adapter will halt and log an exception.

Negative values are not supported.

See Prevent rate-limiting errors for Azure Cosmos DB API for MongoDB operations.
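The retry behavior can be sketched as a loop driven by the parsed policy string. Connection Retry uses the same retryInterval and maxRetries settings. This is an illustration under stated assumptions, not the adapter's code: OverloadError, parse_retry_policy, and write_with_retry are hypothetical names, and maxRetries is treated here as the total number of attempts.

```python
import time


class OverloadError(Exception):
    """Hypothetical stand-in for a throughput-limit (rate-limiting) rejection."""


def parse_retry_policy(setting):
    """Parse 'retryInterval=1, maxRetries=10' into a dict of integers."""
    policy = {}
    for item in setting.split(","):
        name, _, value = item.partition("=")
        policy[name.strip()] = int(value)
    if any(v < 0 for v in policy.values()):
        raise ValueError("negative values are not supported")
    return policy


def write_with_retry(write, setting="retryInterval=1, maxRetries=10",
                     sleep=time.sleep):
    """Call write() until it succeeds or the attempts are used up."""
    policy = parse_retry_policy(setting)
    attempts = max(1, policy["maxRetries"])  # assumption: total attempts
    for attempt in range(1, attempts + 1):
        try:
            return write()
        except OverloadError:
            if attempt == attempts:
                raise                        # out of attempts: caller halts
            sleep(policy["retryInterval"])   # wait before the next attempt


print(parse_retry_policy("retryInterval=1, maxRetries=10"))
# {'retryInterval': 1, 'maxRetries': 10}
```

Passing a stub for sleep, as the signature allows, makes it possible to exercise the loop without real delays.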

Parallel Threads

Integer

See Creating multiple writer instances.

Password

com.webaction.security.Password

The password for the specified Username.

Username

String

A MongoDB user with the readWrite role on the target collection(s).