Striim 3.10.1 documentation

Snowflake Writer

Writes to one or more existing tables in Snowflake. Events are staged to local storage, Azure Storage, or AWS S3, then written to Snowflake according to the Upload Policy setting. Before Striim can write to Snowflake, the Snowflake JDBC driver must be installed as described in Installing the Snowflake JDBC driver. See OCSP Certification Checks Require Port 80 for information on firewall settings.

To evaluate Striim with Snowflake, see Getting your free trial of Striim for Snowflake.

Append Only (Boolean, default: False)

With the default value of False, updates and deletes in the source are handled as inserts in the target. With this setting:

  • Updates and deletes from DatabaseReader, IncrementalBatchReader, and SQL CDC sources are handled as inserts in the target.

  • Primary key updates result in two records in the target, one with the previous value and one with the new value. If the Tables setting has a ColumnMap that includes @METADATA(OperationName), the operation name for the first event will be DELETE and for the second INSERT.

Set to True to handle updates and deletes in the source as updates and deletes in the target. With this setting:

  • Since Snowflake does not have primary keys, you may include the keycolumns option in the Tables property to specify a column in the target table that will contain a unique identifier for each row: for example, Tables:'SCOTT.EMP,MYDB.MYSCHEMA.EMP keycolumns(EMP_NUM)'.

  • You may use wildcards for the source table provided all the tables have the key columns: for example, Tables:'DEMO.%,MYDB.MYSCHEMA.% KeyColumns(...)'.

  • If you do not specify keycolumns, Striim will concatenate all column values and use that as a unique identifier (see the sketch after this list).
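
For example, a minimal sketch of a target that replicates updates and deletes using these settings (the stream name, table names, and credentials are hypothetical):

CREATE TARGET SnowflakeMergeDemo USING SnowflakeWriter (
  ConnectionURL: '<JDBC connection string>',
  username: 'striim',
  password: '********',
  Tables: 'SCOTT.EMP,MYDB.MYSCHEMA.EMP KeyColumns(EMP_NUM)',
  appendOnly: false
)
INPUT FROM OracleCDCStream;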

Client Configuration (String)

If using a proxy, specify ProxyHost=<host name or IP address>,ProxyPort=<port number>.
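
For example (an illustrative host and port):

clientConfiguration: 'ProxyHost=203.0.113.10,ProxyPort=8080'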

Column Delimiter (String, default: | (UTF-8 007C))

The character(s) used to delimit fields in the delimited text files in which the adapter accumulates batched data. If the data will contain the | character, change the default value to a sequence of characters that will not appear in the data.

Connection URL (String)

The JDBC driver connection string for your Snowflake account.
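
For example, a typical Snowflake JDBC connection string has the following form (the account identifier, warehouse, database, and schema are placeholders):

jdbc:snowflake://<account>.snowflakecomputing.com/?warehouse=<warehouse>&db=<database>&schema=<schema>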

External Stage Type (String, default: local)

With the default value, events are staged to local storage. To stage to Azure Storage, set to AzureBlob and set the Azure Storage properties. To stage to S3, set to S3 and set the S3 properties.

File Format Options (String, default: null_if = "")

Do not change unless instructed to by Striim support, except as described for Null Marker below.

Ignorable Exception Code (String)

Set to TABLE_NOT_FOUND if you do not want the application to crash when Striim tries to write to a table that does not exist in the target database. See Handling "table not found" errors for more information.
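
For example:

ignorableExceptionCode: 'TABLE_NOT_FOUND'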

Null Marker (String)

Optionally, specify a string inserted into fields in the stage files to indicate that a field has a null value. These are converted back to nulls in the target tables. If any field might contain the string NULL, change this to a sequence of characters that will not appear in the data.

When you set a value for Null Marker, set the same value for File Format Options. For example, if Null Marker is xnullx, File Format Options must be null_if="xnullx".
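
For example, a minimal sketch pairing the two properties (assuming property names written as the display names with the spaces removed; the stream and table names are hypothetical):

CREATE TARGET SnowflakeNullDemo USING SnowflakeWriter (
  ConnectionURL: '<JDBC connection string>',
  username: 'striim',
  password: '********',
  Tables: 'MYDB.MYSCHEMA.POSDATA',
  nullMarker: 'xnullx',
  fileFormatOptions: 'null_if="xnullx"'
)
INPUT FROM PosSource_TransformedStream;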

Optimized Merge (Boolean, default: false)

Set to true only when the target's input stream is the output of an HP NonStop reader, MySQLReader, or OracleReader source, and the source events will include partial records. For example, with OracleReader, when supplemental logging has not been enabled for all columns, partial records are sent for updates. When the source events will always include full records, leave this set to false.

Parallel Threads (Integer)

See Creating multiple writer instances.

Password (encrypted password)

The password for the specified user. See Encrypted passwords.

Tables (String)

The name(s) of the table(s) to write to. The table(s) must exist in the DBMS and the user specified in Username must have insert permission.

Snowflake table names must be specified in uppercase.

If the source table has no primary key, you may use the KeyColumns option to define a unique identifier for each row in the target table: for example, Tables:'SOURCEDB.EMP,MYDB.MYSCHEMA.EMP KeyColumns(EMP_ID)'. If necessary to ensure uniqueness, specify multiple columns with the syntax KeyColumns(<column 1>,<column 2>,...). You may use wildcards for the source table, provided all the tables have the key columns: for example, Tables:'SOURCEDB.%,MYSCHEMA.% KeyColumns(...)'. If the source has no primary key and KeyColumns is not specified, the concatenated value of all source fields is used as the primary key in the target.

When the target's input stream is a user-defined event, specify a single table.

When the input stream of the target is the output of a DatabaseReader, IncrementalBatchReader, or SQL CDC source (that is, when replicating data from one database to another), it can write to multiple tables. In this case, specify the names of both the source and target tables. You may use wildcards for the table names, but not for the schema or database. For example:

source.emp,MYDB.MYSCHEMA.EMP
source.%,MYDB.MYSCHEMA.%

MySQL and Oracle names are case-sensitive; SQL Server names are not. Specify names as <schema name>.<table name> for MySQL and Oracle and as <database name>.<schema name>.<table name> for SQL Server.

See Mapping columns for additional options.

Upload Policy (String, default: eventcount:10000, interval:5m)

The upload policy may include eventcount, interval, and/or filesize (see Setting output names and rollover / upload policies for syntax). Cached data is written to the storage account every time any of the specified values is exceeded. With the default value, data will be written every five minutes or sooner if the cache contains 10,000 events. When the app is undeployed, all remaining data is written to the storage account.
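
For example, to write whenever 50,000 events have accumulated or two minutes have passed, whichever comes first (an illustrative value using the same syntax as the default):

uploadPolicy: 'eventcount:50000,interval:2m'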

Username (String)

A Snowflake user with SELECT, INSERT, UPDATE, and DELETE privileges on the tables to be written to and the CREATE TABLE privilege on the schema specified in the connection URL.

Azure Storage properties for SnowflakeWriter

Azure Account Access Key (encrypted password)

The account access key from Storage accounts > <account name> > Access keys.

Azure Account Name (String)

The name of the Azure storage account for the blob container.

Azure Container Name (String)

The blob container name from Storage accounts > <account name> > Containers.

If it does not exist, it will be created.

S3 properties for SnowflakeWriter

Specify either the access key and secret access key or an IAM role. (A configuration sketch follows these properties.)

S3 Access Key (String)

An AWS access key ID (created on the AWS Security Credentials page) for a user with read and write permissions on the bucket. Leave blank if using an IAM role.

S3 Bucket Name (String)

Specify the S3 bucket to be used for staging. If it does not exist, it will be created.

S3 IAM Role (String)

An AWS IAM role with read and write permissions on the bucket. Leave blank if using an access key.

S3 Region (String)

The AWS region of the bucket.

S3 Secret Access Key (encrypted password)

The secret access key for the access key.
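
For example, a minimal sketch of a target that stages to S3 using an IAM role (assuming property names written as the display names with the spaces removed; the bucket, region, and stream names are illustrative). An Azure staging target would instead set External Stage Type to AzureBlob and the Azure Storage properties.

CREATE TARGET SnowflakeS3Demo USING SnowflakeWriter (
  ConnectionURL: '<JDBC connection string>',
  username: 'striim',
  password: '********',
  Tables: 'MYDB.MYSCHEMA.POSDATA',
  externalStageType: 'S3',
  s3BucketName: 'my-staging-bucket',
  s3Region: 'us-west-2',
  s3IAMRole: '<IAM role ARN>'
)
INPUT FROM PosSource_TransformedStream;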

SnowflakeWriter sample application

The following sample application will write data from PosDataPreview.csv to Snowflake. The target table must exist.

CREATE SOURCE PosSource USING FileReader (
  wildcard: 'PosDataPreview.csv',
  directory: 'Samples/PosApp/appData',
    positionByEOF:false )
PARSE USING DSVParser (
  header:Yes,
  trimquote:false )
OUTPUT TO PosSource_Stream;
 
CREATE CQ PosSource_Stream_CQ
INSERT INTO PosSource_TransformedStream
SELECT TO_STRING(data[1]) AS MerchantId,
  TO_DATE(data[4]) AS DateTime,
  TO_DOUBLE(data[7]) AS AuthAmount,
  TO_STRING(data[9]) AS Zip
FROM PosSource_Stream;

CREATE TARGET SnowflakeDemo USING SnowflakeWriter (
  ConnectionURL: '<JDBC connection string>',
  username: 'striim',
  password: '********',
  Tables: 'MYDB.MYSCHEMA.POSDATA',
  appendOnly: true
)
INPUT FROM PosSource_TransformedStream;

The above example shows how to use SnowflakeWriter with an input of a user-defined type. For examples of applications where the input is the output of a CDC reader, DatabaseReader, or IncrementalBatchReader, see Replicating Oracle data to Snowflake.

SnowflakeWriter data type support and correspondence

TQL type → Snowflake type

Boolean → BOOLEAN
Byte, Integer, Long, Short → BIGINT, DOUBLE, FLOAT, INT, NUMBER, SMALLINT
DateTime → DATE, DATETIME, TIMESTAMP_LTZ, TIMESTAMP_NTZ, TIMESTAMP_TZ
Double, Float → DOUBLE, FLOAT
String → CHAR, VARCHAR

When the input of a SnowflakeWriter target is the output of an Oracle source (DatabaseReader, IncrementalBatchReader, or OracleReader):

Oracle type → Snowflake type

BINARY → (no mapping listed)
BINARY_DOUBLE → DOUBLE, FLOAT, NUMBER
BINARY_FLOAT → DOUBLE, FLOAT, NUMBER
BLOB → BINARY, VARCHAR
CHAR → CHAR, VARCHAR
CLOB → BINARY, VARCHAR
DATE → DATE
LONG → (no mapping listed)
LONG RAW → (no mapping listed)
NCHAR → CHAR, VARCHAR
NCLOB → (no mapping listed)
NVARCHAR2 → CHAR, VARCHAR
NUMBER → NUMBER(precision,scale)
RAW → (no mapping listed)
TIMESTAMP → TIMESTAMP_NTZ
TIMESTAMP WITH LOCAL TIME ZONE → TIMESTAMP_LTZ
TIMESTAMP WITH TIME ZONE → TIMESTAMP_TZ
VARCHAR2 → CHAR, VARCHAR
XMLTYPE → (no mapping listed)