Azure Synapse Writer
Writes to Azure Synapse (formerly Azure SQL Data Warehouse).
Prerequisites:
1. Deploy an Azure Synapse instance.
2. Deploy an Azure Data Lake Storage Gen2 instance to be used for staging the data. (Alternatively, use an Azure Blob storage account, though Microsoft no longer recommends this.)
3. Optionally (strongly recommended by Microsoft), connect the Azure Synapse instance and the Azure Data Lake Storage Gen2 instance with an Azure Virtual Network (VNet). See Impact of using VNet Service Endpoints with Azure storage, particularly the prerequisites and the instructions for creating a database master key.
4. Create an Azure Synapse login for use by Striim.
5. Create an Azure Synapse database scoped credential with the storage account name as the IDENTITY and the storage account access key as the SECRET. For example:
```sql
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';

CREATE DATABASE SCOPED CREDENTIAL AppCred
WITH IDENTITY = '<storage account name>',
SECRET = '<access key>';
```
You can view scoped credentials with the command:
```sql
SELECT * FROM sys.database_scoped_credentials;
```
Property | Type | Default value | Notes
---|---|---|---
Account Access Key | String | | the account access key for the Blob storage account, from Storage accounts > <account name> > Access keys |
Account Name | String | | the Blob storage account name |
Client Configuration | String | | If using a proxy, specify |
Column Delimiter | String | \| (UTF-8 124) | If the data to be written may contain the default column delimiter (ASCII / UTF-8 124), specify a different delimiter that will never appear in the data. |
Connection URL | String | | the JDBC connection URL for Azure Synapse, in the format Alternatively, you may use Active Directory authentication (see Supporting Active Directory authentication for Azure). |
Excluded Tables | String | | When a wildcard is specified for Tables, any tables specified here are excluded. For example: Tables='HR%', ExcludedTables='HRMASTER' |
Ignorable Exception Code | String | | Set to TABLE_NOT_FOUND if you do not want the application to crash when Striim tries to write to a table that does not exist in the target database. See Handling "table not found" errors for more information. |
Mode | String | MERGE | With the default value of MERGE, inserts and deletes in the source are handled as inserts and deletes in the target. Set to APPENDONLY to handle all operations as inserts. |
Parallel Threads | Integer | | |
Password | encrypted password | | The password for the specified user. See Encrypted passwords. |
Storage Access Driver Type | String | WASBS | Set to ABFS if you are using an Azure Data Lake Storage Gen2 instance for staging the data, or if you are using a general-purpose blob storage instance connected to Synapse using VNet or across a firewall. (See The Azure Blob Filesystem driver (ABFS) for more information.) Leave at the default setting WASBS if using a general-purpose V1 or V2 blob storage account without VNet or a firewall. |
Tables | String | | The name(s) of the table(s) to write to. The table(s) must exist in the DBMS and the user specified in Username must have insert permission. When the target's input stream is a user-defined event, specify a single table. If the source table has no primary key, you may use the When the input stream of the target is the output of a DatabaseReader, IncrementalBatchReader, or SQL CDC source (that is, when replicating data from one database to another), it can write to multiple tables. In this case, specify the names of both the source and target tables. You may use wildcards for the table names, but not for the schema or database. For example: source.emp,target.emp source.db1,target.db1;source.db2,target.db2 source.%,target.% source.mydatabase.emp%,target.mydb.% source1.%,target1.%;source2.%,target2.% MySQL and Oracle names are case-sensitive; SQL Server names are not. Specify names as See Mapping columns for additional options. |
Upload Policy | String | eventcount:10000, interval:5m | The upload policy may include eventcount, interval, and/or filesize (see Setting output names and rollover / upload policies for syntax). Cached data is written to the storage account every time any of the specified values is exceeded. With the default value, data will be written every five minutes, or sooner if the cache contains 10,000 events. When the app is undeployed, all remaining data is written to the storage account. |
Username | String | | the user name Striim will use to log in to the Azure Synapse instance specified in ConnectionURL |
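The Upload Policy behavior described above (flush whenever any configured threshold is exceeded) can be sketched as follows. This is an illustrative model of the policy `eventcount:10000, interval:5m`, not Striim's actual implementation; all class and method names are hypothetical:

```python
import time

class UploadPolicy:
    """Illustrative sketch: flush cached events when the event-count
    OR the elapsed-interval threshold is exceeded (hypothetical names)."""

    def __init__(self, event_count=10000, interval_seconds=300):
        self.event_count = event_count          # eventcount:10000
        self.interval_seconds = interval_seconds  # interval:5m
        self.buffered = 0
        self.last_flush = time.monotonic()

    def record_event(self):
        # Cache one incoming event.
        self.buffered += 1

    def should_flush(self, now=None):
        # Flush when ANY threshold is reached: cached events or elapsed time.
        now = time.monotonic() if now is None else now
        return (self.buffered >= self.event_count
                or now - self.last_flush >= self.interval_seconds)

    def flush(self, now=None):
        # In Striim, this is the point where cached data is written
        # to the storage account.
        self.buffered = 0
        self.last_flush = time.monotonic() if now is None else now
```

With the defaults, an idle stream flushes every five minutes, while a busy stream flushes as soon as 10,000 events are cached.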
The following sample application reads from Oracle using IncrementalBatchReader and writes to Azure Synapse.
```sql
CREATE SOURCE ibr2azdw_Source USING IncrementalBatchReader (
  Username: 'striim',
  Password: '********',
  ConnectionURL: '192.0.2.1:1521:orcl',
  Tables: 'MYSCHEMA.TABLE1',
  CheckColumn: 'MYSCHEMA.TABLE1=UUID',
  StartPosition: 'MYSCHEMA.TABLE1=1234'
)
OUTPUT TO ibr2azdw_Source_Stream;

CREATE TARGET ibr2azdw_AzureSynapseTarget1 USING AzureSQLDWHWriter (
  Username: 'striim',
  Password: '********',
  ConnectionURL: 'jdbc:sqlserver://testserver.database.windows.net:1433;database=rlsdwdb',
  Tables: 'MYSCHEMA.TABLE1,dbo.TABLE1',
  AccountName: 'mystorageaccount',
  AccountAccessKey: '********'
)
INPUT FROM ibr2azdw_Source_Stream;
```
Azure Synapse data type support and correspondence
TQL type | Azure Synapse type |
---|---|
java.lang.Byte | tinyint |
java.lang.Double | float |
java.lang.Float | float |
java.lang.Integer | int |
java.lang.Long | bigint |
java.lang.Short | smallint |
java.lang.String | char, nchar, nvarchar, varchar |
org.joda.time.DateTime | datetime, datetime2, datetimeoffset |
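The correspondence above can be treated as a lookup table when validating target schemas. The following dictionary is simply a restatement of the table for illustration; it is not part of any Striim API:

```python
# Restatement of the TQL -> Azure Synapse type table above (illustrative only).
TQL_TO_SYNAPSE = {
    "java.lang.Byte": ["tinyint"],
    "java.lang.Double": ["float"],
    "java.lang.Float": ["float"],
    "java.lang.Integer": ["int"],
    "java.lang.Long": ["bigint"],
    "java.lang.Short": ["smallint"],
    "java.lang.String": ["char", "nchar", "nvarchar", "varchar"],
    "org.joda.time.DateTime": ["datetime", "datetime2", "datetimeoffset"],
}

def compatible(tql_type: str, synapse_type: str) -> bool:
    """Return True if the Synapse column type is a documented match
    for the given TQL (Java) type."""
    return synapse_type in TQL_TO_SYNAPSE.get(tql_type, [])
```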
When the input of an Azure Synapse target is the output of a MySQL source (DatabaseReader, IncrementalBatchReader, or MySQLReader):
MySQL type | Azure Synapse type |
---|---|
bigint | bigint, numeric |
bigint unsigned | bigint |
binary | binary |
char | nchar |
date | date |
datetime | datetime, datetime2, datetimeoffset |
decimal | decimal |
decimal unsigned | decimal |
double | money, smallmoney |
float | float, real |
int | int |
int unsigned | int |
longblob | varbinary |
longtext | varchar |
mediumblob | binary |
mediumint | int |
mediumint unsigned | int |
mediumtext | varchar |
numeric unsigned | int |
smallint | smallint |
smallint unsigned | smallint |
text | varchar |
time | time |
tinyblob | binary |
tinyint | bit (if only one digit), tinyint |
tinyint unsigned | tinyint |
tinytext | varchar |
varbinary | varbinary |
varchar | nvarchar, varchar |
year | varchar |
When the input of an Azure Synapse target is the output of an Oracle source (DatabaseReader, IncrementalBatchReader, or OracleReader):
Oracle type | Azure Synapse type |
---|---|
binary_double | float |
binary_float | real |
blob | binary, varbinary |
char | char |
clob | nvarchar |
date | date |
float | float |
nchar | nchar |
nclob | varchar |
number(1) | bit |
number(10,4) | smallmoney |
number(10) | int |
number(19,4) | money |
number(19) | bigint |
number(3) | tinyint |
number(5) | char, smallint |
timestamp | datetime, datetime2, datetimeoffset |
timestamp with local timezone | datetimeoffset |
timestamp with timezone | datetimeoffset |
varchar2 | varchar |
varchar2(30) | time |
xmltype | varchar |
When the input of an Azure Synapse target is the output of a SQL Server source (DatabaseReader, IncrementalBatchReader, or MSSQLReader):
SQL Server type | Azure Synapse type |
---|---|
bigint | bigint |
binary | binary |
bit | bit, char |
date | date |
datetime | datetime |
datetime2 | datetime2 |
datetimeoffset | datetimeoffset |
decimal | decimal |
float | float |
image | varbinary |
int | int |
money | money |
nchar | nchar |
ntext | varchar |
numeric | numeric |
nvarchar | nvarchar |
real | real |
smalldatetime | smalldatetime |
smallint | smallint |
smallmoney | smallmoney |
text | varchar |
time | time |
tinyint | tinyint |
varbinary | varbinary |
varchar | varchar |
xml | varchar |