ADLS Gen1 Writer

Writes to files in Azure Data Lake Storage Gen1. A common use case is to write data from on-premises sources to an ADLS staging area, from which Azure-based analytics tools can consume it.

Note

Microsoft plans to retire the ADLS Gen1 service by February 29, 2024 (see Microsoft’s announcement). After that date, ADLS Gen1 Writer will be unusable and will be removed from subsequent Striim releases. For instructions on migrating your data, see "Azure Data Lake Storage migration guidelines and patterns" on Microsoft Learn (Learn / Azure / Storage / Blobs). To migrate a Striim application, replace the ADLS Gen1 Writer target with an ADLS Gen2 Writer target.

ADLS Gen1 Writer properties

| property | type | default value | notes |
|---|---|---|---|
| Auth token Endpoint | String | | the token endpoint URL for your web application (see "Generating the Service Principal" under Using Client Keys) |
| Client ID | String | | the application ID for your web application (see "Generating the Service Principal" under Using Client Keys) |
| Client Key | encrypted password | | the key for your web application (see "Generating the Service Principal" under Using Client Keys) |
| Compression Type | String | | Set to gzip when the input is in gzip format. Otherwise, leave blank. |
| Data Lake Store Name | String | | the name of your Data Lake Storage Gen1 account, for example, mydlsname.azuredatalakestore.net (do not include adl://) |
| Directory | String | | The full path to the directory in which to write the files. See Setting output names and rollover / upload policies for advanced options. |
| File Name | String | | The base name of the files to be written. See Setting output names and rollover / upload policies. |
| Rollover on DDL | Boolean | True | Has effect only when the input stream is the output stream of a CDC reader source. With the default value of True, rolls over to a new file when a DDL event is received. Set to False to keep writing to the same file. |
| Rollover Policy | String | eventcount:10000, interval:30s | See Setting output names and rollover / upload policies. |

This adapter has a choice of formatters. See Supported writer-formatter combinations for more information.

Data is written in 4 MB batches or whenever rollover occurs.

ADLS Gen1 Writer sample application

CREATE APPLICATION testADLSGen1;

CREATE SOURCE PosSource USING FileReader ( 
  wildcard: 'PosDataPreview.csv',
  directory: 'Samples/PosApp/appData',
  positionByEOF:false
)
PARSE USING DSVParser (
  header:Yes,
  trimquote:false
) 
OUTPUT TO PosSource_Stream;

CREATE CQ PosSource_Stream_CQ 
INSERT INTO PosSource_TransformedStream
SELECT TO_STRING(data[1]) AS MerchantId,
  TO_DATE(data[4]) AS DateTime,
  TO_DOUBLE(data[7]) AS AuthAmount,
  TO_STRING(data[9]) AS Zip
FROM PosSource_Stream;

CREATE TARGET testADLSGen1target USING ADLSGen1Writer (
  directory:'mydir',
  filename:'myfile.json',
  datalakestorename:'mydlsname.azuredatalakestore.net',
  clientid:'********-****-****-****-************',
  authtokenendpoint:'https://login.microsoftonline.com/********-****-****-****-************/oauth2/token',
  clientkey:'********************************************'
)
FORMAT USING JSONFormatter ()
INPUT FROM PosSource_TransformedStream;

END APPLICATION testADLSGen1;

Because the test data set contains fewer than 10,000 events and ADLSGen1Writer is using the default rollover policy (eventcount:10000, interval:30s), the interval condition triggers the rollover, and the data is uploaded after 30 seconds.
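
To illustrate how the rollover policy governs upload timing, the following sketch restates the sample target with the policy written out explicitly and a shorter interval. The property name rolloverpolicy is an assumption inferred from the Rollover Policy entry in the property table (display name lowercased with spaces removed, as with datalakestorename in the sample). The target name and the 10-second interval are hypothetical values chosen for illustration. Verify both the property name and the policy syntax against Setting output names and rollover / upload policies for your Striim release.

```
-- Sketch only: rolloverpolicy is an assumed property name, and
-- testADLSGen1targetFast and interval:10s are illustrative values.
CREATE TARGET testADLSGen1targetFast USING ADLSGen1Writer (
  directory:'mydir',
  filename:'myfile.json',
  datalakestorename:'mydlsname.azuredatalakestore.net',
  clientid:'********-****-****-****-************',
  authtokenendpoint:'https://login.microsoftonline.com/********-****-****-****-************/oauth2/token',
  clientkey:'********************************************',
  rolloverpolicy:'eventcount:10000,interval:10s'
)
FORMAT USING JSONFormatter ()
INPUT FROM PosSource_TransformedStream;
```

If this target replaced the one in the sample application, and assuming the policy is applied as written, the test data set would still fall short of the 10,000-event threshold, so the upload would be expected after roughly 10 seconds rather than 30.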