GCS Reader programmer's reference

GCS Reader properties

property

type

default value

notes

Bucket Name

String

The Google Cloud Storage bucket to read from.

For example: BucketName

Compression Type

String

Set to gzip when the files to be read are in gzip format. Otherwise, leave blank.

Connection Retry Policy

String

retryInterval=30, maxRetries=3

This policy determines how the adapter retries the connection after a failure. retryInterval is the wait time before the next connection attempt.

When the maxRetries count is exhausted, the reader halts the application.
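
The retry behavior described above can be sketched roughly as follows. This is an illustration only, not the adapter's implementation: the helper names (parse_retry_policy, connect_with_retry) are hypothetical, retryInterval is assumed to be in seconds, and maxRetries is assumed to count retries after the first failed attempt.

```python
import time


def parse_retry_policy(policy):
    """Parse a string such as 'retryInterval=30, maxRetries=3' into a dict of ints."""
    settings = {}
    for part in policy.split(","):
        key, _, value = part.partition("=")
        settings[key.strip()] = int(value.strip())
    return settings


def connect_with_retry(connect, policy, sleep=time.sleep):
    """Call connect() until it succeeds or the retry budget is used up.

    Assumption: retryInterval is in seconds and maxRetries counts retries
    made after the first failed attempt.
    """
    settings = parse_retry_policy(policy)
    retries_left = settings["maxRetries"]
    while True:
        try:
            return connect()
        except ConnectionError:
            if retries_left == 0:
                # Retries exhausted: re-raise so the caller can halt.
                raise
            retries_left -= 1
            sleep(settings["retryInterval"])
```

Passing sleep as a parameter keeps the sketch easy to exercise without real delays.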

Download Policy

String

DiskLimit=2048, FileLimit=10

DiskLimit=<size in MB>; FileLimit=<count>

The download of objects is throttled based on the configured limits. When one of the limits is hit, the adapter waits for already downloaded objects to be processed and deleted before attempting to download the next object.

Set DiskLimit to at least twice the size of the largest object to be read. Set a limit to 0 to disable it.

Note: This property is disregarded when UseStreaming is set to true.
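
As a rough illustration of the throttling described above, the following sketch parses a policy string and decides whether another object may be downloaded. The function names are hypothetical, and the exact point at which the adapter considers a limit "hit" is an assumption here (the sketch blocks a download that would push usage past a limit).

```python
import re


def parse_download_policy(policy):
    """Parse 'DiskLimit=2048, FileLimit=10' (comma- or semicolon-separated)."""
    limits = {}
    for part in re.split(r"[;,]", policy):
        if part.strip():
            key, _, value = part.partition("=")
            limits[key.strip()] = int(value.strip())
    return limits


def can_download(limits, disk_used_mb, files_on_disk, next_object_mb):
    """Return True if downloading the next object stays within both limits.

    Assumption: a limit of 0 disables that check, as the notes above state.
    """
    disk_limit = limits.get("DiskLimit", 0)
    file_limit = limits.get("FileLimit", 0)
    if disk_limit and disk_used_mb + next_object_mb > disk_limit:
        return False  # wait until processed objects are deleted
    if file_limit and files_on_disk + 1 > file_limit:
        return False
    return True
```

Under this reading, a DiskLimit smaller than the largest object would block that object indefinitely, which is consistent with the sizing guidance above.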

Folder Name

String

The folder path under the bucket from which objects are picked up and processed. When the property is left empty, the adapter picks up objects from the bucket's root.

Include Subfolders

Boolean

False

When you enable subfolder processing, the adapter processes the objects under the configured path and, recursively, the objects in all folders beneath it.

Object Detection Mode

Enum

GCSDirectoryListing

Choose one of these modes:

GCSDirectoryListing: The adapter fetches the metadata of all objects from the specified path (bucket, folder) and identifies the delta based on the last object fetched in the previous fetch.

GCSAuditLogNotification: The adapter fetches the "object create" audit log entries starting from the timestamp of the last object fetched in the previous fetch.

Object Filter

String

*

A wildcard pattern that is matched against the object name.

For example:

*obj.csv

*obj*

First-Object*

*

Only the "*" wildcard character is currently supported.
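
To make the matching rule concrete, here is a minimal sketch of a filter in which "*" is the only special character. The function name matches_filter is hypothetical, and the sketch assumes case-sensitive matching against the whole object name; how the adapter treats folder prefixes or letter case is not stated here.

```python
import re


def matches_filter(object_name, pattern):
    """Return True if object_name matches pattern, where only '*' is special."""
    # Escape every literal piece, then join the pieces with "match anything".
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.fullmatch(regex, object_name, flags=re.DOTALL) is not None
```

Because every other character is escaped, a pattern such as "?.csv" matches only an object literally named "?.csv".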

Parser

String

Defines how to parse data from the adapter.

Polling Interval

Integer

5000 (5 seconds)

The time in milliseconds the adapter will wait before polling for changes in the configured bucket.

Private Service Connect Endpoint

String

If using Private Service Connect with Google Cloud Storage, specify the endpoint created in the target Virtual Private Cloud, as described in Private Service Connect support for Google cloud adapters.

Project Id

String

The Google Cloud Platform project for the bucket.

Service Account Key

File

The path (from root or the Striim program directory) and file name of the JSON credentials file downloaded from Google (see the information about Service Accounts in Prerequisites). You must copy this file to the same location on each Striim server that will run this adapter, or to a network location accessible by all servers.

If you do not specify a value for this property, Striim will use the $GOOGLE_APPLICATION_CREDENTIALS environment variable.

Start Timestamp

String

In Incremental mode, this property lets you start from a particular point in time.

This property is not honored when the application resumes from recovery; the position supplied by recovery takes precedence.

Supported formats and examples:

YYYY-MM-DD'T'hh:mm:ss.sssTZD

For example:

2022-01-21T13:38:00.4

2022-01-21T13:38:00.811

File processing starts from the specified point in time: any file modified at or after the specified StartTimeStamp is processed.

Where:

  • YYYY = four-digit year

  • MM = two-digit month

  • DD = two-digit day

  • hh = two digits of hour

  • mm = two digits of minute

  • ss = two digits of second

  • sss = one to three digits representing a decimal fraction of a second

  • TZD = optional time zone
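
The format above can be checked with a small parser like the sketch below. It is an illustration under stated assumptions: the function name parse_start_timestamp is hypothetical, the fractional part is treated as required (as in the format string and both examples), and the TZD notation is assumed to be "Z" or "+hh:mm" / "-hh:mm", which this page does not spell out.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: TZD is written as 'Z' or as a +hh:mm / -hh:mm offset.
_TIMESTAMP = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})\.(\d{1,3})"
    r"(Z|[+-]\d{2}:\d{2})?"
)


def parse_start_timestamp(text):
    """Parse YYYY-MM-DD'T'hh:mm:ss.sssTZD into a datetime."""
    match = _TIMESTAMP.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported timestamp format: {text!r}")
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    # '.4' is a decimal fraction (0.4 s = 400 ms), so right-pad to three digits.
    millis = int(match.group(7).ljust(3, "0"))
    tzd = match.group(8)
    tz = None  # no TZD given: return a naive datetime
    if tzd == "Z":
        tz = timezone.utc
    elif tzd:
        sign = 1 if tzd[0] == "+" else -1
        tz = timezone(sign * timedelta(hours=int(tzd[1:3]), minutes=int(tzd[4:6])))
    return datetime(year, month, day, hour, minute, second, millis * 1000, tzinfo=tz)
```

Note that a one-digit fraction such as ".4" means four tenths of a second, not four milliseconds.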

Use Streaming

Boolean

True

Streaming is the default and recommended mode of fetching objects from a bucket. Disable this mode of operation when you want to download entire files to local storage before processing them. Parquet files require downloading.