Google Drive Reader
Note
This adapter is in preview and is available on Striim Developer only. See Striim Developer for more information.
Google Drive is a cloud-based storage service that allows users to store, access, and share files online across multiple devices. It offers free web-based applications for creating documents, spreadsheets, and presentations, with 15 GB of free storage. Users can easily upload, organize, and collaborate on files from anywhere with an internet connection.
You can use the Google Drive Reader to connect to Google Drive and read data from its objects (tables).
Feature summary
Category | Feature | Supported? | Notes |
---|---|---|---|
Objects | Standard objects | ✓ | |
Objects | Custom objects | ✓ | |
Authentication | Basic authentication | | Username and password |
Authentication | OAuth authentication | ✓ | Manual configuration based |
Authentication | Custom authentication methods | | Not all methods may be supported |
Operations | Automated mode | ✓ | |
Operations | Initial load | ✓ | |
Operations | Pull-based incremental load | ✓ | |
Operations | Push-based incremental load | | |
Operations | Automated pipeline | | |
Governance | Connection profile | | |
Governance | Sherlock AI | | |
Governance | Sentinel AI | | |
Schema handling | Initial schema creation | ✓ | Works with supported targets |
Schema handling | Schema evolution | | |
Setup | Wizard template | | |
Setup | Flow Designer | ✓ | |
Setup | Striim TQL | ✓ | |
Runtime | Resilience/recovery | ✓ | |
Runtime | Parallel execution | | |
Runtime | Metrics | ✓ | Standard metrics |
Supported authentication method
The Google Drive Reader supports OAuth authentication. To obtain the client credentials needed:
Create a project in the Google Cloud console.
Enable the Google Drive API.
To get the authorization URL for the Google Drive API, construct a URL with the following key parameters:
https://accounts.google.com/o/oauth2/v2/auth?client_id=CLIENT_ID&redirect_uri=http://127.0.0.1/&response_type=code&scope=https://www.googleapis.com/auth/drive.readonly%20https://www.googleapis.com/auth/drive.metadata%20https://www.googleapis.com/auth/drive.file&access_type=offline
CLIENT_ID: The client ID from your Google Cloud console credentials.
scope: Specifies the Google Drive API access scopes to request, separated by spaces (encoded as %20 in the URL).
redirect_uri: Indicates that the authorization code will be returned in the browser window, as the code parameter of the redirect URL.
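As an illustration of the URL construction described above, here is a minimal Python sketch that assembles the same authorization URL from its parameters. The function name `build_authorization_url` and the constants are hypothetical names chosen for this example and are not part of Striim or the Google API; the endpoint, scopes, and redirect URI are copied from the URL above.

```python
from urllib.parse import urlencode

AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"

# Scopes copied from the URL above; they are sent as one space-delimited value.
SCOPES = [
    "https://www.googleapis.com/auth/drive.readonly",
    "https://www.googleapis.com/auth/drive.metadata",
    "https://www.googleapis.com/auth/drive.file",
]


def build_authorization_url(client_id, redirect_uri="http://127.0.0.1/"):
    """Return the consent-screen URL for the given OAuth client ID.

    Hypothetical helper for illustration only.
    """
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "scope": " ".join(SCOPES),
        # Offline access is what makes a refresh token available later.
        "access_type": "offline",
    }
    # urlencode percent-encodes each value (spaces become '+').
    return AUTH_ENDPOINT + "?" + urlencode(params)
```

Calling `build_authorization_url("CLIENT_ID")` with your own client ID and opening the result in a browser should lead to the Google OAuth consent screen.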
To get a refresh token:
curl -X POST \
  https://oauth2.googleapis.com/token \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "code=[AUTHORIZATION_CODE]" \
  -d "client_id=[CLIENT_ID]" \
  -d "client_secret=[CLIENT_SECRET]" \
  -d "redirect_uri=http://127.0.0.1/" \
  -d "grant_type=authorization_code"
[AUTHORIZATION_CODE]: The authorization code you received from the Google OAuth consent screen.
[CLIENT_ID]: Your Google Cloud console client ID.
[CLIENT_SECRET]: Your Google Cloud console client secret.
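For readers who prefer a script to curl, the same token exchange can be sketched in Python with only the standard library. This is an illustrative equivalent under stated assumptions, not a Striim utility: `build_token_request` and `fetch_refresh_token` are hypothetical names, and the sketch assumes the token endpoint replies with a JSON object that carries the token under the `refresh_token` key.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

TOKEN_ENDPOINT = "https://oauth2.googleapis.com/token"


def build_token_request(code, client_id, client_secret,
                        redirect_uri="http://127.0.0.1/"):
    """Build the same POST request as the curl command, without sending it."""
    form = {
        "code": code,
        "client_id": client_id,
        "client_secret": client_secret,
        "redirect_uri": redirect_uri,
        "grant_type": "authorization_code",
    }
    return Request(
        TOKEN_ENDPOINT,
        data=urlencode(form).encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def fetch_refresh_token(code, client_id, client_secret):
    """Send the request and return the refresh token, or None if absent."""
    request = build_token_request(code, client_id, client_secret)
    with urlopen(request) as response:
        payload = json.load(response)
    # Assumption: the reply is a JSON object with a "refresh_token" key.
    return payload.get("refresh_token")
```

The value returned by `fetch_refresh_token` is what you would supply as the Refresh token property of the reader. Splitting request construction from sending keeps the form fields easy to inspect before any network call is made.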
Supported objects
The Google Drive Reader supports reading the following objects from Google Drive:
Docs
Drives
Files
Folders
Permissions
Photos
Sheets
Videos
Google Drive Reader properties
Property | Type | Default value | Notes |
---|---|---|---|
Client ID | String | | Client ID of the private app registered in the Google Cloud console. |
Client secret | Password | | Client secret of the private app registered in the Google Cloud console. |
Connection pool size | Integer | 20 | Specifies the maximum number of active connections. |
Exclude tables | String | | A list of tables excluded from read operations. Typically used to create a list of exceptions when the Tables property includes wildcards. Misconfiguration of the Tables and Exclude Tables properties can cause "Invalid table names" errors. |
Incremental load marker | String | | The incremental load marker is a unique incremental column in each object used for incremental load. When no marker is specified, tables are resynced at each polling interval. Specify the name of the column that contains the start position value. This column must meet the following criteria: |
Migrate schema | Boolean | False | Only available in Initial Load or Automated mode. Set to |
Mode | Select list | Automated | Automated mode applies incremental updates to objects that support incremental load and performs full resyncs for objects that do not support incremental load. |
Polling interval | Integer | 5m | Specifies an interval as an integer followed by a unit. Supported units are days ( |
Refresh token | Password | | An OAuth 2.0 refresh token. Use the value generated while creating the token. |
Start Position | String | %=-1 | Value of the incremental load marker that defines the initial reading position. |
Tables | String | | A semicolon-delimited (;) list of objects to read from the source. Supports the |
Thread pool count | Integer | 10 | The number of threads that run in parallel. A value of zero specifies single-threaded operation. When the thread pool count is higher than the connection pool size, large data ingestion operations can cause the app to halt. Best performance is achieved with one thread for each table being synced, so as a best practice, increase the connection pool size to match the number of threads in use. |
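The sizing guidance for Thread pool count and Connection pool size can be expressed as a small pre-flight check. The following Python sketch is purely illustrative: `check_pool_sizes` is a hypothetical helper, not a Striim API, and it encodes only the two rules stated in the table (threads should not exceed connections, and one thread per synced table performs best).

```python
def check_pool_sizes(thread_pool_count, connection_pool_size, table_count):
    """Return a list of warnings for a planned reader configuration.

    Hypothetical helper; the rules mirror the property notes above.
    """
    warnings = []
    if thread_pool_count > connection_pool_size:
        warnings.append(
            "Thread pool count exceeds connection pool size; "
            "large ingestion operations may halt the app."
        )
    # Zero means single-threaded operation, so skip the per-table hint.
    if 0 < thread_pool_count < table_count:
        warnings.append(
            "Fewer threads than tables; one thread per table "
            "gives the best performance."
        )
    return warnings
```

With the defaults from the table (10 threads, 20 connections) and 10 tables, the function returns an empty list; raising the thread count above the connection pool size produces the first warning.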