Iceberg Writer
Apache Iceberg is an open table format for organizing large analytic tables as files in data lakes. It is designed to improve performance and compliance, for example through stronger ACID guarantees, efficient recording of transactional data, and more scalable SQL operations. For more information, see the Apache Iceberg documentation.
Iceberg Writer is a Striim target adapter that writes data in Iceberg format to a data lake. The adapter requires a compute engine and a catalog.
In this release, the only supported data lake is Google Cloud Storage (GCS). This requires the Google Dataproc compute engine and a BigQuery Metastore, Nessie, or Polaris catalog. Hadoop in the GCS data lake is also supported as a catalog, but is recommended only for development and testing, as performance may not be adequate for a production environment.
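The supported combinations described above can be captured in a small pre-flight check. The following is a minimal sketch only: `IcebergTargetPlan`, `check_plan`, and the lowercase identifiers such as `bigquery_metastore` are hypothetical names chosen for illustration and are not Striim properties or APIs.

```python
from dataclasses import dataclass
from typing import List

# Values taken from the text above; the identifiers are illustrative assumptions.
SUPPORTED_DATA_LAKES = {"gcs"}
SUPPORTED_COMPUTE_ENGINES = {"dataproc"}
PRODUCTION_CATALOGS = {"bigquery_metastore", "nessie", "polaris"}
DEV_TEST_CATALOGS = {"hadoop"}  # supported, but not recommended for production


@dataclass(frozen=True)
class IcebergTargetPlan:
    """Hypothetical description of a planned deployment (not a Striim type)."""
    data_lake: str
    compute_engine: str
    catalog: str


def check_plan(plan: IcebergTargetPlan) -> List[str]:
    """Return errors and warnings for the plan; an empty list means no issues."""
    issues = []
    if plan.data_lake not in SUPPORTED_DATA_LAKES:
        issues.append(f"error: unsupported data lake '{plan.data_lake}'")
    if plan.compute_engine not in SUPPORTED_COMPUTE_ENGINES:
        issues.append(f"error: unsupported compute engine '{plan.compute_engine}'")
    if plan.catalog in DEV_TEST_CATALOGS:
        issues.append(
            f"warning: catalog '{plan.catalog}' is recommended only for "
            "development and testing"
        )
    elif plan.catalog not in PRODUCTION_CATALOGS:
        issues.append(f"error: unsupported catalog '{plan.catalog}'")
    return issues
```

With these assumptions, a plan of GCS, Dataproc, and Nessie returns an empty list, while the same plan with a Hadoop catalog returns a single warning.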
Iceberg Writer summary
Supported sources | Iceberg Writer can write data from all sources supported by Striim. |
Authentication | Iceberg Writer authenticates its connection to GCS and Google Dataproc using service account keys. |
Supported write modes | Iceberg Writer supports two write modes:
|
Additional writing features |
|
Supported staging areas | Iceberg requires a staging area to temporarily hold new data while it is being written to tables. In this release, the only staging area Iceberg Writer supports is Google Cloud Storage (GCS). |
Resilience and recovery |
|
Performance | In append-only mode, parallel threads (see Creating multiple writer instances (parallel threads)) can increase throughput to the target in certain situations. |
Programmability |
|
Metrics and auditing | Key metrics are available through Striim's monitoring features (see Data warehouse monitoring metrics and Iceberg Writer monitoring metrics). |
Drivers and other third-party libraries | Iceberg Writer uses google-cloud-storage version 2.43.1 and google-cloud-dataproc version 4.48.0. For BigQuery Metastore, it uses iceberg-bigquery-catalog version 1.5.2-1.0.1-beta. |
Key limitations |
|
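Because the Authentication row above relies on service account keys, it can help to sanity-check a key file before referencing it in an application. Google service account key files are JSON documents that include fields such as `type`, `project_id`, `private_key_id`, `private_key`, and `client_email`. The sketch below uses only the Python standard library; the function name `missing_key_fields` is hypothetical, and the check is not part of Striim or of any Google client library.

```python
import json
from pathlib import Path
from typing import List

# Fields commonly present in a Google service account key file.
REQUIRED_KEY_FIELDS = (
    "type",
    "project_id",
    "private_key_id",
    "private_key",
    "client_email",
)


def missing_key_fields(key_path: str) -> List[str]:
    """Return the names of required fields that are absent, empty, or invalid."""
    data = json.loads(Path(key_path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        # A key file must be a JSON object; treat anything else as fully invalid.
        return list(REQUIRED_KEY_FIELDS)
    problems = [field for field in REQUIRED_KEY_FIELDS if not data.get(field)]
    # A present but wrong "type" means this is some other kind of credential.
    if data.get("type") and data["type"] != "service_account":
        problems.append("type (expected 'service_account')")
    return problems
```

An empty result means only that the file has the expected shape. It does not confirm that the key is valid or that the account has the permissions needed for GCS and Dataproc.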