
Data warehouse monitoring metrics

The following monitoring metrics are returned for Azure Synapse Writer, BigQuery Writer, Databricks Writer, Fabric Data Warehouse Writer, Iceberg Writer, and Snowflake Writer.

| Metric name | Measure | Scope | Updated | Description |
| --- | --- | --- | --- | --- |
| Connection Retry Information | Total number of reconnects and the last known reconnect time | Since the application was started | | The total number of reconnects and the last known reconnect time. Visible only if a retry attempt has been made by the adapter. |
| Discarded Event Count | Total number of events discarded | Since the application was started | After each event/batch execution | The total number of events discarded across all tables. |
| Queued Batches Size In Bytes | Size of all queued batches | Since the application was started | | The size of all queued batches in bytes. For BigQuery and Snowflake only: calculated and populated only when the application is configured in streaming mode. |
| Table Write Info | JSONArray listing DWH-specific metrics | Since the application was started | After each batch execution | A list of sub-metrics showing granular table-level write info. See the Table Write Info sub-metrics table below. |
| Target Freshness | Time since a batch was executed | Since the application was started | After each batch execution | The time since a batch was executed across all tables. |
| Write Timestamp | Time when a batch was executed | Since the application was started | After each batch execution | The time the last batch was executed. |
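
These writer-level metrics are typically consumed programmatically as key/value pairs. The snippet below sketches a minimal read of such a snapshot; the payload shape and the hh:mm:ss freshness format are assumptions for illustration, not a documented schema.

```python
# Hypothetical snapshot of writer-level metrics, keyed by the metric
# names in the table above (the payload shape is an assumption).
snapshot = {
    "Discarded Event Count": 12,
    "Queued Batches Size In Bytes": 4_194_304,
    "Target Freshness": "00:02:35",   # assumed hh:mm:ss since the last batch
    "Write Timestamp": "2024-01-15 10:32:07",
}

def freshness_seconds(hhmmss: str) -> int:
    """Convert an assumed hh:mm:ss freshness string to seconds."""
    h, m, s = (int(part) for part in hhmmss.split(":"))
    return h * 3600 + m * 60 + s

print(freshness_seconds(snapshot["Target Freshness"]))  # 155
```

Adapt the parser to the format your deployment actually reports; only the metric names here come from the table above.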

Table Write Info sub-metrics

| Name | Measure | Scope | Updated | Description |
| --- | --- | --- | --- | --- |
| Avg Batch Size in bytes | Average size of a batch | Since the application was started | After each batch execution | The average size of a batch in bytes. |
| Avg Compaction Time in ms | Average compaction time | Since the application was started | After each batch execution | The overall average compaction time across all batches. |
| Avg Event Count Per Batch | Average event count in a batch | Since the application was started | After each batch execution | The average number of events present in a batch. |
| Avg Integration Time in ms | Average overall time for a batch to reach the target table | Since the application was started | After each batch execution | The average time taken to move a processed batch to the target table, including all subprocesses such as upload, compaction, and merge. |
| Avg Merge Time in ms | Average merge time | Since the application was started | After each batch execution | The overall average merge time across all batches. |
| Avg Micro Batch Count Per Batch | Average number of micro-batches | Since the application was started | After each batch execution | The average number of micro-batches present in a batch. Visible and populated only when using streaming mode in DWH adapters. |
| Avg Stage Resources Management Time in ms | Average time taken to clear/create staging resources | Since the application was started | After each batch execution | The average time taken to clear/create staging resources such as the stage table and stage area. |
| Avg Upload Time in ms | Average upload time | Since the application was started | After each batch execution | The overall average upload time across all batches. |
| Avg Waiting Time in Queue in ms | Average batch waiting time in queue | Since the application was started | After each batch execution | The average time batches spent waiting in the queue. |
| Last Applied DDL Statement | Latest DDL statement executed in the target table | Most Recent Object | After each DDL batch | The last executed DDL statement for a specific table. Visible only when a DDL execution has occurred. |
| Last Applied DDL Time | Latest DDL batch execution time | Most Recent Object | After each DDL batch | The time of the last executed DDL for a specific table. Visible only when a DDL execution has occurred. |
| Last batch info | Number of updates, inserts, deletes, and other batch-specific metrics | Most Recent Object | After each batch execution | Granular metrics from the last executed batch, including the number of updates, inserts, deletes, and other batch-specific metrics. |
| Last successful merge time | Latest batch execution time | Most Recent Object | After each batch execution | The time of the last executed task for a specific table. Updated in both Append Only and Merge modes, regardless of the name: the batch execution end time is recorded as the last successful merge time for each batch. |
| Mapped Source Table | Source table name | Most Recent Object | After each batch execution | The source table to which the target table is mapped. |
| Max Integration Time in ms | Maximum time taken for a batch to reach the target table | Since the application was started | After each batch execution | The maximum time taken for a processed batch to reach the target table. |
| Min Integration Time in ms | Minimum time taken for a batch to reach the target table | Since the application was started | After each batch execution | The minimum time taken for a processed batch to reach the target table. |
| Total Batches Created | Total batches created | Since the application was started | After each batch execution | The total number of tasks created for a specific table. |
| Total Batches Queued | Total batches queued | Since the application was started | After each batch execution | The total number of tasks currently in queue for a specific table. |
| Total Batches Uploaded | Total batches uploaded | Since the application was started | After each batch execution | The total number of tasks successfully executed for a specific table. |
| Total event info | Number of DDLs, inserts, deletes, PK updates, and total events merged | Since the application was started | After each batch execution | The overall event count for a specific table. |
| Total Micro Batches Created | Total micro-batches created | Since the application was started | After each micro-batch execution | The total number of micro-batches created for a specific table. |
| Total Micro Batches Uploaded | Total micro-batches uploaded | Since the application was started | After each micro-batch execution | The total number of micro-batches successfully executed for a specific table. |
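
Table Write Info is a JSONArray with one entry per table. Assuming each entry carries the sub-metric names above as keys (an illustrative shape, not a documented schema), the phase averages can be compared to see where integration time is spent:

```python
# One hypothetical entry from the Table Write Info JSONArray
# (keys are the sub-metric names above; the JSON shape is assumed).
table_info = {
    "Mapped Source Table": "SALES.ORDERS",
    "Avg Upload Time in ms": 1200.0,
    "Avg Compaction Time in ms": 300.0,
    "Avg Merge Time in ms": 2500.0,
    "Avg Integration Time in ms": 4500.0,
}

def phase_breakdown(info: dict) -> dict:
    """Share of average integration time spent in each sub-phase."""
    total = info["Avg Integration Time in ms"]
    phases = ("Avg Upload Time in ms",
              "Avg Compaction Time in ms",
              "Avg Merge Time in ms")
    return {p: info[p] / total for p in phases}

for phase, share in phase_breakdown(table_info).items():
    print(f"{phase}: {share:.0%}")
```

In this sketch the merge phase dominates, which would point tuning efforts at the merge step rather than the upload.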

Troubleshooting using data warehouse monitoring metrics

Poor Target Freshness:

  • Issue: If the target freshness is high (e.g., several hours or more), it indicates that the target adapter has not processed any batches for a significant amount of time.

  • Cause: This may happen if the adapter is facing connectivity issues, the system is overloaded, or there are configuration problems that prevent it from processing batches.

  • Suggestion:

    • Check the adapter logs for any connection errors or retry attempts.

    • Ensure that the adapter has the correct permissions and access to the target system.

    • Validate the system resources like memory and CPU usage to make sure the adapter is not being starved of resources.

    • Review the configuration settings and ensure the adapter is set to process batches at the required frequency.
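
The freshness check above can be automated by comparing the age of the last Write Timestamp against a threshold. The timestamp format, UTC assumption, and one-hour threshold below are illustrative choices to tune per pipeline:

```python
from datetime import datetime, timezone

FRESHNESS_THRESHOLD_S = 3600  # assumed limit: alert if no batch for an hour

def is_stale(write_timestamp: str, now: datetime) -> bool:
    """Flag the target as stale when the last batch is older than the threshold.

    Assumes Write Timestamp is a 'YYYY-MM-DD HH:MM:SS' UTC string; adapt the
    parsing to the format your deployment actually reports.
    """
    last = datetime.strptime(write_timestamp, "%Y-%m-%d %H:%M:%S")
    last = last.replace(tzinfo=timezone.utc)
    return (now - last).total_seconds() > FRESHNESS_THRESHOLD_S

now = datetime(2024, 1, 15, 12, 0, 0, tzinfo=timezone.utc)
print(is_stale("2024-01-15 10:32:07", now))  # True: ~1.5 h since the last batch
```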

High Number of Total Batches Queued / High Average Waiting Time in Queue:

  • Issue: A high number of total batches queued or a high average waiting time in queue means that events for the target table are arriving faster than the target adapter can process them, so batches accumulate in the queue and hold memory unnecessarily.

  • Cause: The incoming event rate is higher than the batch processing rate of the target adapter, leading to excessive queuing and potential delays.

  • Suggestion:

    • For optimal performance, adjust the batch policy so that the rate at which events are received aligns more closely with the processing rate of the target adapter.

    • Review the batch size, batch timeout, and other relevant policies to ensure they are configured for efficient throughput.

    • Monitor the adapter’s performance regularly to ensure that batch queuing remains under control and no memory is unnecessarily held.
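
Monitoring queue pressure can be reduced to two threshold checks on the Total Batches Queued and Avg Waiting Time in Queue metrics. The limits below are illustrative assumptions; size them to your batch policy:

```python
def queue_pressure(batches_queued: int, avg_wait_ms: float,
                   max_queued: int = 50, max_wait_ms: float = 60_000) -> list:
    """Return warnings when queue depth or waiting time exceeds the limits.

    The default limits are assumptions for illustration, not product defaults.
    """
    warnings = []
    if batches_queued > max_queued:
        warnings.append(f"{batches_queued} batches queued (limit {max_queued})")
    if avg_wait_ms > max_wait_ms:
        warnings.append(
            f"avg queue wait {avg_wait_ms:.0f} ms (limit {max_wait_ms:.0f} ms)")
    return warnings

print(queue_pressure(120, 95_000.0))  # both limits exceeded
```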

High Difference between Max Integration Time and Average Integration Time:

  • Issue: A large gap between the maximum integration time and the average integration time indicates that one or more batches took much longer to process than usual.

  • Cause: Connectivity issues, system overload, or other external factors can cause one batch to take longer, resulting in a queue buildup that slows down the processing of subsequent batches.

  • Suggestion:

    • Investigate the batch that took longer to process by reviewing the logs and checking for connectivity issues or system resource bottlenecks.

    • Optimize resource allocation to prevent such delays in the future.

    • Consider adjusting the retry and timeout settings for the adapter to handle delayed batches more efficiently.
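
One way to detect such outliers is the ratio of Max Integration Time to Avg Integration Time: a ratio well above 1 suggests a transient delay hit one batch rather than a uniformly slow pipeline. The ratio limit here is an assumption to tune:

```python
def integration_spike(max_ms: float, avg_ms: float,
                      ratio_limit: float = 5.0) -> bool:
    """Flag when the slowest batch took far longer than the average batch.

    The 5x ratio limit is an illustrative assumption, not a product default.
    """
    return avg_ms > 0 and max_ms / avg_ms > ratio_limit

print(integration_spike(max_ms=240_000, avg_ms=8_000))  # True: 30x the average
```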

High Discarded Event Count:

  • Issue: A high number of discarded events indicates that many events are not being processed due to missing table mappings or other issues.

  • Cause: This can happen if the events don’t have suitable mappings to a target table, or if a batch is ignored due to configuration or exception handling.

  • Suggestion:

    • Review the mapping configurations to ensure that all events have corresponding target tables.

    • Investigate whether the discarded events contain any critical data that should be reprocessed or mapped.
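
To put the Discarded Event Count in context, compare it against the total event count for the application (which could be taken from the Total event info sub-metric). The 1% alert level below is an illustrative threshold, not a product default:

```python
def discard_ratio(discarded: int, total_events: int) -> float:
    """Fraction of events that were discarded rather than written.

    `total_events` is assumed to come from the Total event info sub-metric.
    """
    return discarded / total_events if total_events else 0.0

ratio = discard_ratio(discarded=4_500, total_events=90_000)
print(f"{ratio:.1%} of events discarded")
if ratio > 0.01:  # assumed 1% alert level
    print("check table mappings for unmapped source tables")
```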

Slow Average Integration Time:

  • Issue: A high average integration time means that batches are generally taking longer than expected to be processed by the adapter.

  • Cause: This could be due to inefficient resource usage, connectivity issues, or an overly aggressive batch policy that leads to larger batch sizes or high batch timeout values.

  • Suggestion:

    • Monitor resource utilization (memory, CPU) to identify potential bottlenecks.

    • Check network performance and connectivity between the source and target systems.

    • Tune batch size, batch timeout, and other processing parameters to reduce integration time.

    • Partition the target tables.
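
One way to quantify "taking longer than expected" is to turn the averages into a throughput figure and compare it with the incoming event rate. This is a rough sequential estimate under assumed inputs; adapters that upload batches in parallel will sustain more:

```python
def batches_keep_up(avg_event_count: float, avg_integration_ms: float,
                    incoming_events_per_s: float) -> bool:
    """True when the estimated sustained throughput covers the incoming rate.

    Throughput is estimated as Avg Event Count Per Batch divided by
    Avg Integration Time; the incoming rate is an external measurement.
    """
    throughput = avg_event_count / (avg_integration_ms / 1000.0)
    return throughput >= incoming_events_per_s

# 10,000 events per batch at 4 s per batch -> 2,500 events/s sustained
print(batches_keep_up(10_000, 4_000.0, incoming_events_per_s=3_000))  # False
```

When this check fails persistently, the batch policy (batch size, batch timeout) or target-side resources are the usual levers.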