Glossary
Term | Definition |
|---|---|
Validation | A Validation represents a structured comparison between a Source dataset and a Target dataset. The datasets may be static or actively changing while the comparison is performed; Validata always evaluates their state at the time each run begins. The rules that govern how the comparison is executed—including scope, method, and the set of Validation Pairs—are defined in the Validation Configuration. A Validation can be run on demand or according to a recurring schedule. Each execution of the comparison is a Validation Run, which always uses the most recent configuration in effect at the time the run starts. |
Dataset | A dataset represents a structured collection of records organized in a defined format. Examples include database tables, data-warehouse tables, Kafka topics, or structured data files. In this edition, the term dataset refers specifically to tabular structures—such as database or data-warehouse tables—that Validata can read and compare during a Validation. |
Source | The Source is the dataset designated as the trusted reference or system of record for the validation process. Generally, this is the upstream dataset in a replication pipeline, or the dataset you treat as the reference for any comparison. |
Target | The Target is the dataset compared against the trusted Source dataset to identify discrepancies. Typically, this is the downstream dataset in a replication pipeline that is being validated for accuracy against the Source. |
Table | A table or a database table in a relational database represents a collection of data organized in rows and columns. The table represents the smallest logical data unit for which Validata generates a comparison report. In Validata, reporting is anchored at the table level to align with how you may naturally conceptualize your data. For example, you may want to know whether the tables in the Source are in sync or out of sync with the corresponding tables in the Target. |
Validation Configuration | The Validation Configuration defines parameters of a comparison, including:
After a Validation Configuration is created, certain attributes—specifically the Validation name, Source and Target definitions, and the Validation Type and Scope—are immutable. Other attributes can be modified, and any updates will be applied in the subsequent Validation Run. A Validation Configuration can be duplicated or deleted. |
Connection Profile | A Connection Profile is a reusable configuration object that securely stores the authentication and connection attributes required to access an external system from within Validata. It may include credentials and tokens such as usernames and passwords, OAuth tokens, API keys, access key/secret key pairs, or Entra ID (formerly Azure AD) tokens. All sensitive fields—such as passwords and tokens—are encrypted using AES-256 within Validata Historian and are never displayed in clear text, even to privileged users. A single Connection Profile can be referenced by multiple Validations. Any updates made to a Connection Profile are automatically reflected across all associated Validations. |
Validation Pair | A Validation Pair is a pair of tables that a Validation compares according to the Validation Configuration. It represents the fundamental mapping relationship established between a complete Source table and its corresponding Target table, or between subsets of the Source table and the corresponding Target table. It is configured as part of the Validation Configuration and includes a direct or derived mapping of individual columns from the Source table to the corresponding columns in the Target table. Each Validation Pair must comply with the rules of the selected Validation Type; pairs that do not conform cannot be included in the validation process. This ensures that comparisons are accurate, consistent, and aligned with the chosen validation approach. |
Column Pair | A Column Pair is a set of two columns (one from the Source table and one from the corresponding Target table in the Validation Pair) that have been explicitly mapped for data comparison during validation. Validata maps a column in the Source table to a column in the Target table so that it can evaluate and report whether the data in the Target column matches that in the Source column. |
Comparison Key or Key | A Comparison Key is a set of mapped columns defined within a Validation Pair to identify and match records between a Source table and a Target table. A Comparison Key is created by pairing one or more columns from the Source table with the corresponding columns from the Target table. These column pairs must represent the same logical data attribute so that Validata can correctly align records across both tables for comparison. For example, a Comparison Key may consist of:
Validata uses Comparison Keys to determine which records should be compared. A Source record is compared to a Target record only when their key values are identical. All Validation Types require a Comparison Key, except for Custom Validation, where it is optional but strongly recommended. |
Validation Set | A Validation Set is a collection of one or more Validation Pairs that are validated together under a single Validation Configuration. Validation Sets are useful when you want to validate multiple Validation Pairs, ensuring consistency and efficiency across the validation process. Validata automatically generates the Validation Pairs, mapping Source tables to their corresponding Target tables and then mapping the columns within each Validation Pair. |
Validation Type | Validata supports the following validation methods:
|
Validation Run | A Validation Run is a single execution of a Validation. During each run, Validata compares the Source and Target datasets specified in the Validation Configuration. For every Validation Pair in the Configuration, Validata compares the data in the mapped Source and Target tables, identifies mismatches, generates a Validation Run Report, and, when applicable, creates reconciliation scripts to help resolve data discrepancies. A Validation may include multiple runs. For example, a Validation scheduled every six hours produces four runs per day, each with its own Validation Run Report. A one-time Validation can also have multiple runs if you initiate them manually. Validata executes only one active run per Validation at any given time. You cannot start a new run while another run of the same Validation is in progress. If a Validation is scheduled to run on a recurring cadence, Validata automatically skips a scheduled run when it detects that a previous run of the same Validation is still underway. |
Validation Run Report | The Validation Run Report is the output generated at the end of each Validation Run. Validata produces a unique report for every run, regardless of the result. The report indicates which Validation Pairs are in sync and which are out of sync. |
Validation Pair Report | During a Validation Run, Validata compares the mapped Source and Target tables in every Validation Pair defined in the Validation Configuration, and generates a Validation Pair Report for every pair that was evaluated. Each report lists the records that are out of sync between the Source and Target tables and includes an optional reconciliation script that you can run directly on the data systems to address the discrepancies. |
Validation Pair comparison: Revalidation | An optional two-phase validation flow designed for continuously replicated datasets where recent Source updates may not yet appear in the Target. In the first phase, Validata performs the standard comparison and identifies All built-in validation methods (Vector, Fast Record, Full Record, Key, and Interval) support Revalidation; however, Custom Validation does not. |
Validation Pair comparison: Halting due to excessive errors | An optional guardrail that automatically stops the comparison of a Validation Pair when the ongoing percentage of |
Validation Pair comparison: Null records | A Null Record is defined within the context of a Validation Pair. It is any record in the Source or Target table of a Validation Pair that contains a null value in any of the user-defined comparison key columns. Validata automatically filters out these records during data ingestion, meaning they are not processed, compared, or included in any validation metrics or reconciliation scripts. |
Validation Pair comparison: Duplicate records | A Duplicate Record is defined within the context of a Validation Pair. It is any record in the Source or Target table whose comparison key values correspond to a Duplicate Key. A Duplicate Key is a comparison key value (or combination of values) that occurs in two or more records within either the Source or the Target table of the Validation Pair after Validata has filtered out null records. Validata excludes all Duplicate Records from its in-sync and out-of-sync comparisons between the Source and Target tables. |
Validation Pair comparison: In-Sync records | An |
Validation Pair comparison: Out-of-Sync records | An
|
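
The Null record, Duplicate record, and Comparison Key entries above describe an ordering: null-key records are filtered out first, records with duplicate keys are excluded next, and only then are Source and Target records paired by identical key values. The following Python sketch illustrates that ordering on in-memory rows. It is not Validata's implementation or API: the function names, the `key_pairs` and `column_pairs` arguments, the result labels, and the sample data are all hypothetical. It also assumes one reading of the Duplicate record entry, namely that a key duplicated in either table excludes records with that key from both tables.

```python
from collections import Counter


def non_null_records(records, key_columns):
    """Drop records that have a null (None) value in any comparison key column."""
    return [r for r in records if all(r.get(c) is not None for c in key_columns)]


def duplicate_keys(records, key_columns):
    """Return key values that occur in two or more records."""
    counts = Counter(tuple(r[c] for c in key_columns) for r in records)
    return {key for key, n in counts.items() if n >= 2}


def index_by_key(records, key_columns, excluded):
    """Index records by key, skipping any key listed in `excluded`."""
    indexed = {}
    for r in records:
        key = tuple(r[c] for c in key_columns)
        if key not in excluded:
            indexed[key] = r
    return indexed


def compare_pair(source, target, key_pairs, column_pairs):
    """Hypothetical comparison of one pair of tables (lists of dicts).

    key_pairs and column_pairs are lists of (source_column, target_column).
    """
    src_keys = [s for s, _ in key_pairs]
    tgt_keys = [t for _, t in key_pairs]

    # Step 1: filter out null records on each side.
    src = non_null_records(source, src_keys)
    tgt = non_null_records(target, tgt_keys)

    # Step 2: find duplicate keys after null filtering (assumed to apply to both sides).
    dupes = duplicate_keys(src, src_keys) | duplicate_keys(tgt, tgt_keys)
    src_by_key = index_by_key(src, src_keys, dupes)
    tgt_by_key = index_by_key(tgt, tgt_keys, dupes)

    # Step 3: compare records only when their key values are identical.
    matching, differing = [], []
    for key in sorted(src_by_key.keys() & tgt_by_key.keys()):
        s, t = src_by_key[key], tgt_by_key[key]
        same = all(s.get(sc) == t.get(tc) for sc, tc in column_pairs)
        (matching if same else differing).append(key)

    return {
        "matching": matching,
        "differing": differing,
        "source_only": sorted(src_by_key.keys() - tgt_by_key.keys()),
        "target_only": sorted(tgt_by_key.keys() - src_by_key.keys()),
        "duplicate_keys": sorted(dupes),
    }


# Illustrative data only.
source_rows = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": 20},
    {"id": 3, "amount": 30},
    {"id": 3, "amount": 31},     # duplicate key 3
    {"id": None, "amount": 5},   # null record
    {"id": 4, "amount": 40},
]
target_rows = [
    {"order_id": 1, "total": 10},
    {"order_id": 2, "total": 25},
    {"order_id": 3, "total": 30},
    {"order_id": 5, "total": 50},
]

result = compare_pair(
    source_rows,
    target_rows,
    key_pairs=[("id", "order_id")],
    column_pairs=[("amount", "total")],
)
print(result)
```

Keys are built as tuples so that the same code handles single-column and multi-column Comparison Keys. How Validata itself classifies keys that appear on only one side is not covered by this sketch; they are simply reported separately here.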