Skip to main content

Datatype comparison rules

This topic describes the rules Validata uses to compare datatypes between source and target systems during data validation. Understanding these rules helps you interpret validation results and troubleshoot out-of-sync conditions when source and target systems use different datatypes or precision settings.

Validata applies specific normalization and conversion rules depending on the datatype categories being compared. These rules ensure consistent, deterministic comparison behavior across heterogeneous data systems such as Oracle, PostgreSQL, BigQuery, Snowflake, Databricks, and SQL Server.

Numeric datatype comparisons

Validata applies different comparison strategies for numeric datatypes depending on whether the values are fixed-point, floating-point, or a combination of both.

Fixed-point numeric columns with precision and scale differences

When Validata compares fixed-point numeric values where the source and target have different scales, it rounds both sides to the smaller scale before comparing them. This approach prevents false mismatches when systems such as Oracle store fewer decimals while BigQuery or Snowflake store more.

Validata calculates the comparison scale using the following formula:

  • s₁ = scale of source.

  • s₂ = scale of target.

  • s₍comp₎ = min(s₁, s₂).

Validata rounds both values to s₍comp₎ decimal places, then performs the equality comparison. This removes scale noise and ensures the comparison reflects semantic equality rather than formatting differences.

The following example demonstrates this comparison rule:

Step

Source NUMBER(5,2)

Target DECIMAL(6,3)

Original value

112.23

112.229

Scale

s₁ = 2

s₂ = 3

Comparison scale

min(2,3) = 2

min(2,3) = 2

Rounded to 2 decimals

112.23

112.23

Result

In-Sync

Floating-point numeric columns with precision differences

When Validata compares floating-point numeric values, both endpoints are retrieved in scientific notation only. Validata takes the minimum precision of both endpoints for its comparison, with a maximum of 5 digits of precision.

Validata calculates the comparison precision using the following formula:

  • p₁ = precision of source.

  • p₂ = precision of target.

  • p₍comp₎ = min(p₁, p₂, 5).

Validata formats each value using scientific notation with p₍comp₎ digits. This guarantees consistent behavior even when Oracle, BigQuery, Snowflake, PostgreSQL, or Databricks internally represent floating-point values differently.

The following example demonstrates this comparison rule:

Step

Source BINARY_DOUBLE(3)

Target DOUBLE

Original value

1.24E+213

1.2377e+213

Precision

p₁ = 3

p₂ = 4

Comparison precision

min(3, 4, 5) = 3

min(3, 4, 5) = 3

Formatted value

1.24E+213

1.24E+213

Result

In-Sync

Fixed-point numeric columns compared with floating-point numeric columns

When a fixed-point numeric column is compared with a floating-point numeric column, Validata first converts the fixed-point value into floating-point scientific notation, then applies the floating-point comparison rule.

Validata takes the minimum precision of both endpoints, with a maximum of 5 digits of precision, and formats both values using scientific notation:

  • p₁ = precision of source fixed-point numeric.

  • p₂ = precision of target floating-point numeric.

  • p₍comp₎ = min(p₁, p₂, 5).

This ensures that fixed-point values are normalized before comparison against floating-point representations, which may include rounding or truncation differences depending on the data system.

The following example demonstrates this comparison rule:

Step

Source NUMBER(5,2)

Target FLOAT

Original value

11223

1.1223E+2

Converted to floating-point

1.1223E+5

1.1223E+2

Precision

p₁ = 5

p₂ = 4

Comparison precision

min(5, 4, 5) = 4

min(5, 4, 5) = 4

Formatted scientific value

1.122E+5

1.122E+2

Result

Out-Of-Sync

The converted fixed-point value (1.122E+5) and the floating-point value (1.122E+2) do not match after normalization, so Validata correctly reports the comparison as Out-Of-Sync.

Floating-point numeric columns compared with non-numeric columns

When a floating-point numeric column is compared with a non-numeric datatype such as VARCHAR, Validata does not apply numeric or scientific-notation comparison rules. Instead, Validata converts both values to literal strings and performs a string-to-string comparison. This rule applies to any heterogeneous mapping where the source is numeric and the target is not.

The following example demonstrates this comparison rule:

Step

Source FLOAT(5)

Target VARCHAR

Original value

1.1223E+2

"112.229"

Conversion rule

Literal string

Literal string

Converted value

"1.1223e+2"

"112.229"

Result

Out-Of-Sync

Fixed-point numeric columns compared with non-numeric columns

When a fixed-point numeric column is compared with a non-numeric datatype such as VARCHAR, Validata does not apply numeric or scale-alignment rules. Instead, Validata reads both values as literal strings and performs a direct string-to-string comparison. This rule applies to any heterogeneous mapping where the source is numeric (fixed-point) and the target is not numeric.

The following example demonstrates this comparison rule:

Step

Source NUMBER(5,2)

Target VARCHAR

Original value

112.23

"112.229"

Conversion rule

Literal string

Literal string

Converted value

"112.23"

"112.229"

Result

Out-Of-Sync

Character datatype comparisons

Validata applies different comparison strategies for character datatypes depending on whether the columns use fixed-length or variable-length storage.

Fixed-length padded character columns

When fixed-length string datatypes such as CHAR or NCHAR are involved, the physical stored values may contain right-padded spaces depending on the database system. To ensure consistent comparison across heterogeneous systems, Validata always performs a trim operation on both the source and the target before comparison. This normalization step removes trailing whitespace but does not modify the actual characters within the value.

The following example demonstrates this comparison rule:

Step

Source CHAR(10)

Target NCHAR(7)

Original value

"Hello " (with padded spaces)

"Hello " (with padded spaces)

Trim rule applied

trim("Hello ")

trim("Hello ")

Trimmed value

"Hello"

"Hello"

Result

In-Sync

Variable-length character columns

When both source and target columns use variable-length string datatypes such as VARCHAR or NVARCHAR, Validata performs an exact match comparison. No trimming is applied, and whitespace differences are treated as meaningful characters.

The following example demonstrates this comparison rule:

Step

Source VARCHAR(50)

Target NVARCHAR(10)

Original value

"Hello"

" Hello " (contains trailing spaces)

Comparison rule

Exact string match

Exact string match

Result

Out-Of-Sync

Because the strings differ (the target contains leading or trailing whitespace), Validata reports the comparison as Out-Of-Sync.

Fixed-length padded columns compared with variable-length columns

When one side uses a fixed-length padded string type such as CHAR or NCHAR and the other uses a variable-length string type, Validata applies the same trimming rule used for fixed-length comparisons. Fixed-length strings are trimmed on both sides before comparison. Variable-length strings are also trimmed when compared to a fixed-length type to ensure equality based on visible characters rather than storage padding.

This rule guarantees consistent behavior across data systems such as Oracle, SQL Server, PostgreSQL, Snowflake, BigQuery, and Databricks, regardless of how they store fixed-length character data.

The following example demonstrates this comparison rule:

Step

Source CHAR(10)

Target NVARCHAR(10)

Original value

"Hello " (stored as padded)

"Hello"

Trim rule applied

trim("Hello ")

trim("Hello")

Trimmed value

"Hello"

"Hello"

Result

In-Sync

String columns compared with non-string columns

When a string datatype such as VARCHAR, NVARCHAR, or TEXT is compared with a non-string datatype such as TIMESTAMP, DATE, or NUMBER, Validata does not apply datatype-specific parsing, casting, or normalization. Instead, Validata reads both values as literal strings and performs a string-to-string exact match comparison. This ensures consistent, deterministic behavior across heterogeneous systems where the representation of dates, timestamps, or numerics may differ.

The following example demonstrates this comparison rule:

Step

Source VARCHAR(100)

Target TIMESTAMP

Original value

"2025-11-22 10:50:11 AM"

2025-11-22 10:50:11.000

Conversion rule

Literal string

Literal string

Converted value

"2025-11-22 10:50:11 AM"

"2025-11-22 10:50:11.000"

Result

Out-Of-Sync

Date datatype comparisons

Validata applies specific normalization rules when comparing date-only datatypes across different database systems.

Date-only columns across data systems

When both the source and the target contain date-only datatypes with no time component, Validata retrieves the date from each system and normalizes it into a standard canonical format before comparison. Validata converts each value internally using TO_String(Format(value, 'yyyy-mm-dd')). This ensures that differences in how databases display dates (such as 2025-11-28 versus 28-Nov-2025) do not cause mismatches. Only the normalized date string is used for comparison.

The following example demonstrates this comparison rule:

Step

Source DATE (MySQL)

Target DATE (Snowflake)

Original value

2025-11-28

28-Nov-2025

Canonical conversion

"2025-11-28"

"2025-11-28"

Result

In-Sync

Date-only columns compared with timestamp/datetime columns

When the source contains a date-only datatype and the target contains a datetime or timestamp datatype, Validata extracts only the DATE portion from both values before comparison. Time components including hours, minutes, seconds, and milliseconds are ignored. Validata converts the extracted date portion using the canonical formatter TO_String(Format(value, 'yyyy-mm-dd')). This ensures that different representations of date-time values across systems compare solely on their calendar date.

The following example demonstrates this comparison rule:

Step

Source DATE (MySQL)

Target DATETIME

Original value

2025-11-28

2025-11-28 10:10:53.000

Extract DATE portion

"2025-11-28"

"2025-11-28"

Canonical conversion

"2025-11-28"

"2025-11-28"

Result

In-Sync

Date-only columns compared with non-date/non-datetime columns

When a date-only datatype such as DATE is compared with a non-date, non-datetime datatype such as VARCHAR, TEXT, or NUMBER, Validata treats this as heterogeneous mapping. In heterogeneous mappings, Validata does not perform date normalization or extract date parts. Instead, Validata reads both values as literal strings and performs an exact string-to-string comparison. This ensures deterministic comparison behavior even when source and target store fundamentally different datatype categories.

The following example demonstrates this comparison rule:

Step

Source DATE (Snowflake)

Target VARCHAR(100)

Original value

28-Nov-2025

2025-11-28 10:50:11 AM

Conversion rule

Literal string

Literal string

Converted value

"28-Nov-2025"

"2025-11-28 10:50:11 AM"

Result

Out-Of-Sync

Time datatype comparisons

Validata applies specific normalization and scale-alignment rules when comparing time-only datatypes.

Time-only columns across data systems

When both the source and target contain time-only datatypes such as TIME(n), Validata aligns their fractional-second precision before comparison. Validata determines the minimum scale (fractional second precision) from both endpoints using s_comp = min(s1, s2). Validata then retrieves the time value from each system, truncates it to this comparison scale, formats it using a canonical time representation, and compares the resulting strings. The canonical format used is TO_String(Format(value, 'HH:Mi:SS.[s_comp]')). This ensures that differences in microsecond or millisecond precision across systems do not cause false mismatches.

The following example demonstrates this comparison rule:

Step

Source TIME(3)

Target TIME(6)

Original value

23:59:59.112

23:59:59.111989

Scale

s₁ = 3

s₂ = 6

Comparison scale

s_comp = min(3,6) = 3

s_comp = 3

Canonical conversion

"23:59:59.112"

"23:59:59.111"

Result

Out-Of-Sync

After truncation to the comparison scale (3 fractional digits), the time values differ, so Validata reports the comparison as Out-Of-Sync.

Time-only columns compared with timestamp/datetime columns

When the source contains a time-only datatype such as TIME(n) and the target contains a datetime or timestamp datatype, Validata extracts only the TIME portion from both values before comparison. Validata does not compare the date portion of the datetime value. Only the HH:Mi:SS.sss component is used. Validata then aligns the fractional-second precision using the minimum scale and converts both sides to a canonical time format. This ensures consistent comparison even when the datetime side contains additional precision or a date component.

The following example demonstrates this comparison rule:

Step

Source TIME(3)

Target DATETIME(2)

Original value

10:10:53.112

2025-11-28 10:10:53.11

Extract TIME portion

10:10:53.112

10:10:53.11

Scale

s₁ = 3

s₂ = 2

Comparison scale

s_comp = min(3,2) = 2

s_comp = 2

Canonical conversion

"10:10:53.11"

"10:10:53.11"

Result

In-Sync

Time-only columns compared with non-time/non-datetime columns

When the source contains a time-only datatype such as TIME(n) and the target contains any non-time, non-datetime datatype such as VARCHAR, TEXT, or NUMBER, Validata treats this as heterogeneous mapping. In heterogeneous mappings, Validata does not extract or normalize time components and does not align fractional-second precision. Instead, Validata reads both values as literal strings and performs an exact string-to-string comparison. This ensures deterministic comparison behavior even when the source and target datatypes are fundamentally incompatible.

The following example demonstrates this comparison rule:

Step

Source TIME(3)

Target VARCHAR(100)

Original value

23:59:59.112

2025-11-28 10:50:11 AM

Conversion rule

Literal string

Literal string

Converted value

"23:59:59.112"

"2025-11-28 10:50:11 AM"

Result

Out-Of-Sync

DateTime and timestamp datatype comparisons

Validata applies specific normalization and scale-alignment rules when comparing datetime and timestamp datatypes.

DateTime/timestamp columns without timezone

When both the source and target contain datetime/timestamp datatypes without timezone, Validata compares them by aligning their fractional-second precision and then formatting both values into a canonical datetime string. Validata determines the minimum fractional-second scale using s_comp = min(s1, s2). Validata then truncates both datetime values to this comparison scale and formats them using the canonical pattern TO_String(Format(value, 'yyyy-mm-dd HH:Mi:SS.[s_comp]')). This ensures consistent comparison even when different systems store timestamps at varying precision levels.

The following example demonstrates this comparison rule:

Step

Source TIMESTAMP(3)

Target DATETIME2(5)

Original value

2025-11-28 10:10:53.113

2025-11-28 10:10:53.11399

Scale

s₁ = 3

s₂ = 5

Comparison scale

s_comp = min(3,5) = 3

s_comp = 3

Canonical conversion

"2025-11-28 10:10:53.113"

"2025-11-28 10:10:53.113"

Result

In-Sync

DateTime/timestamp columns compared with time-only columns

When the source contains a datetime/timestamp datatype and the target contains a time-only datatype, Validata extracts only the TIME portion from both endpoints before comparison. The date portion is ignored entirely. Validata then aligns the fractional-second precision by taking the minimum scale across both endpoints and formats both extracted time values into a canonical string before comparing. This ensures consistent comparison even when the datetime input has a different precision or contains additional date information.

The following example demonstrates this comparison rule:

Step

Source DATETIME(2)

Target TIME(3)

Original value

2025-11-28 10:10:53.11

10:10:53.112

Extract TIME portion

10:10:53.11

10:10:53.112

Scale

s₁ = 2

s₂ = 3

Comparison scale

s_comp = min(2,3) = 2

s_comp = 2

Canonical conversion

"10:10:53.11"

"10:10:53.11"

Result

In-Sync

DateTime/timestamp columns compared with date-only columns

When the source contains a datetime/timestamp datatype and the target contains a date-only datatype, Validata extracts only the DATE portion from both values before comparison. Validata ignores the time component entirely. The extracted date values are then normalized to the canonical format TO_String(Format(value, 'yyyy-mm-dd')). This ensures consistent comparison even when the datetime value contains additional precision or time information.

The following example demonstrates this comparison rule:

Step

Source DATETIME

Target DATE

Original value

2025-11-28 10:10:53.000

2025-11-28

Extract DATE portion

2025-11-28

2025-11-28

Canonical conversion

"2025-11-28"

"2025-11-28"

Result

In-Sync

DateTime/timestamp columns compared with non-time/non-date/non-datetime columns

When the source contains a datetime/timestamp datatype and the target contains any non-time, non-date, non-datetime datatype such as VARCHAR, TEXT, or NUMBER, Validata treats this as heterogeneous mapping. In heterogeneous mappings, Validata does not extract the date portion, does not extract the time portion, and does not normalize or truncate scales. Instead, both values are read as literal strings, and Validata performs a string-to-string exact match. This ensures deterministic and predictable comparison behavior even across fundamentally incompatible datatypes.

The following example demonstrates this comparison rule:

Step

Source DATETIME2(5)

Target VARCHAR(100)

Original value

2025-11-28 10:10:53.11399

2025-11-28 10:10:53 AM

Conversion rule

Literal string

Literal string

Converted value

"2025-11-28 10:10:53.11399"

"2025-11-28 10:10:53 AM"

Result

Out-Of-Sync

Timestamp with timezone comparisons

Validata applies specific normalization rules when comparing timestamp datatypes that include timezone information.

Timestamp-with-timezone columns across data systems

When both the source and target contain timestamp-with-timezone datatypes, Validata applies a multi-step normalization process:

  1. Convert both timestamps to UTC. Validata first converts each timestamp from its local offset to UTC time, ensuring both endpoints are compared on the same absolute timeline.

  2. Align fractional-second precision. Validata determines the minimum scale across both timestamp precisions using s_comp = min(s1, s2) and truncates the fractional seconds to s_comp digits.

  3. Format into a canonical UTC datetime string. Both normalized UTC timestamps are formatted as TO_String(Format(value, 'yyyy-mm-dd HH:Mi:SS.[s_comp] UTC')).

This ensures consistent, timezone-agnostic, scale-aligned comparison across systems such as Oracle, Snowflake, SQL Server, PostgreSQL, and BigQuery.

The following example demonstrates this comparison rule:

Step

Source Timestamp(5) with TZ

Target DATETIMEOFFSET(7)

Original value

2025-11-28 12:23:29.12345 +02

2025-11-28 22:23:29.1234567 +12:00

Convert to UTC

2025-11-28 10:23:29.12345

2025-11-28 10:23:29.1234567

Scale

s₁ = 5

s₂ = 7

Comparison scale

s_comp = min(5,7) = 5

s_comp = 5

Canonical UTC conversion

"2025-11-28 10:23:29.12345"

"2025-11-28 10:23:29.12345"

Result

In-Sync

Both timestamps represent the same moment in UTC, so Validata reports the comparison as In-Sync.

Timestamp-with-timezone columns compared with timestamp/datetime columns without timezone

When the source contains a timestamp-with-timezone datatype and the target contains a datetime or timestamp without timezone, Validata does not convert both values to UTC. Instead, the timezone offset from the source value is ignored, and Validata compares the local date and time portion only for both source and target. Validata then aligns precision by taking the minimum scale and formats both datetime values using the canonical pattern. This ensures consistent comparison even if the original systems stored timezone information on one side but not the other.

The following example demonstrates this comparison rule:

Step

Source Timestamp(0) with TZ

Target DATETIME2(5)

Original value

2025-11-28 12:23:29 +02

2025-11-28 10:23:29.11399

Timezone handling

Ignore timezone

No timezone

Local datetime extracted

2025-11-28 12:23:29

2025-11-28 10:23:29.11399

Scale

s₁ = 0

s₂ = 5

Comparison scale

s_comp = min(0,5) = 0

s_comp = 0

Canonical conversion

"2025-11-28 12:23:29"

"2025-11-28 10:23:29"

Result

Out-Of-Sync

The local datetime components differ, so Validata correctly reports the comparison as Out-Of-Sync.

Timestamp-with-timezone columns compared with time-only columns

When the source contains a timestamp-with-timezone datatype and the target contains a time-only datatype, Validata treats this as a time-only comparison after removing timezone context. Validata ignores the timezone offset entirely and extracts only the TIME portion from the timestamp. The date portion is also ignored. Validata then aligns fractional-second precision using the minimum scale and formats the time portion from both source and target using the canonical time format. This guarantees consistent behavior even when the timestamp-with-timezone contains higher precision than the TIME datatype.

The following example demonstrates this comparison rule:

Step

Source DATETIMEOFFSET(7)

Target TIME(3)

Original value

2025-11-28 22:23:29.1234567 +12:00

22:23:29.123

Timezone handling

Ignore timezone

No timezone

Extract TIME portion

22:23:29.1234567

22:23:29.123

Scale

s₁ = 7

s₂ = 3

Comparison scale

s_comp = min(7,3) = 3

s_comp = 3

Canonical conversion

"22:23:29.123"

"22:23:29.123"

Result

In-Sync

Timestamp-with-timezone columns compared with date-only columns

When the source contains a timestamp-with-timezone datatype and the target contains a date-only datatype, Validata simplifies the comparison by ignoring the timezone offset entirely (the timestamp's offset is discarded), extracting only the DATE portion from the timestamp (the time component is not used in the comparison), and normalizing the extracted date using the canonical formatting rule TO_String(Format(value, 'yyyy-mm-dd')). Validata then compares the normalized date string from both the source and target. This ensures consistent behavior across systems where one side stores timezone information and the other does not.

The following example demonstrates this comparison rule:

Step

Source DATETIMEOFFSET(7)

Target DATE

Original value

2025-11-28 22:23:29.1234567 +12:00

28-Nov-2025

Timezone handling

Ignore timezone

No timezone

Extract DATE portion

2025-11-28

2025-11-28

Canonical conversion

"2025-11-28"

"2025-11-28"

Result

In-Sync

Timestamp-with-timezone columns compared with non-date/non-time/non-datetime columns

When the source contains a timestamp-with-timezone datatype and the target contains any datatype that is not a date, not a time, and not a datetime (such as VARCHAR, TEXT, or NUMBER), Validata treats this as heterogeneous mapping. In heterogeneous mappings, Validata does not convert to UTC, does not extract the date, does not extract the time, and does not normalize timezone offsets. Instead, Validata reads both values as literal strings and compares them as exact strings. This guarantees deterministic comparison even when timestamps and strings contain different representations.

The following example demonstrates this comparison rule:

Step

Source Timestamp(2) with TZ

Target VARCHAR(100)

Original value

2025-11-28 12:23:29.12+00

"2025-11-28 12:23:29.12 UTC"

Conversion rule

Literal string

Literal string

Converted value

"2025-11-28 12:23:29.12+00"

"2025-11-28 12:23:29.12 UTC"

Result

Out-Of-Sync

Because the literal strings differ ("+00" versus "UTC"), Validata correctly reports the comparison as Out-Of-Sync.

Boolean datatype comparisons

Validata applies specific normalization rules when comparing boolean datatypes, including native boolean types and pseudo-boolean numeric types.

Native boolean columns

When both the source and the target contain native boolean datatypes, different systems may store or display boolean values in different textual forms such as t/f (PostgreSQL), TRUE/FALSE (Snowflake, BigQuery, SQL Server), or 1/0 (some bit/boolean implementations). To ensure consistent comparisons, Validata normalizes all native boolean values into a unified canonical representation of "true" or "false". All boolean inputs, regardless of internal or textual representation, are mapped to these lowercase standard strings before comparison.

The following example demonstrates this comparison rule:

Step

Source boolean

Target BOOLEAN

Original value

t

TRUE

Normalization rule

Map to canonical "true"/"false"

Map to canonical "true"/"false"

Normalized value

"true"

"true"

Result

In-Sync

Native boolean columns compared with pseudo-boolean numeric columns

Some databases do not support a native boolean datatype. Instead, they store boolean values using bit, tinyint, or numeric columns, typically representing TRUE as 1 and FALSE as 0. When Validata compares a native boolean datatype with a pseudo-boolean numeric datatype, it converts the native boolean value into its numeric representation (true becomes 1, false becomes 0). After conversion, Validata compares both sides as stringified numeric values, ensuring deterministic and cross-platform correctness.

The following example demonstrates this comparison rule:

Step

Source BOOLEAN

Target bit/tinyint

Original value

TRUE

1

Conversion rule

Convert boolean to 1/0

Numeric stays numeric

Converted value

"1"

"1"

Result

In-Sync

Native boolean columns compared with non-boolean/non-pseudo-boolean columns

When the source contains a native boolean datatype (such as PostgreSQL boolean or Snowflake BOOLEAN) and the target contains any datatype that is not a boolean or pseudo-boolean, Validata treats this as heterogeneous mapping. In heterogeneous mappings, Validata does not convert the boolean into numeric 1/0 (that only applies when the target is pseudo-boolean) and does not attempt boolean normalization on the target. Instead, Validata reads both values as literal strings and performs an exact string-to-string comparison. Native booleans are still normalized to "true" or "false" before comparison, while non-boolean datatypes are returned as their literal string form.

The following example demonstrates this comparison rule:

Step

Source boolean

Target BIGINT

Original value

t

225542

Conversion rule

Normalize boolean to "true"

Literal string

Converted value

"true"

"225542"

Result

Out-Of-Sync

Other datatype comparisons

For any database type not covered by the rules described in this topic, Validata reads and compares them as strings. This ensures deterministic comparison behavior for all datatypes, even when specific normalization rules do not apply.