Back to packs

BCBS 239 Risk Data Aggregation

BCBS_239free

Data quality rules aligned to the 14 principles of BCBS 239 for risk data aggregation and reporting in banking.

23 rules 1272 downloads4.6 avg (80)
bcbs-239bankingrisk-databaselregulatoryrisk-reporting
4.6(80 ratings)

Sign in to rate this pack

Test this pack with your data

Download the template, fill in your data, and see quality results instantly.

Test This Pack

Download & Install

Choose your tool — get a ready-to-run file

Run this on your data? Upload your CSV — we'll auto-map the columns, validate, and report the bad rows.Test my data
Or use the CLI
$ npx dqhub install bcbs239-risk-data --format soda --table YOUR_TABLE

About this pack

Rule pack for banks and financial institutions implementing BCBS 239 (Basel Committee Principles for Effective Risk Data Aggregation and Risk Reporting). Covers the 6 data quality principles: - Principle 2 (Data Architecture): Standardized reference data, unique identifiers - Principle 3 (Accuracy): Format validation, range checks, cross-system reconciliation - Principle 4 (Timeliness): Data freshness, ingestion delays, staleness detection - Principle 5 (Completeness): Null checks, required field validation, coverage metrics - Principle 6 (Adaptability): Schema validation, cardinality monitoring - Principle 7 (Accuracy of Reporting): Statistical stability, distribution checks Each rule includes the specific BCBS 239 principle reference for audit documentation.

Sources & References

BCBS_239 — Principle 2

Data architecture and infrastructure must use standardized reference data

GDPR — Article 44-49

Cross-border transfer validation requires valid country identification

BCBS_239 — Principle 3

Accuracy of financial reference data including account identifiers

What's included

4format rules
4statistical rules
3completeness rules
3range rules
3uniqueness rules
3freshness rules
2referential integrity rules
1volume rules

Checks included (23)

ISO Currency Code Validation(currency_code)

Validates that values are valid ISO 4217 currency codes (e.g., USD, EUR, GBP, JPY)

ISO Country Code Validation(country_code)

Validates that values are valid ISO 3166-1 alpha-2 country codes (e.g., US, GB, DE, FR)

IBAN Format Validation(iban)

Validates International Bank Account Number format — 2-letter country code, 2 check digits, and up to 30 alphanumeric characters

Valid Email Format(email)

Validates that values conform to a simplified RFC 5322 email address format. Checks for a local part containing alphanumeric characters and common special characters, an @ symbol, and a domain with at least one dot-separated label.

Mean In Range

Asserts that the arithmetic mean of a numeric column falls within an expected range. Detects data drift, calculation errors, or upstream changes that shift the central tendency of key metrics.

Standard Deviation Stable

Asserts that the standard deviation of a numeric column has not changed more than the specified percentage from a known baseline. Detects changes in data variability that may indicate corrupted data, changed source systems, or process failures.

Cardinality Check

Asserts that the number of distinct values in a column falls within an expected range. Detects issues such as collapsed categories (too few distinct values), data explosion (too many), or enum drift from upstream changes.

Distribution Check

Asserts that the value distribution of a categorical column matches expected proportions within a given tolerance. For example, verifying that 60% of records are type A, 30% type B, and 10% type C. Detects upstream labeling changes, data skew, or ingestion bias.

Column Not Null

Asserts that a specified column contains no null values. This is the most fundamental completeness check — every row must have a value present in the target column.

Column Completeness Threshold

Asserts that a column meets a minimum completeness threshold, measured as the percentage of non-null values. Useful when some nulls are acceptable but the overall population rate must stay above a defined level (e.g., 95%).

Required Fields for Status

Asserts that when a status column has a specific value (e.g., 'active'), a set of required fields must all be populated (non-null). Enforces lifecycle-based data completeness rules where later stages demand richer data.

Non-Negative Values

Validates that a numeric column contains no negative values. Common for quantities, counts, amounts, durations, and other measures that should never be negative.

Value In Range

Validates that all values in a numeric column fall within a specified minimum and maximum range (inclusive). Catches data entry errors, unit conversion issues, and out-of-bounds values.

Amount Limit

Validates that financial amounts do not exceed a configurable limit. Catches data entry errors, currency conversion issues, and fraudulent transactions. Supports both upper and optional lower bounds.

Column Unique

Validates that all non-null values in a specified column are unique. Useful for natural keys, email addresses, identifiers, and any column where duplicates indicate a data quality issue.

Composite Unique

Validates that the combination of specified columns forms a unique key. Ensures no two rows share the same values across all listed columns, enforcing composite key integrity.

Primary Key Valid

Validates that a column qualifies as a valid primary key by ensuring all values are both unique and not null. Combines uniqueness and completeness checks into a single rule.

Table Freshness

Asserts that a table has been updated within the specified number of hours. Uses the table's metadata (last modified timestamp) or a designated timestamp column to verify data is fresh and pipelines are running on schedule.

Column Max Age

Asserts that the most recent value in a date/timestamp column is within the specified number of hours from the current time. Useful for verifying that new data is arriving as expected in date-partitioned or event-driven tables.

Ingestion Delay

Asserts that the time difference between the source event timestamp and the load/ingestion timestamp is within the defined SLA. Detects pipeline lag, backpressure, or ingestion failures that cause data to arrive late.

Foreign Key Valid

Validates that all non-null values in a foreign key column exist in the referenced parent table's primary key column. Detects orphaned references that break referential integrity.

Cross-Table Count Match

Validates that row counts or aggregated values match between related tables. For example, the number of order line items should match the item_count on the order header, or the sum of transaction amounts should match the account balance.

Schema Column Count

Asserts that a table has the expected number of columns. Detects unintended schema changes such as dropped columns, added columns from upstream migrations, or schema drift between environments.