Policy Number Format

format · high

Validates that policy numbers conform to a standard alphanumeric format between 10 and 20 characters. Accepts uppercase letters, digits, and hyphens. Ensures policy identifiers are consistent for binding, endorsement tracking, and claims linkage.
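As a rough illustration of the rule described above, the check can be sketched in plain Python. The regular expression is an assumption inferred from the description (10 to 20 characters, each an uppercase letter, digit, or hyphen, with hyphens counting toward the length); the names `POLICY_NUMBER_PATTERN` and `is_valid_policy_number` are hypothetical and not part of dqhub.

```python
import re

# Assumed pattern, inferred from the rule description: 10 to 20 characters,
# each an uppercase letter, a digit, or a hyphen (hyphens count toward length).
POLICY_NUMBER_PATTERN = re.compile(r"[A-Z0-9-]{10,20}")


def is_valid_policy_number(value):
    """Return True if value is a string matching the assumed policy format."""
    if not isinstance(value, str):
        return False
    # fullmatch requires the whole string to match, so no ^ or $ is needed.
    return POLICY_NUMBER_PATTERN.fullmatch(value) is not None
```

Under these assumptions, a value such as `POL-2024-000123` would be accepted, while lowercase, too-short, or space-separated values would be rejected.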

v1.0.0 · by dqhub · 1,177 downloads · 4.7 (60)

policy, insurance, identifier

Parameters

column_name (string, required)

The column containing policy numbers

threshold (float, default: 0.99)

Minimum fraction of valid policy numbers (0.0 to 1.0)
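One way to read the `threshold` parameter is as a lower bound on the share of values in the column that match the policy number format. The sketch below assumes the same pattern implied by the rule description, counts nulls and non-strings as invalid, and treats an empty column as passing; those choices and the function names are illustrative assumptions, not documented dqhub behavior.

```python
import re

# Assumed pattern: 10 to 20 uppercase letters, digits, or hyphens.
POLICY_NUMBER_PATTERN = re.compile(r"[A-Z0-9-]{10,20}")


def valid_fraction(values):
    """Return the fraction of values that match the assumed format."""
    values = list(values)
    if not values:
        return 1.0  # assumption: an empty column passes vacuously
    valid = sum(
        1
        for v in values
        # assumption: nulls and non-string values count as invalid
        if isinstance(v, str) and POLICY_NUMBER_PATTERN.fullmatch(v)
    )
    return valid / len(values)


def passes_threshold(values, threshold=0.99):
    """Return True if the valid fraction is at least the threshold."""
    return valid_fraction(values) >= threshold
```

With the default of 0.99, a column of 100 values tolerates at most one invalid policy number under this reading.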

Install

soda

checks for {{table_name}}:
  - invalid_percent({{column_name}}) < {{(1 - threshold) * 100}}:
      valid regex: '^[A-Z0-9-]{10,20}$'

dbt

{% test valid_policy_number(model, column_name) %}
select {{ column_name }}
from {{ model }}
where {{ column_name }} not regexp '^[A-Z0-9-]{10,20}$'
{% endtest %}

sql

SELECT COUNT(*) as total,
  SUM(CASE WHEN {{column_name}} REGEXP
    '^[A-Z0-9-]{10,20}$'
    THEN 1 ELSE 0 END) as valid
FROM {{table_name}}

Great Expectations

{
  "expectation_type": "expect_column_values_to_match_regex",
  "kwargs": {
    "column": "{{column_name}}",
    "regex": "^[A-Z0-9-]{10,20}$",
    "mostly": {{threshold}}
  }
}

spark

from pyspark.sql.functions import col
pattern = r'^[A-Z0-9-]{10,20}$'
invalid = df.filter(~col("{{column_name}}").rlike(pattern)).count()

Test Data

Passing Examples

id	value
1	POL-2024-000123
2	HO1234567890
3	AUTO-98765-NY

Failing Examples

id	value
1	POL-123
2	pol-2024-000123
3	POL 2024 000123

CLI

Terminal

npx dqhub install policy-number-format --format soda --table YOUR_TABLE
npx dqhub install policy-number-format --format dbt --model YOUR_MODEL
npx dqhub install policy-number-format --format sql --dialect snowflake