
CCPA Data Retention Period Limit

freshness · high

Validates that personal data records do not exceed their stated retention period from the collection date. Under CPRA, businesses must disclose retention periods and must not retain personal information longer than reasonably necessary for the disclosed purpose. Records past their retention limit must be flagged for deletion.
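
The retention check described above can be sketched in a few lines of Python. This is a minimal illustration, not the rule's actual implementation: the `Record` class and its fields `collected_on` and `retention_days` are hypothetical names, and the sketch assumes a record is still within its limit on the exact day the period ends.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Record:
    record_id: int
    collected_on: date    # hypothetical field: date the data was collected
    retention_days: int   # hypothetical field: disclosed retention period in days


def is_past_retention(record: Record, today: date) -> bool:
    """Return True when the record has outlived its stated retention period."""
    expires_on = record.collected_on + timedelta(days=record.retention_days)
    # Assumption: the expiry day itself still counts as within the period.
    return today > expires_on


def flag_for_deletion(records: list[Record], today: date) -> list[int]:
    """Collect the ids of records that are past their retention limit."""
    return [r.record_id for r in records if is_past_retention(r, today)]


records = [
    Record(1, date(2023, 1, 1), 365),  # expired on 2024-01-01
    Record(2, date(2024, 6, 1), 365),  # valid until 2025-06-01
]
flagged = flag_for_deletion(records, today=date(2024, 12, 31))
print(flagged)  # [1]
```

Passing `today` explicitly rather than calling `date.today()` inside the function keeps the check deterministic and easy to test.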

v1.0.0 · by dqhub · 592 downloads · 3.9 (24)
Tags: ccpa, retention, privacy, compliance, cpra, data-lifecycle, storage-limitation

Parameters

column_name (string, required)

The column containing email addresses

threshold (float, default: 0.99)

Minimum fraction of valid emails (0.0 to 1.0)
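
How `threshold` turns counts into a pass or fail result can be sketched as follows. The function name `passes_threshold` is hypothetical, and treating an empty table as passing is an assumption rather than documented behavior.

```python
def passes_threshold(valid: int, total: int, threshold: float = 0.99) -> bool:
    """Return True when the fraction of valid values meets the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    if total == 0:
        # Assumption: an empty table has nothing to fail on, so it passes.
        return True
    return valid / total >= threshold


print(passes_threshold(valid=99, total=100))  # True: exactly at the default 0.99
print(passes_threshold(valid=98, total=100))  # False: 0.98 is below 0.99
```

The `valid` and `total` arguments correspond to the two counts returned by the sql snippet in the Install section.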

Compliance Mapping

CPRA: Cal. Civ. Code 1798.100(a)(3) (Retention Disclosure)

VCDPA: Va. Code 59.1-578(A)(2) (Data Minimization)

CCPA: Cal. Civ. Code 1798.100 (Right to Know)

CPA: C.R.S. 6-1-1308(3) (Purpose Limitation)

Install

soda
checks for {{table_name}}:
  - invalid_percent({{column_name}}) < {{(1 - threshold) * 100}}:
      valid regex: '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
dbt
{% test valid_email(model, column_name) %}
select {{ column_name }}
from {{ model }}
where {{ column_name }} not regexp '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$'
{% endtest %}
sql
SELECT COUNT(*) as total,
  SUM(CASE WHEN {{column_name}} REGEXP
    '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$'
    THEN 1 ELSE 0 END) as valid
FROM {{table_name}}
Great Expectations
{
  "expectation_type": "expect_column_values_to_match_regex",
  "kwargs": {
    "column": "{{column_name}}",
    "regex": "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$",
    "mostly": {{threshold}}
  }
}
spark
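from pyspark.sql.functions import col
# `df` is the DataFrame that holds {{column_name}}
pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
# Count rows that fail the pattern; NULLs are skipped because rlike returns NULL for them
invalid = df.filter(~col("{{column_name}}").rlike(pattern)).count()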

Test Data

Passing Examples

id  value
1   alice@example.com
2   bob.smith@company.co.uk
3   charlie+tag@domain.org

Failing Examples

id  value
1   not-an-email
2   @missing-local.com
3   spaces in@email.com
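
The examples above can be checked against the regex from the Install snippets with Python's standard `re` module. This is a small sketch for local verification; the helper name `is_valid` is hypothetical.

```python
import re

# Same pattern as in the Install snippets.
EMAIL_PATTERN = re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')


def is_valid(value: str) -> bool:
    """Return True when the value matches the email pattern."""
    return EMAIL_PATTERN.match(value) is not None


passing = ["alice@example.com", "bob.smith@company.co.uk", "charlie+tag@domain.org"]
failing = ["not-an-email", "@missing-local.com", "spaces in@email.com"]

passing_results = [is_valid(v) for v in passing]
failing_results = [is_valid(v) for v in failing]
print(passing_results)  # [True, True, True]
print(failing_results)  # [False, False, False]
```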

CLI

Terminal
npx dqhub install ccpa-data-retention-limit --format soda --table YOUR_TABLE
npx dqhub install ccpa-data-retention-limit --format dbt --model YOUR_MODEL
npx dqhub install ccpa-data-retention-limit --format sql --dialect snowflake