Rate Units Are Valid
format
medium
The price/rate units must be a valid FERC unit (e.g. $/MWH, $/MW-DAY, $/KW-MO, FLAT RATE).
v1.0.0 by dqhub
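One way to read the rule description is as an allow-list check on the rate-unit column. The following is a minimal sketch under stated assumptions, not the rule's shipped implementation: the allow-list holds only the four units named in the description (the complete FERC list is longer and is not reproduced here), matching is case-sensitive, and the names `EXAMPLE_RATE_UNITS`, `is_valid_rate_unit`, and `invalid_rate_units` are hypothetical.

```python
# Hypothetical allow-list: only the four units named in the rule description.
# The full set of FERC-accepted rate units is longer and must be filled in.
EXAMPLE_RATE_UNITS = frozenset({"$/MWH", "$/MW-DAY", "$/KW-MO", "FLAT RATE"})


def is_valid_rate_unit(value, allowed=EXAMPLE_RATE_UNITS):
    """Return True if value, after trimming whitespace, is in the allow-list."""
    if value is None:
        # Assumption: a missing unit counts as invalid.
        return False
    return value.strip() in allowed


def invalid_rate_units(values, allowed=EXAMPLE_RATE_UNITS):
    """Return the values that are not in the allow-list, in input order."""
    return [v for v in values if not is_valid_rate_unit(v, allowed)]
```

Trimming whitespace and rejecting lowercase input are design choices made for this sketch; a real check would follow whatever normalization the reporting pipeline already applies.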
Parameters
column_name (string, required): The column containing email addresses
threshold (float, default: 0.99): Minimum fraction of valid emails (0.0 to 1.0)
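To make the `threshold` parameter concrete, here is a minimal Python sketch of how a valid fraction could be computed and compared against it. It reuses the regular expression from the install snippets on this page. The function names `valid_fraction` and `passes` are hypothetical, and two behaviours are assumptions of the sketch rather than documented rule semantics: null values are skipped, and an empty column counts as fully valid.

```python
import re

# Pattern copied from the install snippets on this page.
PATTERN = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")


def valid_fraction(values):
    """Fraction of non-null values matching PATTERN (1.0 for empty input)."""
    checked = [v for v in values if v is not None]  # assumption: skip nulls
    if not checked:
        return 1.0  # assumption: nothing to check means nothing invalid
    matched = sum(1 for v in checked if PATTERN.fullmatch(v))
    return matched / len(checked)


def passes(values, threshold=0.99):
    """True when the valid fraction meets the minimum threshold."""
    return valid_fraction(values) >= threshold
```

With the default of 0.99, a column passes only if at most 1% of its checked values fail the pattern; lowering `threshold` tolerates more invalid rows.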
Install
soda
checks for {{table_name}}:
  - invalid_percent({{column_name}}) < {{(1 - threshold) * 100}}:
      valid regex: '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
dbt
{% test valid_email(model, column_name) %}
select {{ column_name }}
from {{ model }}
where {{ column_name }} not regexp '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$'
{% endtest %}
sql
SELECT COUNT(*) AS total,
       SUM(CASE WHEN {{column_name}} REGEXP
                '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$'
           THEN 1 ELSE 0 END) AS valid
FROM {{table_name}}
Great Expectations
{
  "expectation_type": "expect_column_values_to_match_regex",
  "kwargs": {
    "column": "{{column_name}}",
    "regex": "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$",
    "mostly": {{threshold}}
  }
}
spark
from pyspark.sql.functions import col
pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
invalid = df.filter(~col("{{column_name}}").rlike(pattern)).count()
Test Data
Passing Examples
| id | value |
|---|---|
| 1 | alice@example.com |
| 2 | bob.smith@company.co.uk |
| 3 | charlie+tag@domain.org |
Failing Examples
| id | value |
|---|---|
| 1 | not-an-email |
| 2 | @missing-local.com |
| 3 | spaces in@email.com |
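The example rows above can be checked against the pattern used in the install snippets. This is a small sketch using only Python's standard `re` module; the helper names `matches` and `check_examples` are hypothetical and exist only for illustration.

```python
import re

# Pattern copied from the install snippets on this page.
PATTERN = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")

# Values taken from the Passing Examples and Failing Examples tables.
PASSING = ["alice@example.com", "bob.smith@company.co.uk", "charlie+tag@domain.org"]
FAILING = ["not-an-email", "@missing-local.com", "spaces in@email.com"]


def matches(value):
    """True if the whole string matches PATTERN."""
    return PATTERN.fullmatch(value) is not None


def check_examples():
    """Return (passing rows that did not match, failing rows that did match)."""
    wrongly_rejected = [v for v in PASSING if not matches(v)]
    wrongly_accepted = [v for v in FAILING if matches(v)]
    return wrongly_rejected, wrongly_accepted


if __name__ == "__main__":
    # Both lists are empty when the pattern agrees with the tables.
    print(check_examples())
```

Running the same fixture rows through each installed format (soda, dbt, sql, Great Expectations, spark) is a reasonable way to confirm the regex dialects behave alike, since escaping rules differ between engines.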
CLI
Terminal
npx dqhub install ferc-eqr-reporting-eqr-rate-units-valid --format soda --table YOUR_TABLE
npx dqhub install ferc-eqr-reporting-eqr-rate-units-valid --format dbt --model YOUR_MODEL
npx dqhub install ferc-eqr-reporting-eqr-rate-units-valid --format sql --dialect snowflake