Pharmaceutical & Clinical Trial Data

FDA_21CFR11 · free

Validate drug codes, CDISC/SDTM clinical trial data, adverse events, and GMP batch records against FDA and ICH standards.

12 rules · 1752 downloads · 4.2 avg (167 ratings)
Tags: pharma, fda, cdisc, sdtm, ndc, clinical-trial, gmp, adverse-event
Test this pack with your data

Download the template, fill in your data, and see quality results instantly.

Download & Install

Choose your tool — get a ready-to-run file

Run this on your data? Upload your CSV and we'll auto-map the columns, validate, and report the bad rows.
Or use the CLI
$ npx dqhub install pharma-clinical-data --format soda --table YOUR_TABLE

About this pack

Data quality rules for pharmaceutical and life sciences organizations. Covers:
- NDC (National Drug Code) format validation (10-digit and 11-digit HIPAA)
- CDISC/SDTM clinical trial data: domain codes, sex values, ISO 8601 dates
- Adverse event reporting: severity and outcome controlled terminology (ICH E2B)
- GMP batch/lot number format validation (FDA 21 CFR 211)

Based on freely available FDA, CDISC, and ICH specifications.

Sources & References

The NDC is a unique 10-digit, 3-segment identifier assigned to each drug product listed under the Federal Food, Drug, and Cosmetic Act

HIPAA — 45 CFR 162.1002

NDC is used as a standard medical data code set for reporting drug products in HIPAA-covered transactions

SDTM domain codes are defined by CDISC and required for FDA electronic submissions

FDA — FDA Study Data Technical Conformance Guide

FDA requires SDTM-formatted datasets using standard domain codes for NDA/BLA electronic submissions

The SEX variable in SDTM DM domain is bound to CDISC CT codelist C66731

ICH — ICH E6(R2) Good Clinical Practice

Clinical trial data must use standardized date formats for regulatory submissions

ICH — ICH E2B(R3) Individual Case Safety Report

Outcome of reaction/event at the time of last observation must use the defined controlled terminology

Batch production and control records must include a distinctive batch or lot number per 21 CFR 211.188

EU GMP — EudraLex Volume 4 Annex 11

Batch numbering must follow a defined system to ensure unique identification of each batch

What's included

9 format rules
2 completeness rules
1 uniqueness rule

Checks included (12)

National Drug Code 10-Digit Format (ndc_code)

Validates that values conform to the FDA National Drug Code (NDC) 10-digit format with hyphens. The NDC uniquely identifies a drug product and supports three segment patterns: 4-4-2 (labeler-product-package), 5-3-2, and 5-4-1. All three formats are accepted by the FDA NDC Directory.
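A minimal sketch of how the three hyphenated layouts described above could be checked. The pattern and the function name `is_ndc10` are illustrative assumptions, not the pack's actual implementation.

```python
import re

# Assumed pattern covering the three hyphenated 10-digit layouts:
# 4-4-2, 5-3-2, and 5-4-1 (labeler-product-package).
NDC10_PATTERN = re.compile(
    r"(\d{4}-\d{4}-\d{2}|\d{5}-\d{3}-\d{2}|\d{5}-\d{4}-\d{1})",
    re.ASCII,
)

def is_ndc10(value: str) -> bool:
    """Return True if value matches one of the three 10-digit NDC layouts."""
    return NDC10_PATTERN.fullmatch(value) is not None

print(is_ndc10("12345-678-90"))   # a 5-3-2 code
print(is_ndc10("1234567890"))     # no hyphens, rejected by this sketch
```

This checks format only; it does not confirm that a code is actually listed in the FDA NDC Directory.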

NDC 11-Digit HIPAA Format (ndc_code)

Validates that values conform to the 11-digit zero-padded NDC format required for HIPAA billing transactions. The HIPAA standard normalizes all NDC codes to a 5-4-2 segment format by adding a leading zero to whichever segment is shorter than its fixed width. This format is mandatory for electronic pharmacy claims and billing systems.
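The zero-padding step can be sketched as follows. This assumes hyphenated input and output; the names `to_ndc11` and `is_ndc11` are hypothetical, and billing systems that expect an unhyphenated 11-digit string would need the hyphens stripped afterwards.

```python
import re

# A 10-digit NDC in one of the layouts 4-4-2, 5-3-2, or 5-4-1.
NDC10_SEGMENTS = re.compile(r"(\d{4,5})-(\d{3,4})-(\d{1,2})", re.ASCII)
# The normalized 5-4-2 layout.
NDC11_PATTERN = re.compile(r"\d{5}-\d{4}-\d{2}", re.ASCII)

def to_ndc11(ndc10: str) -> str:
    """Left-pad each segment with zeros to reach the 5-4-2 layout."""
    match = NDC10_SEGMENTS.fullmatch(ndc10)
    # Exactly 10 digits in total restricts the input to the three valid layouts.
    if match is None or sum(len(g) for g in match.groups()) != 10:
        raise ValueError(f"not a hyphenated 10-digit NDC: {ndc10!r}")
    labeler, product, package = match.groups()
    return f"{labeler.zfill(5)}-{product.zfill(4)}-{package.zfill(2)}"

def is_ndc11(value: str) -> bool:
    """Return True if value is already in the 5-4-2 layout."""
    return NDC11_PATTERN.fullmatch(value) is not None

print(to_ndc11("1234-5678-90"))  # the 4-digit labeler gains a leading zero
```

Only one segment is ever short in a valid 10-digit code, so `zfill` changes exactly one segment per conversion.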

CDISC SDTM Domain Code (domain)

Validates that values are valid CDISC SDTM domain abbreviations. Each SDTM domain is a two-character code representing a specific category of clinical trial data (e.g., DM for Demographics, AE for Adverse Events, LB for Laboratory Test Results). This rule checks against the standard domain codes defined in the SDTM Implementation Guide.

CDISC SDTM Sex Controlled Terminology (sex)

Validates that values conform to CDISC SDTM controlled terminology for the SEX variable in the Demographics (DM) domain. Accepted values are M (Male), F (Female), U (Unknown), and UNDIFFERENTIATED. Values are case-sensitive uppercase per CDISC controlled terminology standards.
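As an illustration, a case-sensitive membership check against the four values listed above might look like this. The set and the function name `is_valid_sex` are assumptions for the sketch; the terms should be confirmed against the controlled terminology version used in a given study.

```python
# The four values named in the rule description (assumed complete for this sketch).
SEX_TERMS = frozenset({"M", "F", "U", "UNDIFFERENTIATED"})

def is_valid_sex(value: str) -> bool:
    """Case-sensitive check: 'm' or 'Male' are rejected, 'M' is accepted."""
    return value in SEX_TERMS

for candidate in ("M", "m", "Male", "UNDIFFERENTIATED"):
    print(candidate, is_valid_sex(candidate))
```

No case folding or trimming is applied, which mirrors the case-sensitive behavior the rule describes.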

CDISC SDTM ISO 8601 Date Format (study_start_date)

Validates that date/datetime values conform to the ISO 8601 format required by CDISC SDTM. Partial dates are permitted as SDTM allows incomplete dates (e.g., year only, year-month). Supported patterns include YYYY, YYYY-MM, YYYY-MM-DD, YYYY-MM-DDThh, YYYY-MM-DDThh:mm, and YYYY-MM-DDThh:mm:ss.
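The six supported patterns nest naturally, since each one extends the previous with one more component. A possible regular expression for them is sketched below; the pattern and the name `is_sdtm_date` are illustrative assumptions, and the component range checks (month 01-12, hour 00-23, and so on) go slightly beyond a pure shape check.

```python
import re

# Each optional group adds one more component:
# YYYY, -MM, -DD, Thh, :mm, :ss
SDTM_DATE_PATTERN = re.compile(
    r"\d{4}"
    r"(-(0[1-9]|1[0-2])"
    r"(-(0[1-9]|[12]\d|3[01])"
    r"(T([01]\d|2[0-3])"
    r"(:[0-5]\d"
    r"(:[0-5]\d)?)?)?)?)?",
    re.ASCII,
)

def is_sdtm_date(value: str) -> bool:
    """Return True for full or partial ISO 8601 dates in the listed patterns."""
    return SDTM_DATE_PATTERN.fullmatch(value) is not None

for candidate in ("2023", "2023-07", "2023-07-15T13:45", "2023-7-15"):
    print(candidate, is_sdtm_date(candidate))
```

This sketch does not verify calendar validity (for example, it accepts 2023-02-30), and it does not cover time zone offsets or other ISO 8601 forms outside the six patterns listed.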

Adverse Event Severity Controlled Terminology (severity)

Validates that adverse event severity values conform to CDISC controlled terminology. The severity of an adverse event is classified as MILD, MODERATE, or SEVERE per ICH E6 guidelines and CDISC SDTM controlled terminology for the AESEV variable.

Adverse Event Outcome Controlled Terminology (outcome)

Validates that adverse event outcome values conform to ICH E2B(R3) controlled terminology. The outcome describes the status of the patient at the time of last observation and is required for Individual Case Safety Reports (ICSRs) submitted to regulatory authorities.

Pharmaceutical Batch/Lot Number Format (batch_number)

Validates that batch or lot numbers conform to a common pharmaceutical manufacturing format. Batch numbers must contain only uppercase letters, digits, and hyphens, and be between 5 and 20 characters long. FDA 21 CFR Part 211 (Current Good Manufacturing Practice) requires a distinctive batch or lot number but does not prescribe a specific format, so the character set and length limits here are this pack's convention.
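The character set and length constraint described above translate directly into a single character class with a bounded repeat. The pattern and the name `is_valid_batch_number` are assumptions for this sketch.

```python
import re

# Uppercase letters, digits, and hyphens only; 5 to 20 characters.
BATCH_PATTERN = re.compile(r"[A-Z0-9-]{5,20}")

def is_valid_batch_number(value: str) -> bool:
    """Return True if value satisfies the character set and length limits."""
    return BATCH_PATTERN.fullmatch(value) is not None

for candidate in ("LOT-2024-001", "ab123", "A1"):
    print(candidate, is_valid_batch_number(candidate))
```

As written, the sketch would also accept a value made entirely of hyphens; a stricter variant could require the first character to be a letter or digit.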

Valid Date String Format (event_date)

Validates that date string values match the expected format. Supports configurable formats including YYYY-MM-DD (ISO 8601), MM/DD/YYYY, DD/MM/YYYY, YYYY/MM/DD, and DD-Mon-YYYY. Validates month (01-12), day (01-31), and reasonable year ranges.
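One way to approximate a configurable date check is to map each named format to a `datetime.strptime` pattern. The mapping, the function name `is_valid_date`, and the default year range of 1900 to 2100 are all assumptions made for this sketch. Note that `strptime` also rejects impossible calendar dates such as 30 February, which is stricter than the day 01-31 check described above.

```python
from datetime import datetime

# Assumed mapping from the format names in the rule to strptime patterns.
STRPTIME_FORMATS = {
    "YYYY-MM-DD": "%Y-%m-%d",
    "MM/DD/YYYY": "%m/%d/%Y",
    "DD/MM/YYYY": "%d/%m/%Y",
    "YYYY/MM/DD": "%Y/%m/%d",
    "DD-Mon-YYYY": "%d-%b-%Y",
}

def is_valid_date(value: str, fmt: str = "YYYY-MM-DD",
                  min_year: int = 1900, max_year: int = 2100) -> bool:
    """Check value against one named format and a hypothetical year range."""
    pattern = STRPTIME_FORMATS[fmt]
    try:
        parsed = datetime.strptime(value, pattern)
    except ValueError:
        return False
    # strptime tolerates missing zero padding ("2023-7-5"); the round trip
    # rejects it. Comparison ignores case so "15-JUL-2023" still passes.
    if parsed.strftime(pattern).upper() != value.upper():
        return False
    return min_year <= parsed.year <= max_year

print(is_valid_date("07/15/2023", "MM/DD/YYYY"))
print(is_valid_date("15/07/2023", "MM/DD/YYYY"))  # month 15 does not exist
```

The `%b` month abbreviation is locale-dependent, so the DD-Mon-YYYY format in this sketch assumes English month names.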

Column Not Null

Asserts that a specified column contains no null values. This is the most fundamental completeness check — every row must have a value present in the target column.

Column Completeness Threshold

Asserts that a column meets a minimum completeness threshold, measured as the percentage of non-null values. Useful when some nulls are acceptable but the overall population rate must stay above a defined level (e.g., 95%).
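The threshold check reduces to a ratio of non-null values to total rows. The sketch below uses plain Python lists with `None` as the null marker; the function names and the choice to treat an empty column as fully complete are assumptions.

```python
def completeness(values) -> float:
    """Fraction of entries that are not None (1.0 for an empty column)."""
    values = list(values)
    if not values:
        # Assumption: an empty column has nothing missing, so it passes.
        return 1.0
    non_null = sum(1 for v in values if v is not None)
    return non_null / len(values)

def meets_threshold(values, threshold: float = 0.95) -> bool:
    """Return True if the completeness ratio is at least the threshold."""
    return completeness(values) >= threshold

column = ["A", None, "C", "D"]
print(completeness(column))          # 3 of 4 values are present
print(meets_threshold(column, 0.95))
```

Whether empty strings or whitespace-only values should also count as null depends on the dataset; this sketch counts only `None`.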

Column Unique

Validates that all non-null values in a specified column are unique. Useful for natural keys, email addresses, identifiers, and any column where duplicates indicate a data quality issue.
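A uniqueness check over non-null values can be sketched with a counter. The function names are hypothetical, and `None` is assumed to be the null marker, so repeated nulls are not reported as duplicates.

```python
from collections import Counter

def find_duplicates(values) -> list:
    """Return each non-null value that occurs more than once, in first-seen order."""
    counts = Counter(v for v in values if v is not None)
    return [value for value, count in counts.items() if count > 1]

def is_unique(values) -> bool:
    """Return True if no non-null value repeats."""
    return not find_duplicates(values)

batch_ids = ["LOT-001", "LOT-002", "LOT-001", None, None]
print(find_duplicates(batch_ids))
print(is_unique(batch_ids))
```

Returning the offending values, rather than only a pass or fail flag, makes it easier to locate the bad rows.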