CHANGELOG

v2.0.0 (2026-03-09)

Bug Fixes

  • pandas: Resolve indexing mismatch when filling empty error lists (ea89997)

  • sql: Ensure proper table identifier formatting and import structure (cb2de80)

Build System

  • deps: Update Python version to 3.10 and bump dependencies (9f9d32c)

Chores

  • Clean up pyproject.toml dependencies and extras (571aa3b)

  • Remove legacy monolithic engine (5c96bcc)

  • Remove legacy monolithic engine files (a21b217)

  • cleanup: Remove empty baseline and sinks modules (53fffbf)

  • tests: Remove outdated v1.3.0 test suite (d7b2f21)

Code Style

  • Apply consistent code formatting across all modules (7db9778)

  • Fix formatting and imports across sink modules (aac1e9f)

  • Fix import ordering and remove unused imports across codebase (01d6c59)

  • Format code with consistent imports and line breaks (cbff8e0)

  • Normalize quotes and formatting across codebase (bbd15a3)

Documentation

  • Update module docstrings and implement lazy engine loading (85aafaa)

  • init: Shorten top-level engine exports comment (30d5173)

Features

  • Add auto-profiler, schema validation, and engine capabilities (6a47750)

  • Add AWS Athena SQL-based validation engine (7a2b018)

  • Add CSV rule loader and SQL generators (32bee27)

  • Add DataFrame accessors to ValidationReport (5c426c5)

  • Add distributed Dask engine with lazy evaluation (976f2f3)

  • Add get_validation_sql helper to BigQuery and DuckDB engines (f304dc3)

  • Add OpenMetadata sink with SinkProtocol (fe4a940)

  • Add PyFlink streaming validation engine (e62d2d5)

  • Add PySpark engine with pure Column API (zero UDFs) (f488d47)

  • Add rule validation and metadata preservation (0722918)

  • Add Snowflake engine with pure SQL validation (5c5d416)

  • Add SQL generator for Flink validation queries (15df523)

  • Add SQLCore-based BigQuery engine (00b6b23)

  • Add Trino, Redshift, and Doris SQL engines (13a3d3e)

  • Complete analyzers with all date, comparison, and aggregation rules (02616fe)

  • Complete README rewrite and project restructure for v2.0 release (6ff4d39)

  • Expand PyFlink validation engine to full feature parity (c5840d4)

  • Implement DuckDB engine with SQLCore validation (8aef6a7)

  • duckdb: Add data bifurcation and standardize code formatting (b64bbc1)

  • exporters: Add IExporter protocol for pure metadata formatting (4866921)

  • init: Replace dynamic engine loading with explicit imports for better IDE support (cfc3859)

  • polars: Add Polars engine with optimized bulk error aggregation (6b337ef)

  • sumeh: V2.0 rewrite with analyzer/constraint architecture (cc05b17)

Performance Improvements

  • Optimize pandas and polars engines with bulk error aggregation (bba5d81)

Refactoring

  • Remove OOM-prone ID collection from analyzers (d5b6401)

  • Simplify Ray Data engine implementation (4e753d9)

  • Split CLI into modular command files (29253e3)

  • core: Consolidate protocols and remove dead code (ef6b43e)

  • core: Rename RuleDef to RuleDefinition for consistency (c54432c)

  • core: Reorganize core module structure for better maintainability (79a3298)

  • engines: Simplify engine package exports (deb9c45)

  • models: Remove duplicate SinkResult class (a8579bb)

  • polars: Use consistent timestamp and simplify error aggregation (c57fd5e)

  • profiler: Clean up DataProfiler with better docs and safety checks (3ba2565)

v1.3.0 (2025-10-16)

Code Style

  • Improve readability with standardized line breaks and spacing (a220071)

  • Standardize and clean up import ordering (99cc1ae)

Continuous Integration

  • Simplify Poetry install by using --all-extras in publish workflow (a8d59b3)

Features

  • Add full BigQuery table-level validation support and unify rule model across engines (af58f5c)

  • Enhance DuckDB detection and refactor Pandas date checks (39a64ca)

  • Enhance rule parsing and standardize rule usage in engines (945ae8b)

  • Refactor validation engine to use RuleDef model and fix ambiguity issues (808ff22)

  • Standardize aggregation checks and implement multi-level validation (bc56830)

  • Unify table-level validation engine interface across all backends (e1234fa)

  • core: Implement Dispatcher pattern for core modules (13a4349)

  • duckdb: Enhance validation dispatchers, add robust error handling & input checks (82c4274)

Refactoring

  • Clean up and organize imports across core modules and engines (7953b20)

  • Introduce RuleDef model and registry for configuration (90522e5)

  • Remove obsolete extract_params test and align test suite with current codebase (5bfc07b)

  • Standardize code formatting and improve error handling in BigQuery engine (20c994a)

  • Unify and modernize configuration dispatchers with clear, consistent API (8915815)

  • Unify date validation aliases across all engines for consistency (7becdd5)

  • cli: Migrate CLI implementation from argparse to Typer (e161748)

  • pyspark: Standardize validation functions and remove legacy logic (fc03b78)

v1.2.0 (2025-10-09)

Chores

  • deps: Update AWS, caching, and core dependencies (e6821e0)

Documentation

  • Show private members in MkDocs API documentation (3f447e5)

Features

  • bigquery: Implement native Data Quality validation and summarization (eeaf615)

  • bigquery: Rewrite validation to use 100% SQLGlot and improve docs (5ad0d7f)

v1.1.0 (2025-10-08)

Features

  • schema: Decouple schema extraction and improve validation output (852c36b)

Refactoring

  • core, duckdb: Minor cleanup and improved schema error formatting (cfbb695)

v1.0.1 (2025-10-08)

Bug Fixes

  • Correctly parse field lists and handle complex string inputs (66a5a39)

v1.0.0 (2025-10-08)

Bug Fixes

  • Sync version numbers with latest release tag (b65b420)

v1.0.0-rc.1 (2025-10-07)

Bug Fixes

  • engines: Correct inverse logic for comparison validation functions (64dd3da)

Build System

  • Update pyproject.toml with complete metadata (03f4fd2)

Continuous Integration

  • Adopt Trusted Publishers for PyPI deployment and refactor release flow (e507717)

  • Fix CI/CD deployment (420454f)

  • config: Add python-semantic-release configuration (95b3113)

  • workflow: Configure conditional PyPI publishing for releases (72f3bb6)

Documentation

  • Improve configuration examples and workflow clarity (bf09a5f)

  • Update documentation structure following module refactoring (f938382)

Features

  • Add Schema Validation feature and various data source support (4415c92)

  • Centralize schema definition using Schema Registry (53ee185)

  • Implement interactive Streamlit dashboard for validation results (7c9804a)

  • Introduce Databricks rule source and refine configuration methods (03ef55c)

  • ci: Major package refactoring, automate PyPI publishing, and enhance SQL connections (69bd9c7)

  • cli: Add SQL DDL generation for 8 database dialects (82ca12c)

  • dashboard: Rework Streamlit dashboard with advanced visuals and filters (64dd3da)

Refactoring

  • General code cleanup and API simplification (99368aa)

  • Make schema lookup flexible and enhance security checks (ee2b41d)

  • core, cli: Introduce core utility modules and prepare for 'validate' command (7911465)

  • core, config: Standardize config/schema API and enforce required parameters (75614bc)

v0.3.0 (2025-05-16)

Bug Fixes

  • dask_engine: Invert validation logic to flag non-compliant records (2a76fe7)

Code Style

  • Apply code formatting and cleanup across core and engine files (dd0a0ad)

  • Clean up whitespace and formatting in test files (aa47a97)

Documentation

  • Complete Pandas engine docstrings and enhance core module documentation (65d5110)

  • Enhance documentation and reorganize validation rules (0873ca3)

  • polars_engine: Add comprehensive docstrings for data quality functions (dcc93c5)

Features

  • Add 'is_in' and 'not_in' rule aliases to engines (b522cb0)

  • Add comprehensive date and numeric validation functions to pandas engine (c23d513)

  • Add date and numeric validation functions to Polars engine (2675e84)

  • Improve date and numeric validation rules in DuckDB engine (6de088c)

  • dask: Implement numeric threshold and detailed date/weekday validation rules (522d332)

  • duckdb: Implement numeric threshold and detailed date/weekday validation rules (8112e96)

  • pandas: Add new engine for Pandas DataFrames with comprehensive rule support (d31234b)

  • pyspark: Implement numeric threshold and detailed date validation rules (073c0ad)

v0.2.6 (2025-05-16)

Documentation

  • Add docstrings for date validation rules in Dask and DuckDB engines (42cb80a)

Features

  • dask: Implement date validation rules and add dedicated tests (0a719d4)

  • duckdb: Implement date and additional validation rules (d45afc8)

  • polars: Implement multiple validation rules and enhance documentation (fc83ae6)

v0.2.5 (2025-05-16)

Documentation

  • Update README with logo path and completed tasks (85fcc94)

v0.2.4 (2025-05-16)

Features

  • Add quickstart guide and list supported validation rules (d307c85)

v0.2.0 (2025-04-29)

  • Initial Release