Skip to content

SQL Core

sumeh.engines.sql_core

SQL Core Engine - SQL-based validation for databases.

Translates validation rules into optimized SQL queries using SQLGlot. Supports multiple dialects: duckdb, postgres, spark, bigquery, etc.

Example

from sumeh import sql_core

rules = [ ... {"check_type": "is_complete", "field": "email"}, ... {"check_type": "has_mean", "field": "salary", "value": 50000} ... ]

sql, rule_ids = sql_core.compile_rules_to_sql( ... rules=rules, ... table_name="users", ... dialect="duckdb", ... global_filter="date = '2024-01-15'" ... ) print(sql)

Execute with your own DB client

result = conn.execute(sql).fetchone() report = sql_core.validate_results(result, rule_ids, rules)

compile_rules_to_sql

compile_rules_to_sql(
    rules: List,
    table_name: str,
    dialect: str = None,
    global_filter: Optional[str] = None,
) -> Tuple[str, List[str]]

Compiles rules into a single optimized SQL query.

Parameters:

Name Type Description Default
rules List

List of RuleDef objects

required
table_name str

Target table name

required
dialect str

Target SQL dialect (duckdb, spark, postgres, bigquery)

None
global_filter Optional[str]

Optional SQL WHERE clause (e.g., "date = '2024-01-01'")

None

Returns:

Type Description
Tuple[str, List[str]]

(sql_string, list_of_rule_ids_in_order)

validate_results

validate_results(
    metrics_row: Tuple[Any, ...],
    rule_ids: List[str],
    rules: List[RuleDefinition],
    total_rows: int = 0,
) -> List[ValidationResult]

Maps the raw metrics row from SQL execution back to Rule Constraints. Wraps raw SQL values into MetricResult objects to satisfy Constraint interface.