SQL Core¶
sumeh.engines.sql_core ¶
SQL Core Engine - SQL-based validation for databases.
Translates validation rules into optimized SQL queries using SQLGlot. Supports multiple dialects: duckdb, postgres, spark, bigquery, etc.
Example
from sumeh import sql_core
rules = [ ... {"check_type": "is_complete", "field": "email"}, ... {"check_type": "has_mean", "field": "salary", "value": 50000} ... ]
sql, rule_ids = sql_core.compile_rules_to_sql( ... rules=rules, ... table_name="users", ... dialect="duckdb", ... global_filter="date = '2024-01-15'" ... ) print(sql)
Execute with your own DB client¶
result = conn.execute(sql).fetchone() report = sql_core.validate_results(result, rule_ids, rules)
compile_rules_to_sql ¶
compile_rules_to_sql(
rules: List,
table_name: str,
dialect: str = None,
global_filter: Optional[str] = None,
) -> Tuple[str, List[str]]
Compiles rules into a single optimized SQL query.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
rules
|
List
|
List of RuleDef objects |
required |
table_name
|
str
|
Target table name |
required |
dialect
|
str
|
Target SQL dialect (duckdb, spark, postgres, bigquery) |
None
|
global_filter
|
Optional[str]
|
Optional SQL WHERE clause (e.g., "date = '2024-01-01'") |
None
|
Returns:
| Type | Description |
|---|---|
Tuple[str, List[str]]
|
(sql_string, list_of_rule_ids_in_order) |
validate_results ¶
validate_results(
metrics_row: Tuple[Any, ...],
rule_ids: List[str],
rules: List[RuleDefinition],
total_rows: int = 0,
) -> List[ValidationResult]
Maps the raw metrics row from SQL execution back to Rule Constraints. Wraps raw SQL values into MetricResult objects to satisfy Constraint interface.