quality_monitors

Creates, updates, deletes, gets or lists a quality_monitors resource.

Overview

Name: quality_monitors
Type: Resource
Id: databricks_workspace.catalog.quality_monitors

Fields

The following fields are returned by SELECT queries:

| Name | Datatype | Description |
|---|---|---|
| dashboard_id | string | [Create:ERR Update:OPT] ID of the dashboard that visualizes the computed metrics. This can be empty if the monitor is in PENDING state. |
| baseline_table_name | string | [Create:OPT Update:OPT] Baseline table name. Baseline data is used to compute drift from the data in the monitored `table_name`. The baseline table and the monitored table must have the same schema. |
| drift_metrics_table_name | string | [Create:ERR Update:IGN] Table that stores drift metrics data. Format: `catalog.schema.table_name`. |
| output_schema_name | string | |
| profile_metrics_table_name | string | [Create:ERR Update:IGN] Table that stores profile metrics data. Format: `catalog.schema.table_name`. |
| table_name | string | [Create:ERR Update:IGN] UC table to monitor. Format: `catalog.schema.table_name`. |
| assets_dir | string | [Create:REQ Update:IGN] Absolute path to a custom directory for storing data-monitoring assets. Normally prepopulated with a default user location by the UI and Python APIs. |
| custom_metrics | array | [Create:OPT Update:OPT] Custom metrics. |
| data_classification_config | object | [Create:OPT Update:OPT] Data classification related config. |
| inference_log | object | |
| latest_monitor_failure_msg | string | [Create:ERR Update:IGN] The latest error message for a monitor failure. |
| monitor_version | integer | [Create:ERR Update:IGN] The current monitor configuration version, expressed as an integer (1, 2, 3, ...). Negative values are possible and can indicate a corrupted monitor_version. |
| notifications | object | [Create:OPT Update:OPT] Notification settings. |
| schedule | object | [Create:OPT Update:OPT] The monitor schedule. |
| slicing_exprs | array | [Create:OPT Update:OPT] List of column expressions to slice data with for targeted analysis. The data is grouped by each expression independently, resulting in a separate slice for each predicate and its complements. For example, `slicing_exprs=["col_1", "col_2 > 10"]` generates the following slices: two slices for `col_2 > 10` (True and False), and one slice per unique value in `col_1`. For high-cardinality columns, only the top 100 unique values by frequency generate slices. |
| snapshot | object | Configuration for monitoring snapshot tables. |
| status | string | [Create:ERR Update:IGN] The monitor status. (MONITOR_STATUS_ACTIVE, MONITOR_STATUS_DELETE_PENDING, MONITOR_STATUS_ERROR, MONITOR_STATUS_FAILED, MONITOR_STATUS_PENDING) |
| time_series | object | Configuration for monitoring time series tables. |
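
The slicing behavior described for `slicing_exprs` can be illustrated with a small Python sketch. This is only a model of the grouping semantics stated in the field description, not how the service computes slices; the sample rows and the helper name `slices_for_expr` are hypothetical, and each expression is represented here by a Python callable rather than parsed from SQL.

```python
from collections import Counter

# From the field description: only the top 100 values by frequency get a slice.
MAX_SLICES_PER_EXPR = 100


def slices_for_expr(rows, evaluate):
    """Group rows by the value that one slicing expression takes on each row."""
    counts = Counter(evaluate(row) for row in rows)
    kept = [value for value, _ in counts.most_common(MAX_SLICES_PER_EXPR)]
    return {value: [row for row in rows if evaluate(row) == value] for value in kept}


# Hypothetical sample data for the monitored table.
rows = [
    {"col_1": "a", "col_2": 5},
    {"col_1": "b", "col_2": 12},
    {"col_1": "a", "col_2": 30},
]

# Each expression is applied independently of the others.
slicing_exprs = {
    "col_1": lambda row: row["col_1"],           # one slice per unique value
    "col_2 > 10": lambda row: row["col_2"] > 10,  # True slice and False slice
}

slices = {expr: slices_for_expr(rows, fn) for expr, fn in slicing_exprs.items()}

for expr, groups in slices.items():
    for value, members in groups.items():
        print(f"{expr} = {value!r}: {len(members)} row(s)")
```

Because each expression is grouped on its own, the slices are not combined across expressions: a row appears once under `col_1` and once under `col_2 > 10`.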

Methods

The following methods are available for this resource:

| Name | Accessible by | Required Params | Optional Params | Description |
|---|---|---|---|---|
| get | select | table_name, deployment_name | | [DEPRECATED] Gets a monitor for the specified table. Use the Data Quality Monitors API instead. |
| create | insert | table_name, deployment_name, output_schema_name, assets_dir | | [DEPRECATED] Creates a new monitor for the specified table. Use the Data Quality Monitors API instead. |
| update | replace | table_name, deployment_name, output_schema_name | | [DEPRECATED] Updates a monitor for the specified table. Use the Data Quality Monitors API instead. |
| delete | delete | table_name, deployment_name | | [DEPRECATED] Deletes a monitor for the specified table. Use the Data Quality Monitors API instead. |
| cancel_refresh | exec | table_name, refresh_id, deployment_name | | [DEPRECATED] Cancels an already-initiated refresh job. Use the Data Quality Monitors API instead. |
| regenerate_dashboard | exec | table_name, deployment_name | | [DEPRECATED] Regenerates the monitoring dashboard for the specified table. Use the Data Quality Monitors API instead. |

Parameters

Parameters can be passed in the WHERE clause of a query. Check the Methods section to see which parameters are required or optional for each operation.

| Name | Datatype | Description |
|---|---|---|
| deployment_name | string | The Databricks Workspace Deployment Name (default: dbc-abcd0123-a1bc) |
| refresh_id | integer | ID of the refresh to cancel (required by cancel_refresh). |
| table_name | string | UC table name in the format catalog.schema.table_name. This field corresponds to the {full_table_name_arg} arg in the endpoint path. |
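
The examples on this page use `{{ name }}` placeholders for parameter values. As a minimal sketch, the following Python helper fills those placeholders and checks the required parameters listed in the Methods section before a query is sent. The helper `render_query`, the shortened SELECT template, and the table name `main.sales.orders` are hypothetical and not part of the provider; the quote escaping is a basic precaution, not a complete defense against SQL injection.

```python
import re

# Required parameters per method, copied from the Methods table.
REQUIRED_PARAMS = {
    "get": {"table_name", "deployment_name"},
    "create": {"table_name", "deployment_name", "output_schema_name", "assets_dir"},
    "update": {"table_name", "deployment_name", "output_schema_name"},
    "delete": {"table_name", "deployment_name"},
}

# Matches placeholders such as {{ table_name }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_query(method, template, params):
    """Fill {{ name }} placeholders after checking the method's required params."""
    missing = REQUIRED_PARAMS[method] - set(params)
    if missing:
        raise ValueError(f"missing required parameters: {sorted(missing)}")

    def substitute(match):
        # Raises KeyError if the template names a parameter that was not supplied.
        value = params[match.group(1)]
        # Double single quotes so a value cannot end the SQL string literal early.
        return str(value).replace("'", "''")

    return PLACEHOLDER.sub(substitute, template)


SELECT_TEMPLATE = """SELECT status, monitor_version
FROM databricks_workspace.catalog.quality_monitors
WHERE table_name = '{{ table_name }}'
AND deployment_name = '{{ deployment_name }}';"""

query = render_query(
    "get",
    SELECT_TEMPLATE,
    {"table_name": "main.sales.orders", "deployment_name": "dbc-abcd0123-a1bc"},
)
print(query)
```

Checking required parameters up front gives a clear local error instead of a failed request when, for example, `deployment_name` is left out.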

SELECT examples

[DEPRECATED] Gets a monitor for the specified table. Use the Data Quality Monitors API instead.

SELECT
dashboard_id,
baseline_table_name,
drift_metrics_table_name,
output_schema_name,
profile_metrics_table_name,
table_name,
assets_dir,
custom_metrics,
data_classification_config,
inference_log,
latest_monitor_failure_msg,
monitor_version,
notifications,
schedule,
slicing_exprs,
snapshot,
status,
time_series
FROM databricks_workspace.catalog.quality_monitors
WHERE table_name = '{{ table_name }}' -- required
AND deployment_name = '{{ deployment_name }}' -- required
;

INSERT examples

[DEPRECATED] Creates a new monitor for the specified table. Use the Data Quality Monitors API instead.

INSERT INTO databricks_workspace.catalog.quality_monitors (
output_schema_name,
assets_dir,
baseline_table_name,
custom_metrics,
data_classification_config,
inference_log,
latest_monitor_failure_msg,
notifications,
schedule,
skip_builtin_dashboard,
slicing_exprs,
snapshot,
time_series,
warehouse_id,
table_name,
deployment_name
)
SELECT
'{{ output_schema_name }}' /* required */,
'{{ assets_dir }}' /* required */,
'{{ baseline_table_name }}',
'{{ custom_metrics }}',
'{{ data_classification_config }}',
'{{ inference_log }}',
'{{ latest_monitor_failure_msg }}',
'{{ notifications }}',
'{{ schedule }}',
{{ skip_builtin_dashboard }},
'{{ slicing_exprs }}',
'{{ snapshot }}',
'{{ time_series }}',
'{{ warehouse_id }}',
'{{ table_name }}',
'{{ deployment_name }}'
RETURNING
dashboard_id,
baseline_table_name,
drift_metrics_table_name,
output_schema_name,
profile_metrics_table_name,
table_name,
assets_dir,
custom_metrics,
data_classification_config,
inference_log,
latest_monitor_failure_msg,
monitor_version,
notifications,
schedule,
slicing_exprs,
snapshot,
status,
time_series
;

REPLACE examples

[DEPRECATED] Updates a monitor for the specified table. Use the Data Quality Monitors API instead.

REPLACE databricks_workspace.catalog.quality_monitors
SET
output_schema_name = '{{ output_schema_name }}',
baseline_table_name = '{{ baseline_table_name }}',
custom_metrics = '{{ custom_metrics }}',
dashboard_id = '{{ dashboard_id }}',
data_classification_config = '{{ data_classification_config }}',
inference_log = '{{ inference_log }}',
latest_monitor_failure_msg = '{{ latest_monitor_failure_msg }}',
notifications = '{{ notifications }}',
schedule = '{{ schedule }}',
slicing_exprs = '{{ slicing_exprs }}',
snapshot = '{{ snapshot }}',
time_series = '{{ time_series }}'
WHERE
table_name = '{{ table_name }}' -- required
AND deployment_name = '{{ deployment_name }}' -- required
AND output_schema_name = '{{ output_schema_name }}' -- required
RETURNING
dashboard_id,
baseline_table_name,
drift_metrics_table_name,
output_schema_name,
profile_metrics_table_name,
table_name,
assets_dir,
custom_metrics,
data_classification_config,
inference_log,
latest_monitor_failure_msg,
monitor_version,
notifications,
schedule,
slicing_exprs,
snapshot,
status,
time_series;

DELETE examples

[DEPRECATED] Deletes a monitor for the specified table. Use the Data Quality Monitors API instead.

DELETE FROM databricks_workspace.catalog.quality_monitors
WHERE table_name = '{{ table_name }}' -- required
AND deployment_name = '{{ deployment_name }}' -- required
;

Lifecycle Methods

[DEPRECATED] Cancels an already-initiated refresh job. Use the Data Quality Monitors API instead.

EXEC databricks_workspace.catalog.quality_monitors.cancel_refresh
@table_name='{{ table_name }}', -- required
@refresh_id='{{ refresh_id }}', -- required
@deployment_name='{{ deployment_name }}' -- required
;
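
The Methods section also lists `regenerate_dashboard` as an `exec` method, with `table_name` and `deployment_name` as required parameters. The Python sketch below builds an EXEC statement for either lifecycle method. It assumes `regenerate_dashboard` takes the same statement shape as the `cancel_refresh` example, which this page does not show explicitly; the helper `build_exec` and the sample values are hypothetical.

```python
# Required parameters per lifecycle method, copied from the Methods table.
LIFECYCLE_PARAMS = {
    "cancel_refresh": ("table_name", "refresh_id", "deployment_name"),
    "regenerate_dashboard": ("table_name", "deployment_name"),
}

RESOURCE = "databricks_workspace.catalog.quality_monitors"


def build_exec(method, **params):
    """Build an EXEC statement shaped like the cancel_refresh example."""
    required = LIFECYCLE_PARAMS[method]
    missing = [name for name in required if name not in params]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")

    args = []
    for name in required:
        # Double single quotes so a value cannot end the string literal early.
        value = str(params[name]).replace("'", "''")
        args.append(f"@{name}='{value}'")

    # Commas separate the arguments; no comma follows the last one.
    return f"EXEC {RESOURCE}.{method}\n" + ",\n".join(args) + "\n;"


statement = build_exec(
    "regenerate_dashboard",
    table_name="main.sales.orders",        # hypothetical table
    deployment_name="dbc-abcd0123-a1bc",   # default shown in the Parameters section
)
print(statement)
```

Generating the argument list in code keeps each comma outside any trailing comment, so it is not swallowed by `--` and the statement parses as intended.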