data_quality
Creates, updates, deletes, gets or lists a data_quality resource.
Overview
| Name | data_quality |
| Type | Resource |
| Id | databricks_workspace.dataquality.data_quality |
Fields
The following fields are returned by SELECT queries:
- get
- list
| Name | Datatype | Description |
|---|---|---|
| object_id | string | The UUID of the request object. It is `schema_id` for `schema`, and `table_id` for `table`. Find the `schema_id` from either: (1) the [schema_id](https://docs.databricks.com/api/workspace/schemas/get#schema_id) of the `Schemas` resource, or (2) in [Catalog Explorer](https://docs.databricks.com/aws/en/catalog-explorer/) > select the `schema` > go to the `Details` tab > the `Schema ID` field. Find the `table_id` from either: (1) the [table_id](https://docs.databricks.com/api/workspace/tables/get#table_id) of the `Tables` resource, or (2) in [Catalog Explorer](https://docs.databricks.com/aws/en/catalog-explorer/) > select the `table` > go to the `Details` tab > the `Table ID` field. |
| anomaly_detection_config | object | Anomaly Detection Configuration, applicable to `schema` object types. |
| data_profiling_config | object | Data Profiling Configuration, applicable to `table` object types. Exactly one `Analysis Configuration` must be present. |
| object_type | string | The type of the monitored object. Can be one of the following: `schema` or `table`. |
Methods
The following methods are available for this resource:
| Name | Accessible by | Required Params | Optional Params | Description |
|---|---|---|---|---|
| get | select | object_type, object_id, deployment_name | | Read a data quality monitor on a Unity Catalog object. |
| list | select | deployment_name | page_size, page_token | (Unimplemented) List data quality monitors. |
| create | insert | deployment_name, monitor | | Create a data quality monitor on a Unity Catalog object. The caller must provide either |
| update | update | object_type, object_id, update_mask, deployment_name, monitor | | Update a data quality monitor on a Unity Catalog object. |
| delete | delete | object_type, object_id, deployment_name | | Delete a data quality monitor on a Unity Catalog object. |
Parameters
Parameters can be passed in the WHERE clause of a query. Check the Methods section to see which parameters are required or optional for each operation.
| Name | Datatype | Description |
|---|---|---|
| deployment_name | string | The Databricks Workspace Deployment Name (default: dbc-abcd0123-a1bc) |
| object_id | string | The UUID of the request object. It is `schema_id` for `schema`, and `table_id` for `table`. Find the `schema_id` from either: (1) the [schema_id](https://docs.databricks.com/api/workspace/schemas/get#schema_id) of the `Schemas` resource, or (2) in [Catalog Explorer](https://docs.databricks.com/aws/en/catalog-explorer/) > select the `schema` > go to the `Details` tab > the `Schema ID` field. Find the `table_id` from either: (1) the [table_id](https://docs.databricks.com/api/workspace/tables/get#table_id) of the `Tables` resource, or (2) in [Catalog Explorer](https://docs.databricks.com/aws/en/catalog-explorer/) > select the `table` > go to the `Details` tab > the `Table ID` field. |
| object_type | string | The type of the monitored object. Can be one of the following: `schema` or `table`. |
| update_mask | string | The field mask to specify which fields to update as a comma-separated list. Example value: `data_profiling_config.custom_metrics,data_profiling_config.schedule.quartz_cron_expression` |
| page_size | integer | |
| page_token | string | |
SELECT examples
- get
- list
Read a data quality monitor on a Unity Catalog object.
SELECT
object_id,
anomaly_detection_config,
data_profiling_config,
object_type
FROM databricks_workspace.dataquality.data_quality
WHERE object_type = '{{ object_type }}' -- required
AND object_id = '{{ object_id }}' -- required
AND deployment_name = '{{ deployment_name }}' -- required
;
(Unimplemented) List data quality monitors.
SELECT
object_id,
anomaly_detection_config,
data_profiling_config,
object_type
FROM databricks_workspace.dataquality.data_quality
WHERE deployment_name = '{{ deployment_name }}' -- required
AND page_size = '{{ page_size }}'
AND page_token = '{{ page_token }}'
;
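The `get` query in this section is a template whose `{{ ... }}` placeholders must be filled in before it is run. As a minimal sketch of that step (the function name `render_get_query` and the validation rules are illustrative assumptions, not part of the provider), the following Python checks that `object_type` is one of the two documented values and that `object_id` parses as a UUID, then renders the statement:

```python
import uuid

# Documented values for object_type.
ALLOWED_OBJECT_TYPES = {"schema", "table"}

GET_TEMPLATE = """SELECT
object_id,
anomaly_detection_config,
data_profiling_config,
object_type
FROM databricks_workspace.dataquality.data_quality
WHERE object_type = '{object_type}'
AND object_id = '{object_id}'
AND deployment_name = '{deployment_name}'
;"""


def render_get_query(object_type: str, object_id: str, deployment_name: str) -> str:
    """Hypothetical helper: validate inputs and fill in the get template."""
    if object_type not in ALLOWED_OBJECT_TYPES:
        raise ValueError("object_type must be 'schema' or 'table'")
    # uuid.UUID raises ValueError on malformed input; str() yields the
    # canonical lowercase, hyphenated form.
    canonical_id = str(uuid.UUID(object_id))
    # Assumption: refuse quotes rather than attempt to escape them.
    if "'" in deployment_name:
        raise ValueError("deployment_name must not contain a single quote")
    return GET_TEMPLATE.format(
        object_type=object_type,
        object_id=canonical_id,
        deployment_name=deployment_name,
    )
```

Rejecting unexpected input before rendering is a deliberately conservative choice for a sketch; a real client may prefer whatever parameter binding its query runner offers.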
INSERT examples
- create
- Manifest
Create a data quality monitor on a Unity Catalog object. The caller must provide either
INSERT INTO databricks_workspace.dataquality.data_quality (
monitor,
deployment_name
)
SELECT
'{{ monitor }}' /* required */,
'{{ deployment_name }}' /* required */
RETURNING
object_id,
anomaly_detection_config,
data_profiling_config,
object_type
;
# Description fields are for documentation purposes
- name: data_quality
  props:
    - name: deployment_name
      value: "{{ deployment_name }}"
      description: Required parameter for the data_quality resource.
    - name: monitor
      description: |
        The monitor to create.
      value:
        object_type: "{{ object_type }}"
        object_id: "{{ object_id }}"
        anomaly_detection_config:
          excluded_table_full_names:
            - "{{ excluded_table_full_names }}"
        data_profiling_config:
          output_schema_id: "{{ output_schema_id }}"
          assets_dir: "{{ assets_dir }}"
          baseline_table_name: "{{ baseline_table_name }}"
          custom_metrics:
            - name: "{{ name }}"
              definition: "{{ definition }}"
              input_columns: "{{ input_columns }}"
              output_data_type: "{{ output_data_type }}"
              type: "{{ type }}"
          dashboard_id: "{{ dashboard_id }}"
          drift_metrics_table_name: "{{ drift_metrics_table_name }}"
          effective_warehouse_id: "{{ effective_warehouse_id }}"
          inference_log:
            problem_type: "{{ problem_type }}"
            timestamp_column: "{{ timestamp_column }}"
            granularities:
              - "{{ granularities }}"
            prediction_column: "{{ prediction_column }}"
            model_id_column: "{{ model_id_column }}"
            label_column: "{{ label_column }}"
          latest_monitor_failure_message: "{{ latest_monitor_failure_message }}"
          monitor_version: {{ monitor_version }}
          monitored_table_name: "{{ monitored_table_name }}"
          notification_settings:
            on_failure:
              email_addresses:
                - "{{ email_addresses }}"
          profile_metrics_table_name: "{{ profile_metrics_table_name }}"
          schedule:
            quartz_cron_expression: "{{ quartz_cron_expression }}"
            timezone_id: "{{ timezone_id }}"
            pause_status: "{{ pause_status }}"
          skip_builtin_dashboard: {{ skip_builtin_dashboard }}
          slicing_exprs:
            - "{{ slicing_exprs }}"
          snapshot: "{{ snapshot }}"
          status: "{{ status }}"
          time_series:
            timestamp_column: "{{ timestamp_column }}"
            granularities:
              - "{{ granularities }}"
          warehouse_id: "{{ warehouse_id }}"
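The `monitor` value is a nested object, and the field descriptions note that `data_profiling_config` applies to `table` objects and must carry exactly one analysis configuration. The sketch below shows one way to check those constraints before serializing the object to a JSON string for the `{{ monitor }}` placeholder. It assumes that the analysis configurations are the `inference_log`, `snapshot`, and `time_series` keys from the manifest; the function name and all example values are hypothetical.

```python
import json

# Assumption: these three keys of data_profiling_config are the mutually
# exclusive analysis configurations referred to in the field description.
ANALYSIS_KEYS = ("inference_log", "snapshot", "time_series")


def serialize_monitor(monitor: dict) -> str:
    """Hypothetical helper: validate a monitor dict, return compact JSON."""
    object_type = monitor.get("object_type")
    if object_type not in ("schema", "table"):
        raise ValueError("object_type must be 'schema' or 'table'")

    if "anomaly_detection_config" in monitor and object_type != "schema":
        raise ValueError("anomaly_detection_config applies to schema objects")

    profiling = monitor.get("data_profiling_config")
    if profiling is not None:
        if object_type != "table":
            raise ValueError("data_profiling_config applies to table objects")
        present = [key for key in ANALYSIS_KEYS if key in profiling]
        if len(present) != 1:
            raise ValueError(
                "expected exactly one analysis configuration, found: %s" % present
            )

    return json.dumps(monitor, separators=(",", ":"))


# Placeholder values for illustration only; none come from a real workspace.
example_monitor = {
    "object_type": "table",
    "object_id": "123e4567-e89b-12d3-a456-426614174000",
    "data_profiling_config": {
        "output_schema_id": "00000000-0000-0000-0000-000000000000",
        "time_series": {
            "timestamp_column": "event_ts",
            "granularities": ["1 day"],
        },
    },
}
monitor_json = serialize_monitor(example_monitor)
```

The resulting `monitor_json` string is what would be substituted for `{{ monitor }}` in the INSERT statement; whether the service accepts these particular granularity strings is not confirmed by this page.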
UPDATE examples
- update
Update a data quality monitor on a Unity Catalog object.
UPDATE databricks_workspace.dataquality.data_quality
SET
monitor = '{{ monitor }}'
WHERE
object_type = '{{ object_type }}' --required
AND object_id = '{{ object_id }}' --required
AND update_mask = '{{ update_mask }}' --required
AND deployment_name = '{{ deployment_name }}' --required
AND monitor = '{{ monitor }}' --required
RETURNING
object_id,
anomaly_detection_config,
data_profiling_config,
object_type;
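Because `update_mask` is a comma-separated list of dotted field paths, it can be derived from the same set of changes that populates `monitor`, which keeps the two from drifting apart. The following is a rough sketch of that idea; the helper name `build_update` is hypothetical, and it is an assumption (not stated on this page) that the `monitor` payload only needs to contain the masked fields.

```python
import json
from typing import Any, Dict, Tuple


def build_update(changes: Dict[str, Any]) -> Tuple[str, str]:
    """Hypothetical helper: turn {dotted.path: value} into (update_mask, monitor JSON)."""
    monitor: Dict[str, Any] = {}
    for path, value in changes.items():
        node = monitor
        *parents, leaf = path.split(".")
        # Walk or create the intermediate objects for each parent key.
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    # Dict iteration preserves insertion order, so the mask lists the
    # paths in the order they were supplied.
    update_mask = ",".join(changes)
    return update_mask, json.dumps(monitor)


# Placeholder values for illustration only.
update_mask, monitor_json = build_update(
    {
        "data_profiling_config.schedule.quartz_cron_expression": "0 0 12 * * ?",
        "data_profiling_config.schedule.timezone_id": "UTC",
    }
)
```

Here `update_mask` and `monitor_json` would fill the `{{ update_mask }}` and `{{ monitor }}` placeholders of the UPDATE statement respectively.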
DELETE examples
- delete
Delete a data quality monitor on a Unity Catalog object.
DELETE FROM databricks_workspace.dataquality.data_quality
WHERE object_type = '{{ object_type }}' --required
AND object_id = '{{ object_id }}' --required
AND deployment_name = '{{ deployment_name }}' --required
;