clusters
Creates, updates, deletes, gets or lists a clusters resource.
Overview
Name | clusters |
---|---|
Type | Resource |
Id | databricks_workspace.compute.clusters |
Fields
The following fields are returned by SELECT queries. The list and get methods return the same columns, except that get additionally includes spec.

Fields returned by list:
Name | Datatype | Description |
---|---|---|
cluster_id | string | |
driver_node_type_id | string | |
node_type_id | string | |
spark_context_id | integer | |
cluster_name | string | |
creator_user_name | string | |
autotermination_minutes | integer | |
aws_attributes | object | |
cluster_source | string | |
default_tags | object | |
disk_spec | object | |
driver_instance_source | object | |
enable_elastic_disk | boolean | |
enable_local_disk_encryption | boolean | |
init_scripts_safe_mode | boolean | |
instance_source | object | |
last_state_loss_time | integer | |
num_workers | integer | |
spark_version | string | |
start_time | integer | |
state | string | |
state_message | string | |
terminated_time | integer | |
termination_reason | object | |

Fields returned by get:

Name | Datatype | Description |
---|---|---|
cluster_id | string | |
driver_node_type_id | string | |
node_type_id | string | |
spark_context_id | integer | |
cluster_name | string | |
creator_user_name | string | |
autotermination_minutes | integer | |
aws_attributes | object | |
cluster_source | string | |
default_tags | object | |
disk_spec | object | |
driver_instance_source | object | |
enable_elastic_disk | boolean | |
enable_local_disk_encryption | boolean | |
init_scripts_safe_mode | boolean | |
instance_source | object | |
last_state_loss_time | integer | |
num_workers | integer | |
spark_version | string | |
spec | object | |
start_time | integer | |
state | string | |
state_message | string | |
terminated_time | integer | |
termination_reason | object | |
Methods
The following methods are available for this resource:
Name | Accessible by | Required Params | Optional Params | Description |
---|---|---|---|---|
list | select | deployment_name |  | Return information about all pinned and active clusters, and all clusters terminated within the last 30 days. Clusters terminated prior to this period are not included. |
get | select | deployment_name |  | Retrieves the information for a cluster given its identifier. Clusters can be described while they are running, or up to 60 days after they are terminated. |
create | insert | deployment_name |  | Creates a new Spark cluster. This method will acquire new instances from the cloud provider if necessary. This method is asynchronous; the returned cluster_id can be used to poll the cluster status. |
update | update | deployment_name |  | Updates the configuration of a cluster to match the partial set of attributes and size. Denote which fields to update using the update_mask field. |
edit | replace | deployment_name |  | Updates the configuration of a cluster to match the provided attributes and size. A cluster can be updated if it is in a RUNNING or TERMINATED state. |
delete | delete | deployment_name |  | Terminates the Spark cluster with the specified ID. The cluster is removed asynchronously. Once the termination has completed, the cluster will be in a TERMINATED state. |
changeowner | exec | deployment_name |  | Change the owner of the cluster. You must be an admin and the cluster must be terminated to perform this operation. The service principal application ID can be supplied as an argument to owner_username. |
permanentdelete | exec | deployment_name |  | Permanently deletes a Spark cluster. This cluster is terminated and resources are asynchronously removed. |
pin | exec | deployment_name |  | Pinning a cluster ensures that the cluster will always be returned by the ListClusters API. Pinning a cluster that is already pinned will have no effect. This API can only be called by workspace admins. |
resize | exec | deployment_name |  | Resizes a cluster to have a desired number of workers. This will fail unless the cluster is in a RUNNING state. |
restart | exec | deployment_name |  | Restarts a Spark cluster with the supplied ID. If the cluster is not currently in a RUNNING state, nothing will happen. |
start | exec | deployment_name |  | Starts a terminated Spark cluster with the supplied ID. This works similar to createCluster, except that the previous cluster ID and attributes are preserved. |
unpin | exec | deployment_name |  | Unpinning a cluster will allow the cluster to eventually be removed from the ListClusters API. Unpinning a cluster that is not pinned will have no effect. This API can only be called by workspace admins. |
Parameters
Parameters can be passed in the WHERE clause of a query. Check the Methods section to see which parameters are required or optional for each operation.
Name | Datatype | Description |
---|---|---|
deployment_name | string | The Databricks Workspace Deployment Name (default: dbc-abcd0123-a1bc) |
SELECT examples

list

Return information about all pinned and active clusters, and all clusters terminated within the last 30 days. Clusters terminated prior to this period are not included.

```sql
SELECT
cluster_id,
driver_node_type_id,
node_type_id,
spark_context_id,
cluster_name,
creator_user_name,
autotermination_minutes,
aws_attributes,
cluster_source,
default_tags,
disk_spec,
driver_instance_source,
enable_elastic_disk,
enable_local_disk_encryption,
init_scripts_safe_mode,
instance_source,
last_state_loss_time,
num_workers,
spark_version,
start_time,
state,
state_message,
terminated_time,
termination_reason
FROM databricks_workspace.compute.clusters
WHERE deployment_name = '{{ deployment_name }}'; -- required
```
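WHERE predicates that are not provider parameters are applied as client-side filters over the returned rows, so the list can be narrowed further. A minimal sketch, assuming client-side filtering on the state column (RUNNING is one of the API's lifecycle states):

```sql
-- List only clusters currently in the RUNNING state
SELECT
cluster_id,
cluster_name,
state
FROM databricks_workspace.compute.clusters
WHERE deployment_name = '{{ deployment_name }}' -- required
AND state = 'RUNNING';
```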
get

Retrieves the information for a cluster given its identifier. Clusters can be described while they are running, or up to 60 days after they are terminated.

```sql
SELECT
cluster_id,
driver_node_type_id,
node_type_id,
spark_context_id,
cluster_name,
creator_user_name,
autotermination_minutes,
aws_attributes,
cluster_source,
default_tags,
disk_spec,
driver_instance_source,
enable_elastic_disk,
enable_local_disk_encryption,
init_scripts_safe_mode,
instance_source,
last_state_loss_time,
num_workers,
spark_version,
spec,
start_time,
state,
state_message,
terminated_time,
termination_reason
FROM databricks_workspace.compute.clusters
WHERE deployment_name = '{{ deployment_name }}'; -- required
```
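The underlying Databricks API identifies a single cluster by its ID, so a get is typically constrained to one cluster as well. A minimal sketch, with an illustrative cluster ID:

```sql
SELECT
cluster_id,
cluster_name,
state,
spec
FROM databricks_workspace.compute.clusters
WHERE deployment_name = '{{ deployment_name }}' -- required
AND cluster_id = '1234-567890-abcde123'; -- illustrative cluster ID
```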
INSERT examples

create

Creates a new Spark cluster. This method will acquire new instances from the cloud provider if necessary. This method is asynchronous; the returned cluster_id can be used to poll the cluster status.

```sql
INSERT INTO databricks_workspace.compute.clusters (
data__num_workers,
data__kind,
data__cluster_name,
data__spark_version,
data__use_ml_runtime,
data__is_single_node,
data__node_type_id,
data__driver_node_type_id,
data__ssh_public_keys,
data__autotermination_minutes,
data__enable_elastic_disk,
data__instance_pool_id,
data__policy_id,
data__enable_local_disk_encryption,
data__driver_instance_pool_id,
data__runtime_engine,
data__data_security_mode,
data__single_user_name,
data__apply_policy_default_values,
data__autoscale,
data__spark_conf,
data__aws_attributes,
data__custom_tags,
data__cluster_log_conf,
data__init_scripts,
data__spark_env_vars,
data__workload_type,
data__docker_image,
data__clone_from,
deployment_name
)
SELECT
{{ num_workers }},
'{{ kind }}',
'{{ cluster_name }}',
'{{ spark_version }}',
{{ use_ml_runtime }},
{{ is_single_node }},
'{{ node_type_id }}',
'{{ driver_node_type_id }}',
'{{ ssh_public_keys }}',
{{ autotermination_minutes }},
{{ enable_elastic_disk }},
'{{ instance_pool_id }}',
'{{ policy_id }}',
{{ enable_local_disk_encryption }},
'{{ driver_instance_pool_id }}',
'{{ runtime_engine }}',
'{{ data_security_mode }}',
'{{ single_user_name }}',
{{ apply_policy_default_values }},
'{{ autoscale }}',
'{{ spark_conf }}',
'{{ aws_attributes }}',
'{{ custom_tags }}',
'{{ cluster_log_conf }}',
'{{ init_scripts }}',
'{{ spark_env_vars }}',
'{{ workload_type }}',
'{{ docker_image }}',
'{{ clone_from }}',
'{{ deployment_name }}'
RETURNING
cluster_id
;
```
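As a concrete illustration, a small on-demand cluster can be created with only a handful of fields. A minimal sketch; the cluster name, Spark version, node type, and sizes are illustrative sample values, not recommendations:

```sql
INSERT INTO databricks_workspace.compute.clusters (
data__cluster_name,
data__spark_version,
data__node_type_id,
data__num_workers,
data__autotermination_minutes,
deployment_name
)
SELECT
'example-cluster',       -- illustrative cluster name
'15.4.x-scala2.12',      -- illustrative Spark runtime version
'i3.xlarge',             -- illustrative AWS node type
2,                       -- two fixed workers
60,                      -- auto-terminate after 60 idle minutes
'{{ deployment_name }}'
RETURNING
cluster_id
;
```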
Manifest

```yaml
# Description fields are for documentation purposes
- name: clusters
  props:
    - name: deployment_name
      value: string
      description: Required parameter for the clusters resource.
    - name: num_workers
      value: int32
    - name: kind
      value: string
    - name: cluster_name
      value: string
    - name: spark_version
      value: string
    - name: use_ml_runtime
      value: boolean
    - name: is_single_node
      value: boolean
    - name: node_type_id
      value: string
    - name: driver_node_type_id
      value: string
    - name: ssh_public_keys
      value: Array of string
    - name: autotermination_minutes
      value: int32
    - name: enable_elastic_disk
      value: boolean
    - name: instance_pool_id
      value: string
    - name: policy_id
      value: string
    - name: enable_local_disk_encryption
      value: boolean
    - name: driver_instance_pool_id
      value: string
    - name: runtime_engine
      value: string
    - name: data_security_mode
      value: string
    - name: single_user_name
      value: string
    - name: apply_policy_default_values
      value: boolean
    - name: autoscale
      value: object
    - name: spark_conf
      value: object
    - name: aws_attributes
      value: object
    - name: custom_tags
      value: object
    - name: cluster_log_conf
      value: object
    - name: init_scripts
      value: Array of object
    - name: spark_env_vars
      value: object
    - name: workload_type
      value: object
    - name: docker_image
      value: object
    - name: clone_from
      value: object
```
UPDATE examples

update

Updates the configuration of a cluster to match the partial set of attributes and size. Denote which fields to update using the update_mask field.

```sql
UPDATE databricks_workspace.compute.clusters
SET
data__cluster_id = '{{ cluster_id }}',
data__update_mask = '{{ update_mask }}',
data__cluster = '{{ cluster }}'
WHERE
deployment_name = '{{ deployment_name }}'; -- required
```
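For instance, to change only the autotermination timeout, update_mask names the single field being updated and the cluster payload carries its new value. A minimal sketch; the cluster ID and JSON payload are illustrative:

```sql
UPDATE databricks_workspace.compute.clusters
SET
data__cluster_id = '1234-567890-abcde123', -- illustrative cluster ID
data__update_mask = 'autotermination_minutes',
data__cluster = '{"autotermination_minutes": 30}'
WHERE
deployment_name = '{{ deployment_name }}'; -- required
```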
REPLACE examples

edit

Updates the configuration of a cluster to match the provided attributes and size. A cluster can be updated if it is in a RUNNING or TERMINATED state.

```sql
REPLACE databricks_workspace.compute.clusters
SET
data__cluster_id = '{{ cluster_id }}',
data__num_workers = {{ num_workers }},
data__kind = '{{ kind }}',
data__cluster_name = '{{ cluster_name }}',
data__spark_version = '{{ spark_version }}',
data__use_ml_runtime = {{ use_ml_runtime }},
data__is_single_node = {{ is_single_node }},
data__node_type_id = '{{ node_type_id }}',
data__driver_node_type_id = '{{ driver_node_type_id }}',
data__ssh_public_keys = '{{ ssh_public_keys }}',
data__autotermination_minutes = {{ autotermination_minutes }},
data__enable_elastic_disk = {{ enable_elastic_disk }},
data__instance_pool_id = '{{ instance_pool_id }}',
data__policy_id = '{{ policy_id }}',
data__enable_local_disk_encryption = {{ enable_local_disk_encryption }},
data__driver_instance_pool_id = '{{ driver_instance_pool_id }}',
data__runtime_engine = '{{ runtime_engine }}',
data__data_security_mode = '{{ data_security_mode }}',
data__single_user_name = '{{ single_user_name }}',
data__apply_policy_default_values = {{ apply_policy_default_values }},
data__autoscale = '{{ autoscale }}',
data__spark_conf = '{{ spark_conf }}',
data__aws_attributes = '{{ aws_attributes }}',
data__custom_tags = '{{ custom_tags }}',
data__cluster_log_conf = '{{ cluster_log_conf }}',
data__init_scripts = '{{ init_scripts }}',
data__spark_env_vars = '{{ spark_env_vars }}',
data__workload_type = '{{ workload_type }}',
data__docker_image = '{{ docker_image }}'
WHERE
deployment_name = '{{ deployment_name }}'; -- required
```
DELETE examples

delete

Terminates the Spark cluster with the specified ID. The cluster is removed asynchronously. Once the termination has completed, the cluster will be in a TERMINATED state.

```sql
DELETE FROM databricks_workspace.compute.clusters
WHERE deployment_name = '{{ deployment_name }}'; -- required
```
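The termination API targets one cluster at a time, so a delete normally also identifies the cluster to terminate. A minimal sketch, assuming cluster_id is accepted as a WHERE-clause predicate in this provider build; the ID is illustrative:

```sql
DELETE FROM databricks_workspace.compute.clusters
WHERE deployment_name = '{{ deployment_name }}' -- required
AND cluster_id = '1234-567890-abcde123'; -- illustrative cluster ID
```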
Lifecycle Methods

changeowner

Change the owner of the cluster. You must be an admin and the cluster must be terminated to perform this operation. The service principal application ID can be supplied as an argument to owner_username.

```sql
EXEC databricks_workspace.compute.clusters.changeowner
@deployment_name='{{ deployment_name }}' --required
@@json=
'{
"cluster_id": "{{ cluster_id }}",
"owner_username": "{{ owner_username }}"
}';
```

permanentdelete

Permanently deletes a Spark cluster. This cluster is terminated and resources are asynchronously removed.

```sql
EXEC databricks_workspace.compute.clusters.permanentdelete
@deployment_name='{{ deployment_name }}' --required
@@json=
'{
"cluster_id": "{{ cluster_id }}"
}';
```

pin

Pinning a cluster ensures that the cluster will always be returned by the ListClusters API. Pinning a cluster that is already pinned will have no effect. This API can only be called by workspace admins.

```sql
EXEC databricks_workspace.compute.clusters.pin
@deployment_name='{{ deployment_name }}' --required
@@json=
'{
"cluster_id": "{{ cluster_id }}"
}';
```

resize

Resizes a cluster to have a desired number of workers. This will fail unless the cluster is in a RUNNING state.

```sql
EXEC databricks_workspace.compute.clusters.resize
@deployment_name='{{ deployment_name }}' --required
@@json=
'{
"num_workers": "{{ num_workers }}",
"cluster_id": "{{ cluster_id }}",
"autoscale": "{{ autoscale }}"
}';
```
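For example, to scale a cluster to four fixed workers, omit autoscale and pass num_workers directly. A minimal sketch with an illustrative cluster ID:

```sql
EXEC databricks_workspace.compute.clusters.resize
@deployment_name='{{ deployment_name }}' -- required
@@json=
'{
"cluster_id": "1234-567890-abcde123",
"num_workers": 4
}';
```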
restart

Restarts a Spark cluster with the supplied ID. If the cluster is not currently in a RUNNING state, nothing will happen.

```sql
EXEC databricks_workspace.compute.clusters.restart
@deployment_name='{{ deployment_name }}' --required
@@json=
'{
"cluster_id": "{{ cluster_id }}",
"restart_user": "{{ restart_user }}"
}';
```

start

Starts a terminated Spark cluster with the supplied ID. This works similar to createCluster, except that the previous cluster ID and attributes are preserved.

```sql
EXEC databricks_workspace.compute.clusters.start
@deployment_name='{{ deployment_name }}' --required
@@json=
'{
"cluster_id": "{{ cluster_id }}"
}';
```

unpin

Unpinning a cluster will allow the cluster to eventually be removed from the ListClusters API. Unpinning a cluster that is not pinned will have no effect. This API can only be called by workspace admins.

```sql
EXEC databricks_workspace.compute.clusters.unpin
@deployment_name='{{ deployment_name }}' --required
@@json=
'{
"cluster_id": "{{ cluster_id }}"
}';
```