instance_pools

Creates, updates, deletes, gets, or lists an `instance_pools` resource.

Overview

| | |
|------|------|
| **Name** | `instance_pools` |
| **Type** | Resource |
| **Id** | `databricks_workspace.compute.instance_pools` |

Fields

The following fields are returned by SELECT queries:

| Name | Datatype | Description |
|------|----------|-------------|
| `instance_pool_id` | `string` | |
| `node_type_id` | `string` | This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory- or compute-intensive workloads. A list of available node types can be retrieved by using the :method:clusters/listNodeTypes API call. |
| `instance_pool_name` | `string` | Pool name requested by the user. Pool name must be unique. Length must be between 1 and 100 characters. |
| `aws_attributes` | `object` | Attributes related to instance pools running on Amazon Web Services. If not specified at pool creation, a set of default values will be used. |
| `azure_attributes` | `object` | Attributes related to instance pools running on Azure. If not specified at pool creation, a set of default values will be used. |
| `custom_tags` | `object` | Additional tags for pool resources. Databricks will tag all pool resources (e.g., AWS instances and EBS volumes) with these tags in addition to `default_tags`. Note: currently, Databricks allows at most 45 custom tags. |
| `default_tags` | `object` | Tags that are added by Databricks regardless of any `custom_tags`, including: Vendor: Databricks; InstancePoolCreator: `<user_id_of_creator>`; InstancePoolName: `<name_of_pool>`; InstancePoolId: `<id_of_pool>`. |
| `disk_spec` | `object` | Defines the specification of the disks that will be attached to all Spark containers. |
| `enable_elastic_disk` | `boolean` | Autoscaling local storage: when enabled, the instances in this pool dynamically acquire additional disk space when their Spark workers are running low on disk space. In AWS, this feature requires specific AWS permissions to function correctly; refer to the User Guide for more details. |
| `gcp_attributes` | `object` | Attributes related to instance pools running on Google Cloud Platform. If not specified at pool creation, a set of default values will be used. |
| `idle_instance_autotermination_minutes` | `integer` | Automatically terminates the extra instances in the pool cache after they have been inactive for this many minutes, provided the `min_idle_instances` requirement is already met. If not set, the extra pool instances are terminated after a default timeout. If specified, the threshold must be between 0 and 10000 minutes. Setting this value to 0 instantly removes idle instances from the cache as long as the minimum cache size is still maintained. |
| `max_capacity` | `integer` | Maximum number of outstanding instances to keep in the pool, including both instances used by clusters and idle instances. Clusters that require further instance provisioning will fail during upsize requests. |
| `min_idle_instances` | `integer` | Minimum number of idle instances to keep in the instance pool. |
| `node_type_flexibility` | `object` | Flexible node type configuration for the pool. |
| `preloaded_docker_images` | `array` | Custom Docker images (BYOC). |
| `preloaded_spark_versions` | `array` | A list containing at most one preloaded Spark image version for the pool. Pool-backed clusters started with the preloaded Spark version will start faster. A list of available Spark versions can be retrieved by using the :method:clusters/sparkVersions API call. |
| `remote_disk_throughput` | `integer` | If set, the configured throughput (in Mb/s) for the remote disk. Currently only supported for GCP HYPERDISK_BALANCED disk types. |
| `state` | `string` | Current state of the instance pool (ACTIVE, DELETED, STOPPED). |
| `stats` | `object` | Usage statistics about the instance pool. |
| `status` | `object` | Status of failed pending instances in the pool. |
| `total_initial_remote_disk_size` | `integer` | If set, the total initial volume size (in GB) of the remote disks. Currently only supported for GCP HYPERDISK_BALANCED disk types. |

Methods

The following methods are available for this resource:

| Name | Accessible by | Required Params | Optional Params | Description |
|------|---------------|-----------------|-----------------|-------------|
| `get` | `select` | `instance_pool_id`, `deployment_name` | | Retrieve the information for an instance pool based on its identifier. |
| `list` | `select` | `deployment_name` | | Gets a list of instance pools with their statistics. |
| `create` | `insert` | `deployment_name`, `instance_pool_name`, `node_type_id` | | Creates a new instance pool using idle and ready-to-use cloud instances. |
| `replace` | `replace` | `deployment_name`, `instance_pool_id`, `instance_pool_name`, `node_type_id` | | Modifies the configuration of an existing instance pool. |
| `delete` | `delete` | `deployment_name` | | Deletes the instance pool permanently. The idle instances in the pool are terminated asynchronously. |

Parameters

Parameters can be passed in the WHERE clause of a query. Check the Methods section to see which parameters are required or optional for each operation.

| Name | Datatype | Description |
|------|----------|-------------|
| `deployment_name` | `string` | The Databricks Workspace Deployment Name (default: `dbc-abcd0123-a1bc`) |
| `instance_pool_id` | `string` | The canonical unique identifier for the instance pool. |

SELECT examples

Retrieve the information for an instance pool based on its identifier.

```sql
SELECT
  instance_pool_id,
  node_type_id,
  instance_pool_name,
  aws_attributes,
  azure_attributes,
  custom_tags,
  default_tags,
  disk_spec,
  enable_elastic_disk,
  gcp_attributes,
  idle_instance_autotermination_minutes,
  max_capacity,
  min_idle_instances,
  node_type_flexibility,
  preloaded_docker_images,
  preloaded_spark_versions,
  remote_disk_throughput,
  state,
  stats,
  status,
  total_initial_remote_disk_size
FROM databricks_workspace.compute.instance_pools
WHERE instance_pool_id = '{{ instance_pool_id }}' -- required
  AND deployment_name = '{{ deployment_name }}' -- required
;
```
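The `list` method requires only `deployment_name`; a sketch that omits the `instance_pool_id` filter to return every pool in the workspace (the projected columns are an illustrative subset of the fields above):

```sql
SELECT
  instance_pool_id,
  instance_pool_name,
  state,
  stats
FROM databricks_workspace.compute.instance_pools
WHERE deployment_name = '{{ deployment_name }}' -- required
;
```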

INSERT examples

Creates a new instance pool using idle and ready-to-use cloud instances.

```sql
INSERT INTO databricks_workspace.compute.instance_pools (
  instance_pool_name,
  node_type_id,
  aws_attributes,
  azure_attributes,
  custom_tags,
  disk_spec,
  enable_elastic_disk,
  gcp_attributes,
  idle_instance_autotermination_minutes,
  max_capacity,
  min_idle_instances,
  node_type_flexibility,
  preloaded_docker_images,
  preloaded_spark_versions,
  remote_disk_throughput,
  total_initial_remote_disk_size,
  deployment_name
)
SELECT
  '{{ instance_pool_name }}' /* required */,
  '{{ node_type_id }}' /* required */,
  '{{ aws_attributes }}',
  '{{ azure_attributes }}',
  '{{ custom_tags }}',
  '{{ disk_spec }}',
  {{ enable_elastic_disk }},
  '{{ gcp_attributes }}',
  {{ idle_instance_autotermination_minutes }},
  {{ max_capacity }},
  {{ min_idle_instances }},
  '{{ node_type_flexibility }}',
  '{{ preloaded_docker_images }}',
  '{{ preloaded_spark_versions }}',
  {{ remote_disk_throughput }},
  {{ total_initial_remote_disk_size }},
  '{{ deployment_name }}'
RETURNING
  instance_pool_id
;
```
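Only `instance_pool_name`, `node_type_id`, and `deployment_name` are required for `create`; object-typed columns such as `custom_tags` are passed as JSON strings. A minimal sketch (the pool name and tag key/value are illustrative, not taken from the API):

```sql
INSERT INTO databricks_workspace.compute.instance_pools (
  instance_pool_name,
  node_type_id,
  min_idle_instances,
  custom_tags,
  deployment_name
)
SELECT
  'analytics-pool'           /* required; example name */,
  '{{ node_type_id }}'       /* required */,
  2                          /* keep two idle instances warm */,
  '{"team": "data-eng"}'     /* JSON-encoded object column; example tag */,
  '{{ deployment_name }}'    /* required */
RETURNING
  instance_pool_id
;
```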

REPLACE examples

Modifies the configuration of an existing instance pool.

```sql
REPLACE databricks_workspace.compute.instance_pools
SET
  instance_pool_id = '{{ instance_pool_id }}',
  instance_pool_name = '{{ instance_pool_name }}',
  node_type_id = '{{ node_type_id }}',
  custom_tags = '{{ custom_tags }}',
  idle_instance_autotermination_minutes = {{ idle_instance_autotermination_minutes }},
  max_capacity = {{ max_capacity }},
  min_idle_instances = {{ min_idle_instances }},
  remote_disk_throughput = {{ remote_disk_throughput }},
  total_initial_remote_disk_size = {{ total_initial_remote_disk_size }}
WHERE
  deployment_name = '{{ deployment_name }}' -- required
  AND instance_pool_id = '{{ instance_pool_id }}' -- required
  AND instance_pool_name = '{{ instance_pool_name }}' -- required
  AND node_type_id = '{{ node_type_id }}' -- required
;
```

DELETE examples

Deletes the instance pool permanently. The idle instances in the pool are terminated asynchronously.

```sql
DELETE FROM databricks_workspace.compute.instance_pools
WHERE deployment_name = '{{ deployment_name }}' -- required
;
```
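The Methods table lists only `deployment_name` as required for `delete`, but an unfiltered delete would apply across the whole workspace; a sketch that also filters on `instance_pool_id` to target a single pool (assuming the provider accepts this parameter here, which the table does not confirm):

```sql
DELETE FROM databricks_workspace.compute.instance_pools
WHERE deployment_name = '{{ deployment_name }}' -- required
  AND instance_pool_id = '{{ instance_pool_id }}' -- hypothetical filter; scopes the delete to one pool
;
```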