Skip to main content

serving_endpoints

Creates, updates, deletes, gets or lists a serving_endpoints resource.

Overview

Nameserving_endpoints
TypeResource
Iddatabricks_workspace.serving.serving_endpoints

Fields

The following fields are returned by SELECT queries:

NameDatatypeDescription
idstringSystem-generated ID of the endpoint. This is used to refer to the endpoint in the Permissions API
namestringThe name of the serving endpoint.
budget_policy_idstringThe budget policy associated with the endpoint.
ai_gatewayobject
configobjectThe config that is currently being served by the endpoint.
creation_timestampintegerThe timestamp when the endpoint was created in Unix time.
creatorstringThe email of the user who created the serving endpoint.
data_plane_infoobjectInformation required to query DataPlane APIs.
descriptionstringDescription of the serving model
email_notificationsobjectEmail notification settings.
endpoint_urlstringEndpoint invocation url if route optimization is enabled for endpoint
last_updated_timestampintegerThe timestamp when the endpoint was last updated by a user in Unix time.
pending_configobjectThe config that the endpoint is attempting to update to.
permission_levelstringThe permission level of the principal making the request. (CAN_MANAGE, CAN_QUERY, CAN_VIEW)
route_optimizedbooleanBoolean representing if route optimization has been enabled for the endpoint
stateobjectInformation corresponding to the state of the serving endpoint.
tagsarrayTags attached to the serving endpoint.
taskstringThe task type of the serving endpoint.

Methods

The following methods are available for this resource:

NameAccessible byRequired ParamsOptional ParamsDescription
getselectname, deployment_nameRetrieves the details for a single serving endpoint.
listselectdeployment_nameGet all serving endpoints.
createinsertdeployment_name, nameCreate a new serving endpoint.
updateupdatename, deployment_nameUsed to batch add and delete tags from a serving endpoint with a single API call.
update_configreplacename, deployment_nameUpdates any combination of the serving endpoint's served entities, the compute configuration of those
deletedeletename, deployment_nameDelete a serving endpoint.
queryexecname, deployment_nameQuery a serving endpoint

Parameters

Parameters can be passed in the WHERE clause of a query. Check the Methods section to see which parameters are required or optional for each operation.

NameDatatypeDescription
deployment_namestringThe Databricks Workspace Deployment Name (default: dbc-abcd0123-a1bc)
namestringThe name of the serving endpoint. This field is required and is provided via the path parameter.

SELECT examples

Retrieves the details for a single serving endpoint.

SELECT
id,
name,
budget_policy_id,
ai_gateway,
config,
creation_timestamp,
creator,
data_plane_info,
description,
email_notifications,
endpoint_url,
last_updated_timestamp,
pending_config,
permission_level,
route_optimized,
state,
tags,
task
FROM databricks_workspace.serving.serving_endpoints
WHERE name = '{{ name }}' -- required
AND deployment_name = '{{ deployment_name }}' -- required
;

INSERT examples

Create a new serving endpoint.

INSERT INTO databricks_workspace.serving.serving_endpoints (
name,
ai_gateway,
budget_policy_id,
config,
description,
email_notifications,
rate_limits,
route_optimized,
tags,
deployment_name
)
SELECT
'{{ name }}' /* required */,
'{{ ai_gateway }}',
'{{ budget_policy_id }}',
'{{ config }}',
'{{ description }}',
'{{ email_notifications }}',
'{{ rate_limits }}',
{{ route_optimized }},
'{{ tags }}',
'{{ deployment_name }}'
RETURNING
id,
name,
budget_policy_id,
ai_gateway,
config,
creation_timestamp,
creator,
data_plane_info,
description,
email_notifications,
endpoint_url,
last_updated_timestamp,
pending_config,
permission_level,
route_optimized,
state,
tags,
task
;

UPDATE examples

Used to batch add and delete tags from a serving endpoint with a single API call.

UPDATE databricks_workspace.serving.serving_endpoints
SET
add_tags = '{{ add_tags }}',
delete_tags = '{{ delete_tags }}'
WHERE
name = '{{ name }}' --required
AND deployment_name = '{{ deployment_name }}' --required
RETURNING
tags;

REPLACE examples

Updates any combination of the serving endpoint's served entities, the compute configuration of those

REPLACE databricks_workspace.serving.serving_endpoints
SET
auto_capture_config = '{{ auto_capture_config }}',
served_entities = '{{ served_entities }}',
served_models = '{{ served_models }}',
traffic_config = '{{ traffic_config }}'
WHERE
name = '{{ name }}' --required
AND deployment_name = '{{ deployment_name }}' --required
RETURNING
id,
name,
budget_policy_id,
ai_gateway,
config,
creation_timestamp,
creator,
data_plane_info,
description,
email_notifications,
endpoint_url,
last_updated_timestamp,
pending_config,
permission_level,
route_optimized,
state,
tags,
task;

DELETE examples

Delete a serving endpoint.

DELETE FROM databricks_workspace.serving.serving_endpoints
WHERE name = '{{ name }}' --required
AND deployment_name = '{{ deployment_name }}' --required
;

Lifecycle Methods

Query a serving endpoint

EXEC databricks_workspace.serving.serving_endpoints.query 
@name='{{ name }}' --required,
@deployment_name='{{ deployment_name }}' --required
@@json=
'{
"client_request_id": "{{ client_request_id }}",
"dataframe_records": "{{ dataframe_records }}",
"dataframe_split": "{{ dataframe_split }}",
"extra_params": "{{ extra_params }}",
"input": "{{ input }}",
"inputs": "{{ inputs }}",
"instances": "{{ instances }}",
"max_tokens": {{ max_tokens }},
"messages": "{{ messages }}",
"n": {{ n }},
"prompt": "{{ prompt }}",
"stop": "{{ stop }}",
"stream": {{ stream }},
"temperature": {{ temperature }},
"usage_context": "{{ usage_context }}"
}'
;