Skip to main content

serving_endpoints

Creates, updates, deletes, gets or lists a serving_endpoints resource.

Overview

Nameserving_endpoints
TypeResource
Iddatabricks_workspace.realtimeserving.serving_endpoints

Fields

The following fields are returned by SELECT queries:

Serving endpoint was retrieved successfully.

NameDatatypeDescription
idstring
namestring
configobject
creation_timestampinteger
creatorstring
last_updated_timestampinteger
permission_levelstring
route_optimizedboolean
stateobject
tagsarray

Methods

The following methods are available for this resource:

NameAccessible byRequired ParamsOptional ParamsDescription
getselectdeployment_nameRetrieves the details for a single serving endpoint.
listselectdeployment_name
createinsertdeployment_name
updateconfigupdatedeployment_nameUpdates any combination of the serving endpoint's served entities, the compute configuration of those served entities, and the endpoint's traffic config. An endpoint that already has an update in progress can not be updated until the current update completes or fails.
patchupdatedeployment_nameUsed to batch add and delete tags from a serving endpoint with a single API call.
putreplacedeployment_nameUsed to update the rate limits of a serving endpoint. NOTE: Only foundation model endpoints are currently supported. For external models, use AI Gateway to manage rate limits.
deletedeletedeployment_name
queryexecdeployment_name

Parameters

Parameters can be passed in the WHERE clause of a query. Check the Methods section to see which parameters are required or optional for each operation.

NameDatatypeDescription
deployment_namestringThe Databricks Workspace Deployment Name (default: dbc-abcd0123-a1bc)

SELECT examples

Retrieves the details for a single serving endpoint.

SELECT
id,
name,
config,
creation_timestamp,
creator,
last_updated_timestamp,
permission_level,
route_optimized,
state,
tags
FROM databricks_workspace.realtimeserving.serving_endpoints
WHERE deployment_name = '{{ deployment_name }}' -- required;

INSERT examples

No description available.

INSERT INTO databricks_workspace.realtimeserving.serving_endpoints (
data__name,
data__route_optimized,
data__config,
data__tags,
data__rate_limits,
data__ai_gateway,
deployment_name
)
SELECT
'{{ name }}',
'{{ route_optimized }}',
'{{ config }}',
'{{ tags }}',
'{{ rate_limits }}',
'{{ ai_gateway }}',
'{{ deployment_name }}'
RETURNING
id,
name,
ai_gateway,
config,
creation_timestamp,
creator,
last_updated_timestamp,
permission_level,
route_optimized,
state,
tags
;

UPDATE examples

Updates any combination of the serving endpoint's served entities, the compute configuration of those served entities, and the endpoint's traffic config. An endpoint that already has an update in progress can not be updated until the current update completes or fails.

UPDATE databricks_workspace.realtimeserving.serving_endpoints
SET
data__served_entities = '{{ served_entities }}',
data__served_models = '{{ served_models }}',
data__traffic_config = '{{ traffic_config }}',
data__auto_capture_config = '{{ auto_capture_config }}'
WHERE
deployment_name = '{{ deployment_name }}' --required
RETURNING
id,
name,
config,
creation_timestamp,
creator,
last_updated_timestamp,
permission_level,
route_optimized,
state;

REPLACE examples

Used to update the rate limits of a serving endpoint. NOTE: Only foundation model endpoints are currently supported. For external models, use AI Gateway to manage rate limits.

REPLACE databricks_workspace.realtimeserving.serving_endpoints
SET
data__rate_limits = '{{ rate_limits }}'
WHERE
deployment_name = '{{ deployment_name }}' --required
RETURNING
rate_limits;

DELETE examples

No description available.

DELETE FROM databricks_workspace.realtimeserving.serving_endpoints
WHERE deployment_name = '{{ deployment_name }}' --required;

Lifecycle Methods

Serving endpoint was queried successfully and returned predictions.

EXEC databricks_workspace.realtimeserving.serving_endpoints.query 
@deployment_name='{{ deployment_name }}' --required
@@json=
'{
"prompt": "{{ prompt }}",
"input": "{{ input }}",
"temperature": "{{ temperature }}",
"stop": "{{ stop }}",
"max_tokens": {{ max_tokens }},
"n": {{ n }},
"stream": {{ stream }},
"dataframe_records": "{{ dataframe_records }}",
"instances": "{{ instances }}",
"inputs": "{{ inputs }}",
"messages": "{{ messages }}",
"extra_params": "{{ extra_params }}",
"dataframe_split": "{{ dataframe_split }}"
}';