dbfs
Creates, updates, deletes, gets or lists a dbfs resource.
Overview
| Name | dbfs |
| Type | Resource |
| Id | databricks_workspace.files.dbfs |
Fields
The following fields are returned by SELECT queries:
- list
| Name | Datatype | Description |
|---|---|---|
| file_size | integer | |
| is_dir | boolean | True if the path is a directory. |
| modification_time | integer | Last modification time of the given file, in milliseconds since epoch. |
| path | string | The absolute path of the file or directory. |
Methods
The following methods are available for this resource:
| Name | Accessible by | Required Params | Optional Params | Description |
|---|---|---|---|---|
| list | select | path, deployment_name | | List the contents of a directory, or details of the file. If the file or directory does not exist, |
| create | insert | deployment_name, path | | Opens a stream to write to a file and returns a handle to this stream. There is a 10 minute idle |
| delete | delete | deployment_name | | Delete the file or directory (optionally recursively delete all files in the directory). This call |
| add_block | exec | deployment_name, handle, data | | Appends a block of data to the stream specified by the input handle. If the handle does not exist, |
| close | exec | deployment_name, handle | | Closes the stream specified by the input handle. If the handle does not exist, this call throws an |
| get_status | exec | path, deployment_name | | Gets the file information for a file or directory. If the file or directory does not exist, this call |
| mkdirs | exec | deployment_name, path | | Creates the given directory and necessary parent directories if they do not exist. If a file (not a |
| move | exec | deployment_name, source_path, destination_path | | Moves a file from one location to another location within DBFS. If the source file does not exist, |
| put | exec | deployment_name, path | | Uploads a file through the use of multipart form post. It is mainly used for streaming uploads, but |
| read | exec | path, deployment_name | length, offset | Returns the contents of a file. If the file does not exist, this call throws an exception with |
Parameters
Parameters can be passed in the WHERE clause of a query. Check the Methods section to see which parameters are required or optional for each operation.
| Name | Datatype | Description |
|---|---|---|
| deployment_name | string | The Databricks Workspace Deployment Name (default: dbc-abcd0123-a1bc) |
| path | string | The path of the file to read. The path should be the absolute DBFS path. |
| length | integer | The number of bytes to read starting from the offset. This has a limit of 1 MB, and a default value of 0.5 MB. |
| offset | integer | The offset to read from in bytes. |
SELECT examples
- list
List the contents of a directory, or details of the file. If the file or directory does not exist,
SELECT
file_size,
is_dir,
modification_time,
path
FROM databricks_workspace.files.dbfs
WHERE path = '{{ path }}' -- required
AND deployment_name = '{{ deployment_name }}' -- required
;
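As a minimal sketch of the template above with the placeholders filled in, the query below lists a directory. The path `/tmp/example` is a hypothetical value chosen for illustration, and the deployment name is the default shown in the Parameters table; substitute the values for your own workspace.

```sql
-- List the entries under a hypothetical directory, /tmp/example.
-- Both WHERE predicates are required parameters of the list method.
SELECT
path,
is_dir,
file_size,
modification_time -- milliseconds since epoch
FROM databricks_workspace.files.dbfs
WHERE path = '/tmp/example'
AND deployment_name = 'dbc-abcd0123-a1bc'
;
```

Because `modification_time` is reported in milliseconds, divide by 1000 before passing it to any function that expects Unix seconds.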
INSERT examples
- create
- Manifest
Opens a stream to write to a file and returns a handle to this stream. There is a 10 minute idle
INSERT INTO databricks_workspace.files.dbfs (
path,
overwrite,
deployment_name
)
SELECT
'{{ path }}' /* required */,
{{ overwrite }},
'{{ deployment_name }}'
RETURNING
handle
;
# Description fields are for documentation purposes
- name: dbfs
  props:
    - name: deployment_name
      value: "{{ deployment_name }}"
      description: Required parameter for the dbfs resource.
    - name: path
      value: "{{ path }}"
      description: |
        The path of the new file. The path should be the absolute DBFS path.
    - name: overwrite
      value: {{ overwrite }}
      description: |
        The flag that specifies whether to overwrite existing file/files.
DELETE examples
- delete
Delete the file or directory (optionally recursively delete all files in the directory). This call
DELETE FROM databricks_workspace.files.dbfs
WHERE deployment_name = '{{ deployment_name }}' -- required
;
Lifecycle Methods
- add_block
- close
- get_status
- mkdirs
- move
- put
- read
Appends a block of data to the stream specified by the input handle. If the handle does not exist,
EXEC databricks_workspace.files.dbfs.add_block
@deployment_name='{{ deployment_name }}' -- required
@@json=
'{
"handle": {{ handle }},
"data": "{{ data }}"
}'
;
Closes the stream specified by the input handle. If the handle does not exist, this call throws an
EXEC databricks_workspace.files.dbfs.close
@deployment_name='{{ deployment_name }}' -- required
@@json=
'{
"handle": {{ handle }}
}'
;
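The `create`, `add_block`, and `close` methods appear designed to be used together as a streaming upload: open a handle, append one or more blocks, then close the handle. The sequence below is a sketch of that flow, not a definitive recipe. The path, the handle value `1234567890`, and the deployment name are hypothetical; in practice the handle is whatever the `INSERT ... RETURNING handle` statement returns. The `data` value `aGVsbG8=` is the base64 encoding of `hello`, on the assumption (taken from the underlying Databricks DBFS API, not stated on this page) that blocks are sent base64-encoded.

```sql
-- 1. Open a write stream. The handle is returned via RETURNING.
INSERT INTO databricks_workspace.files.dbfs (
path,
overwrite,
deployment_name
)
SELECT
'/tmp/example/hello.txt',
true,
'dbc-abcd0123-a1bc'
RETURNING
handle
;

-- 2. Append one block to the stream, using the handle from step 1
--    (1234567890 is a stand-in value).
EXEC databricks_workspace.files.dbfs.add_block
@deployment_name='dbc-abcd0123-a1bc'
@@json=
'{
"handle": 1234567890,
"data": "aGVsbG8="
}'
;

-- 3. Close the stream so the file is finalized.
EXEC databricks_workspace.files.dbfs.close
@deployment_name='dbc-abcd0123-a1bc'
@@json=
'{
"handle": 1234567890
}'
;
```

Step 2 can be repeated for larger files. Since the `create` description mentions a 10 minute idle period on the handle, it is safest to issue the calls without long pauses between them.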
Gets the file information for a file or directory. If the file or directory does not exist, this call
EXEC databricks_workspace.files.dbfs.get_status
@path='{{ path }}', -- required
@deployment_name='{{ deployment_name }}' -- required
;
Creates the given directory and necessary parent directories if they do not exist. If a file (not a
EXEC databricks_workspace.files.dbfs.mkdirs
@deployment_name='{{ deployment_name }}' -- required
@@json=
'{
"path": "{{ path }}"
}'
;
Moves a file from one location to another location within DBFS. If the source file does not exist,
EXEC databricks_workspace.files.dbfs.move
@deployment_name='{{ deployment_name }}' -- required
@@json=
'{
"source_path": "{{ source_path }}",
"destination_path": "{{ destination_path }}"
}'
;
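To illustrate how `mkdirs` and `move` might be combined, the sketch below creates a target directory and then moves a file into it. Both paths and the deployment name are hypothetical values used only for illustration.

```sql
-- Create the destination directory (and any missing parents).
EXEC databricks_workspace.files.dbfs.mkdirs
@deployment_name='dbc-abcd0123-a1bc'
@@json=
'{
"path": "/tmp/archive"
}'
;

-- Move a file into the directory created above.
EXEC databricks_workspace.files.dbfs.move
@deployment_name='dbc-abcd0123-a1bc'
@@json=
'{
"source_path": "/tmp/example/hello.txt",
"destination_path": "/tmp/archive/hello.txt"
}'
;
```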
Uploads a file through the use of multipart form post. It is mainly used for streaming uploads, but
EXEC databricks_workspace.files.dbfs.put
@deployment_name='{{ deployment_name }}' -- required
@@json=
'{
"path": "{{ path }}",
"contents": "{{ contents }}",
"overwrite": {{ overwrite }}
}'
;
Returns the contents of a file. If the file does not exist, this call throws an exception with
EXEC databricks_workspace.files.dbfs.read
@path='{{ path }}', -- required
@deployment_name='{{ deployment_name }}', -- required
@length='{{ length }}',
@offset='{{ offset }}'
;
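Because `length` is capped at 1 MB per call, a file larger than that has to be read in successive windows by advancing `offset`. The two calls below sketch this for a hypothetical file of roughly 1.5 MB. The path and deployment name are illustrative, and the window size of 1048576 bytes assumes that "1 MB" means 1024 × 1024 bytes.

```sql
-- First window: bytes 0 through 1048575.
EXEC databricks_workspace.files.dbfs.read
@path='/tmp/example/large.bin',
@deployment_name='dbc-abcd0123-a1bc',
@length='1048576',
@offset='0'
;

-- Second window: starts where the first one ended.
EXEC databricks_workspace.files.dbfs.read
@path='/tmp/example/large.bin',
@deployment_name='dbc-abcd0123-a1bc',
@length='1048576',
@offset='1048576'
;
```

Keep increasing `offset` by the number of bytes actually returned until a call comes back with fewer bytes than requested. The `file_size` field from `list` can be used to work out in advance how many windows are needed.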