Working in the Databricks Workspace
Objectives
- Execute code in multiple languages via magic commands
- Create documentation cells
- Access DBFS (Databricks File System) via
dbutils
and %fs commands - Create database and table and using python variable in SQL
- Query table and plot results
- Add notebook parameters with widgets
- Attach the notebook to a cluster/compute.
- Default language for the notebook can be selected.
- The notebook can be run in different languages using magic commands.
Databricks notebook utilities:
- Magic commands:
%python
,%sql
,%r
,%scala
,%md
,%fs
- DBUtils:
dbutils.fs
(%fs
),dbutils.notebook
(%run
),dbutils.widgets
- Visualizations:
display()
,displayHTML()
Access DBFS (Databricks File System)
-
DBFS is a distributed file system that is accessible on Databricks clusters. It is an abstraction over cloud storage (e.g., AWS S3, Azure Blob Storage, GCS) and allows you to access files and directories in a unified way.
-
You can use the
%fs
magic command ordbutils.fs
to interact with DBFS. -
Mounting external storage (e.g., AWS S3, Azure Blob Storage) to DBFS allows you to access data stored in these external systems as if it were part of the Databricks file system.
-
Mounting is done using the
dbutils.fs.mount
command, which requires the source and mount point as parameters. -
%fs
is a shorthand fordbutils.fs
and allows you to run file system commands directly in the notebook. -
%fs ls
ordbutils.fs.ls
- List files and directories in DBFS. -
%fs rm
- Remove files and directories in DBFS. -
%fs head /datasets/readme.txt
- Display the first few lines of a file in DBFS. -
%fs mounts
- List all mounted directories in DBFS. -
%fs cp
- Copy files and directories in DBFS. -
%fs mv
- Move files and directories in DBFS. -
%fs help
- Display help information for DBFS commands.
Add notebook parameters with widgets
- Widgets are UI elements that allow users to input parameters into notebooks. They can be used to create interactive notebooks that accept user input and modify the behavior of the notebook based on that input.
- Widgets can be created using the
dbutils.widgets
module, which provides methods to create different types of widgets (e.g., text, dropdown, combobox, multiselect). - Widgets can be used to pass parameters to notebooks, allowing users to customize the behavior of the notebook without modifying the code.
- Widgets can be accessed using the
dbutils.widgets.get
method, which retrieves the value of the widget based on its name. - Widgets can be removed using the
dbutils.widgets.remove
method, which removes the widget from the notebook. - Widgets can be cleared using the
dbutils.widgets.removeAll
method, which removes all widgets from the notebook. - Widgets can be displayed using the
display
function, which renders the widget in the notebook.
Create a text input widget using SQL and access the current value of the widget using getArgument()
function.
Remove widget