Objectives

  • Execute code in multiple languages via magic commands
  • Create documentation cells
  • Access DBFS (Databricks File System) via dbutils and %fs commands
  • Create database and table and using python variable in SQL
  • Query table and plot results
  • Add notebook parameters with widgets

  • Attach the notebook to a cluster/compute.
  • Default language for the notebook can be selected.
  • The notebook can be run in different languages using magic commands.

Databricks notebook utilities:

  • Magic commands: %python, %sql, %r, %scala, %md, %fs
  • DBUtils: dbutils.fs (%fs), dbutils.notebook (%run), dbutils.widgets
  • Visualizations: display(), displayHTML()
%python
print("Hello Python!")
%sql
SELECT 'Hello SQL!'
%r
print("Hello R!")
%scala
println("Hello Scala!")
%md
## Hello Markdown!
%sh
echo "Hello Shell!"
html = "<h1 style="color:red;text-align:center">Hello HTML!</h1>"
displayHTML(html)

Access DBFS (Databricks File System)

  • DBFS is a distributed file system that is accessible on Databricks clusters. It is an abstraction over cloud storage (e.g., AWS S3, Azure Blob Storage, GCS) and allows you to access files and directories in a unified way.

  • You can use the %fs magic command or dbutils.fs to interact with DBFS.

  • Mounting external storage (e.g., AWS S3, Azure Blob Storage) to DBFS allows you to access data stored in these external systems as if it were part of the Databricks file system.

  • Mounting is done using the dbutils.fs.mount command, which requires the source and mount point as parameters.

  • %fs is a shorthand for dbutils.fs and allows you to run file system commands directly in the notebook.

  • %fs ls or dbutils.fs.ls - List files and directories in DBFS.

  • %fs rm - Remove files and directories in DBFS.

  • %fs head /datasets/readme.txt - Display the first few lines of a file in DBFS.

  • %fs mounts - List all mounted directories in DBFS.

  • %fs cp - Copy files and directories in DBFS.

  • %fs mv - Move files and directories in DBFS.

  • %fs help - Display help information for DBFS commands.

Add notebook parameters with widgets

  • Widgets are UI elements that allow users to input parameters into notebooks. They can be used to create interactive notebooks that accept user input and modify the behavior of the notebook based on that input.
  • Widgets can be created using the dbutils.widgets module, which provides methods to create different types of widgets (e.g., text, dropdown, combobox, multiselect).
  • Widgets can be used to pass parameters to notebooks, allowing users to customize the behavior of the notebook without modifying the code.
  • Widgets can be accessed using the dbutils.widgets.get method, which retrieves the value of the widget based on its name.
  • Widgets can be removed using the dbutils.widgets.remove method, which removes the widget from the notebook.
  • Widgets can be cleared using the dbutils.widgets.removeAll method, which removes all widgets from the notebook.
  • Widgets can be displayed using the display function, which renders the widget in the notebook.
dbutils.widgets.text("name", "default_value", "Enter your name")
name = dbutils.widgets.get("name")
print(f"Hello {name}!")
dbutils.widgets.dropdown("color", "red", ["red", "green", "blue"], "Select a color")
color = dbutils.widgets.get("color")
print(f"You selected {color} color!")
dbutils.widgets.combobox("fruit", "apple", ["apple", "banana", "orange"], "Select a fruit")
fruit = dbutils.widgets.get("fruit")
print(f"You selected {fruit} fruit!")
dbutils.widgets.multiselect("animal", "dog", ["dog", "cat", "bird"], "Select an animal")
animal = dbutils.widgets.get("animal")
print(f"You selected {animal} animal!")

Create a text input widget using SQL and access the current value of the widget using getArgument() function.

%sql
CREATE WIDGET TEXT "name" DEFAULT "default_value" LABEL "Enter your name"
SELECT "Hello " || getArgument("name") || "!"

Remove widget

%sql
REMOVE WIDGET name
%python
dbutils.widgets.remove("name")
dbutils.widgets.removeAll() # to remove all widgets

Source