Databricks for business leaders - Notes

Azure Databricks

Apache Spark clusters - provide highly scalable parallel compute for distributed data processing.
Databricks File System (DBFS) - provides distributed shared storage for data lakes.
Notebooks - provide an interactive environment for combining code, notes, documentation, and images.
Metastore - provides a relational abstraction layer, enabling you to define tables based on data in files.
Delta Lake - builds on the metastore to enable common relational database capabilities (e.g., ACID compliance, DML).
SQL warehouses - provide relational compute endpoints for querying data in tables.
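
To make the metastore idea concrete, here is a toy sketch in plain Python of a table that is only a definition over data in a file. It is an analogy, not the Databricks API: the `metastore` dict, `create_table`, `read_table`, and the `sales.csv` file are all hypothetical names made up for illustration.

```python
import csv
import os
import tempfile

# Toy "metastore": maps a table name to the file that holds its data
# and the column types to apply when reading. Hypothetical structure.
metastore = {}

def create_table(name, path, schema):
    """Register a table definition; the data itself stays in the file."""
    metastore[name] = {"path": path, "schema": schema}

def read_table(name):
    """Resolve the table name via the metastore and read typed rows."""
    entry = metastore[name]
    with open(entry["path"], newline="") as f:
        return [
            {col: cast(row[col]) for col, cast in entry["schema"].items()}
            for row in csv.DictReader(f)
        ]

# Write a small CSV file standing in for data in the lake.
data_dir = tempfile.mkdtemp()
sales_path = os.path.join(data_dir, "sales.csv")
with open(sales_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["product", "quantity"])
    writer.writerow(["widget", "3"])
    writer.writerow(["gadget", "5"])

# The "table" is only metadata; querying it reads the underlying file.
create_table("sales", sales_path, {"product": str, "quantity": int})
rows = read_table("sales")
total_quantity = sum(r["quantity"] for r in rows)
```

The point of the sketch is that removing the table definition would not delete the file, and the same file could back more than one definition.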

Databricks services use a single compute engine (Spark).
Azure Databricks uses a data lakehouse architecture to work with data.
Synapse uses two % signs for magic commands (e.g., %%sql), whereas Databricks uses a single % sign (e.g., %sql).
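
As a small illustration of the magic-command difference, the plain-Python sketch below rewrites the magic on the first line of a Synapse-style cell into its Databricks form. The function name and the mapping table are assumptions made for this note, not a migration tool from either product, and the mapping covers only a few common magics.

```python
# Assumed mapping from Synapse cell magics to Databricks equivalents.
SYNAPSE_TO_DATABRICKS = {
    "%%pyspark": "%python",
    "%%sql": "%sql",
    "%%spark": "%scala",
    "%%sparkr": "%r",
}

def to_databricks_cell(cell):
    """Rewrite the magic on the first line of a Synapse cell, if any."""
    first, sep, rest = cell.partition("\n")
    parts = first.split(maxsplit=1)
    if parts and parts[0] in SYNAPSE_TO_DATABRICKS:
        parts[0] = SYNAPSE_TO_DATABRICKS[parts[0]]
        first = " ".join(parts)
    # Cells without a known magic are returned unchanged.
    return first + sep + rest

synapse_cell = "%%sql\nSELECT * FROM sales"
databricks_cell = to_databricks_cell(synapse_cell)
```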
The display() function is a Databricks notebook feature, not part of the open-source Spark API. Synapse notebooks provide their own display() implementation.

Databricks uses an optimized runtime built around the open-source Apache Spark framework.
Medallion architecture - organizes lakehouse data into bronze (raw), silver (cleansed), and gold (curated) layers.
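
A minimal plain-Python sketch of the medallion flow, with raw records refined step by step. The record fields and function names are hypothetical. In Databricks each layer would normally be a Delta table rather than a Python list.

```python
# Bronze: raw records as ingested, including bad rows.
bronze = [
    {"region": "east", "amount": "100.5"},
    {"region": "west", "amount": "n/a"},
    {"region": "east", "amount": "20"},
    {"region": "", "amount": "7"},
]

def to_silver(records):
    """Silver: keep valid rows and cast amount to float."""
    clean = []
    for rec in records:
        try:
            amount = float(rec["amount"])
        except ValueError:
            continue  # drop rows whose amount is not numeric
        if rec["region"]:  # drop rows with a missing region
            clean.append({"region": rec["region"], "amount": amount})
    return clean

def to_gold(records):
    """Gold: total amount per region, ready for reporting."""
    totals = {}
    for rec in records:
        totals[rec["region"]] = totals.get(rec["region"], 0.0) + rec["amount"]
    return totals

silver = to_silver(bronze)
gold = to_gold(silver)
```

Keeping the bronze data untouched means the silver and gold layers can be rebuilt if the cleaning rules change.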
DBFS mounts the data lake onto the compute cluster so that its files can be accessed. These files can be linked to a Databricks SQL warehouse.
In Azure Databricks, a SQL warehouse is equivalent to the lake database in the serverless SQL pool of Synapse Analytics, rather than the data warehouse in the dedicated SQL pool, because the data is not stored in relational storage but in files in DBFS.