Glossary: Overview
ACID Properties
ACID stands for Atomicity, Consistency, Isolation, and Durability, the four properties that allow a database to process transactions reliably.
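As an illustration of atomicity, the sketch below uses Python's built-in sqlite3 module to move money between two accounts. The `accounts` table, the `transfer` helper, and the overdraft rule are assumptions made up for this example. Either both updates are committed or neither is.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` inside a single transaction."""
    try:
        # `with conn` commits on success and rolls back if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Hypothetical business rule: no overdrafts allowed.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()
```

A transfer that would overdraw the source account leaves both balances exactly as they were, because the first update is rolled back together with the second.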
Availability
Availability is the degree to which a system remains operational and accessible to users when needed. It is commonly expressed as the percentage of time the system is up.
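Availability is often quoted as a percentage of uptime. The small sketch below shows the arithmetic; the function names and the 365-day year are assumptions chosen for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year

def availability(uptime_hours, downtime_hours):
    """Fraction of the total time during which the system was up."""
    return uptime_hours / (uptime_hours + downtime_hours)

def allowed_downtime_hours(target, period_hours=HOURS_PER_YEAR):
    """Downtime budget implied by an availability target such as 0.999."""
    return (1 - target) * period_hours
```

Under these assumptions, a 99.9% ("three nines") target leaves a budget of roughly 8.76 hours of downtime per year.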
Data Engineering
Data Engineering is the practice of designing, building, and maintaining systems for collecting, storing, processing, and analyzing large volumes of data.
Data Lake
A data lake is a centralized repository designed to store vast amounts of data in its native, raw format.
Data Lakehouse
A data lakehouse is a unified architecture that combines the scalability and flexibility of a data lake with the reliability and queryability of a data warehouse.
Data Transformation
Data Transformation is the process of converting data from one format, structure, or type into another to make it suitable for analysis, storage, or integration.
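A minimal sketch of a transformation step, assuming a raw record whose fields all arrive as strings. The field names and the cleaning rules are hypothetical.

```python
from datetime import date

def transform(record):
    """Convert a raw, all-string record into typed, normalized fields."""
    return {
        # Normalize whitespace and capitalization.
        "customer": record["customer"].strip().title(),
        # Store money as integer cents to avoid floating-point drift.
        "amount_cents": round(float(record["amount"]) * 100),
        # Parse the ISO 8601 date string into a date object.
        "order_date": date.fromisoformat(record["order_date"]),
    }

raw = {"customer": "  ada LOVELACE ", "amount": "19.99", "order_date": "2024-03-01"}
clean = transform(raw)
```

Here `clean` holds a trimmed, title-cased name, an integer amount, and a real date, which is the kind of shape a downstream store or analysis typically expects.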
Data Warehouse
A data warehouse is a centralized repository designed to store, manage, and analyze large volumes of structured data from various sources.
Distributed System
A distributed system is a collection of independent computers that appears to its users as a single coherent system.
ELT
ELT (Extract, Load, Transform) is an approach to data integration that reorders the steps of the traditional ETL process. In ELT, data is first extracted from source systems, then loaded in raw form into a target system (e.g., a data lake or cloud data warehouse), and finally transformed within the target system.
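The ordering can be sketched with an in-memory SQLite database standing in for the target system. This is an assumption for illustration only, and the table and column names are hypothetical. Raw values are loaded untouched, and the transformation runs afterwards as SQL inside the target.

```python
import sqlite3

# Extract: rows as they come from a source, still uncleaned strings.
raw_rows = [(" alice ", "10.50"), ("bob", "4.25"), (" alice ", "2.00")]

conn = sqlite3.connect(":memory:")

# Load: raw values go into a staging table unchanged.
conn.execute("CREATE TABLE raw_orders (customer TEXT, amount TEXT)")
conn.executemany("INSERT INTO raw_orders VALUES (?, ?)", raw_rows)

# Transform: cleaning and aggregation happen inside the target, in SQL.
conn.execute("""
    CREATE TABLE customer_totals AS
    SELECT TRIM(customer) AS customer,
           SUM(CAST(amount AS REAL)) AS total
    FROM raw_orders
    GROUP BY TRIM(customer)
""")
conn.commit()

totals = conn.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()
```

Because the raw table is kept, the transformation can be rewritten and rerun later without extracting from the source again.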
ETL
ETL (Extract, Transform, Load) is a process used in data integration and data warehousing to extract data from various sources, transform it into a consistent format, and load it into a target system (e.g., a data warehouse or database).
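The three stages can be sketched as three small functions. The CSV text, the `orders` table, and the in-memory SQLite target are all assumptions for illustration. The data is cleaned in application code before it reaches the target.

```python
import csv
import io
import sqlite3

# Hypothetical source: a CSV export with untidy values.
SOURCE_CSV = "customer,amount\n alice ,10.50\nbob,4.25\n"

def extract(text):
    """Read rows from the source as dicts of strings."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Clean and type the rows before they reach the target."""
    return [(r["customer"].strip(), float(r["amount"])) for r in rows]

def load(conn, rows):
    """Write the already-transformed rows into the target table."""
    conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
loaded = conn.execute(
    "SELECT customer, amount FROM orders ORDER BY customer"
).fetchall()
```

Only the cleaned rows are stored in the target. The raw values never land there.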
Online Analytical Processing
Online Analytical Processing (OLAP) is a type of database system designed to analyze large volumes of historical data from multiple perspectives. It lets users run complex analytical queries and generate reports, and is often used in business intelligence (BI) and data warehousing.
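The idea of viewing the same data from multiple perspectives can be sketched as aggregations over different dimensions. SQLite is not an OLAP engine and is used here only to show the shape of such queries. The `sales` table and its contents are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("north", 2023, 100),
        ("north", 2023, 50),
        ("north", 2024, 70),
        ("south", 2023, 30),
        ("south", 2024, 90),
    ],
)
conn.commit()

# Perspective 1: totals broken down by region and year.
by_region_year = conn.execute(
    "SELECT region, year, SUM(amount) FROM sales "
    "GROUP BY region, year ORDER BY region, year"
).fetchall()

# Perspective 2: the same facts rolled up to the year level only.
by_year = conn.execute(
    "SELECT year, SUM(amount) FROM sales GROUP BY year ORDER BY year"
).fetchall()
```

Both result sets summarize the same underlying rows. Only the grouping dimensions differ.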
Online Transaction Processing
Online Transaction Processing (OLTP) is a type of database system designed to support transactional applications. It focuses on processing large numbers of small, short-lived transactions in real time while preserving data integrity and consistency.
Operational Data Store
An Operational Data Store (ODS) is a database designed to integrate data from multiple sources for operational reporting and real-time decision-making.