Chapter 2. Mastering Relational Entities in Databricks

Relational entities, particularly databases, tables, and views, are essential components for organizing and managing structured data in Databricks. Understanding how these entities interact with the metastore and storage locations is crucial for efficient querying and data management. In this chapter, we examine in detail how these entities function within the Databricks environment and how they relate to the underlying storage.

Understanding Relational Entities

Databases in Databricks

In Databricks, a database corresponds to a schema in the Hive metastore. When you create a database, you define a logical structure in which tables, views, and functions can be organized. This collection of database objects is called a schema. You can create a database using either the CREATE DATABASE or CREATE SCHEMA ...
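
As a minimal sketch of the equivalence described above, the following Spark SQL statements create one database with each keyword and then inspect the result. The names demo_db and demo_schema are hypothetical and chosen only for illustration; the statements assume a workspace where you have permission to create schemas in the current metastore.

```sql
-- Hypothetical names (demo_db, demo_schema) used purely for illustration.

-- Create a database with the DATABASE keyword.
CREATE DATABASE IF NOT EXISTS demo_db;

-- Create a second one with the SCHEMA keyword; the two keywords are interchangeable.
CREATE SCHEMA IF NOT EXISTS demo_schema;

-- Both appear in the same listing, since a database and a schema are the same kind of object.
SHOW DATABASES;

-- Inspect the metadata recorded for the database, including its storage location.
DESCRIBE DATABASE EXTENDED demo_db;
```

Using IF NOT EXISTS keeps the statements safe to rerun, which is convenient when the same notebook cell is executed more than once.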
