Fundamentals of Microsoft Fabric

by Nikola Ilic, Ben Weissman

June 2025

Beginner to intermediate

446 pages

10h 46m

English

O'Reilly Media, Inc.

Includes

Quizzes

Foreword
Preface
Who Should Read This Book; Navigating This Book; Conventions Used in This Book; O’Reilly Online Learning; How to Contact Us; Acknowledgments
I. The Foundation of Fabric
1. What Is Microsoft Fabric?
The Why and the What of Microsoft Fabric; The Big Picture; Workspaces and Domains; OneLake; Data Factory; Data Engineering; Data Warehouse; Data Science; Real-Time Intelligence; Power BI; Databases; Industry Solutions; Copilots; The Fabric Pricing Model; Summary
2. Getting Started with Microsoft Fabric
Creating an Azure Account; Enabling Fabric; First Steps with Fabric; Creating a Workspace; Building a Lakehouse; Building a Warehouse; Visualizing Fabric Data in Power BI; Summary
3. All Roads Lead to OneLake
Overview of Data Lakes; Evolution of Data Storage Solutions; The Importance of Data Lakes; Introduction to OneLake; Separation of Compute and Storage; File Explorer; What Makes OneLake Unique; The Foundation of OneLake; Delta and Iceberg Formats; Interoperability; Scalability and Performance; Data Stored in OneLake; Organizing Data in OneLake; Domains; Workspaces; Key Differences Between Domains and Workspaces; Ingesting Data to and Integrating Data with OneLake; Methods of Data Ingestion; Integration Mechanisms; OneLake Catalog; OneLake Explorer; Summary
II. The Features: A Deep Dive
4. Data Factory
Pipelines; A Step-by-Step Guide to a Data Pipeline; Moving and Transforming Data; Comparing Data Movement Functions; Extending Data Orchestration; Schedules and Triggers; Apache Airflow; Summary
5. Data Engineering
Fundamentals of a Lakehouse; Lakehouses Versus Data Lakes; The Medallion Architecture; Lakehouses in Microsoft Fabric; Lakehouse Schemas; Integrating Data into a Lakehouse; Querying and Working with Data in Your Lakehouse; Leveraging Spark for Data Engineering; A Step-by-Step Example: Building an ETL Pipeline in a Notebook; Summary
6. Data Warehousing
Fundamentals of Data Warehousing; Warehouses Versus Lakehouses; Warehouses in Microsoft Fabric; Integrating Data into a Warehouse; Querying a Warehouse; Elements of a Fabric Data Warehouse; Data Warehouses Versus Traditional SQL Engines; T-SQL Limitations; Summary
7. Data Science in Microsoft Fabric
MLflow; Experimentation Tracking for Sales Forecasting; Deploying Models as REST APIs to Support Nontechnical Teams; Managing Model Versions; SynapseML; AutoML; Semantic Link; Visualize Dependencies in the Semantic Model; Optimize Semantic Models with Best Practice Analyzer Rules; Translate Semantic Models; Migrate Existing Semantic Models to Direct Lake; Augment the Gold Layer; Summary
8. Real-Time Intelligence
What Is Stream Processing?; Real-Time Hub; Eventstreams; Eventhouse and KQL Database; Eventhouse; KQL Database; Query and Visualize Data in Real-Time Intelligence; KQL Queryset; Real-Time Dashboards; Visualize Data with Power BI; Activator; Core Activator Concepts; Understanding the Activator Item; Working with Power BI Data; Working with Real-Time Hub Data; Beyond Basic Scenarios; Trigger Fabric Items; Create Custom Actions to Trigger Power Automate Flows; Summary
9. Power BI
Power BI Workloads in the Pre-Fabric Era; Import Mode for Blazing Fast Performance; DirectQuery Mode for Real-Time Reporting; Power BI Workloads in Microsoft Fabric; Understanding Direct Lake Mode; Prerequisites; Two “Flavors” of Direct Lake; Default Versus Custom Semantic Model; Syncing the Semantic Model with OneLake; Direct Lake Key Concepts; How Does Direct Lake Work?; Direct Lake Semantic Model Refresh (aka Framing); Transcoding (Loading Columns into Memory); Temperature; Direct Lake Guardrails; Control DirectLakeBehavior for Direct Lake on SQL Semantic Models; Direct Lake Limitations; Summary
10. SQL Databases
Why SQL Databases in Fabric?; The Role of AI; Operational Efficiency; Key Features of SQL Databases; Simplicity and Autonomous Operation; AI Integration and Optimization; Integrated Governance and Security; DevOps Integration; Unified Data Storage with OneLake; GraphQL Interface; Ingesting and Querying Data; A Step-by-Step Guide to Building and Managing SQL Databases; Summary
11. Mirroring
What Is Mirroring?; Mirroring Requirements; Enabling Mirroring in Your Tenant; Networking; Source Data Limitations; A Step-by-Step Guide to Mirroring from Azure SQL DB; System Assigned Managed Identity (SAMI); Grant Access for Fabric Through a Database Principal; Create a Mirrored Azure SQL Database; Fabric Link Is Not the Same Thing; Summary
12. Microsoft Fabric API for GraphQL
Core GraphQL Operations; Working with GraphQL in Fabric; Query Data with API for GraphQL; Creating Relationships; Making Changes Using Mutations; Going Above and Beyond with Variables; Summary
13. AI and Copilots
What Is Copilot?; Enable Copilot in Microsoft Fabric; Copilot for Data Factory; Copilot for Data Engineering and Data Science; Copilot for Data Warehouse; Copilot for Power BI; Prepare Semantic Model for Copilot; Create Reports in the Power BI Service or Power BI Desktop; Summarize Report Content in the Copilot Pane; Write DAX with Copilot; Copilot for Real-Time Intelligence; Copilot for SQL Database; AI Services in Microsoft Fabric; Data Agent in Microsoft Fabric; Fabric Data Agent Versus Copilot; Working with the Fabric Data Agent; Summary
III. Putting Fabric into Production
14. The Fabric Pricing Model
Compute and Capacities; Capacity Types; Capacity Sizes; What Exactly Is a Capacity Unit (CU)?; Capacity Bursting (and Smoothing); Capacity Limitations; Storage; User Licenses; Networking; Regional Differences; Additional Pricing; Summary
15. Administering and Monitoring Microsoft Fabric
Data Governance with Microsoft Fabric; Administering Microsoft Fabric; Hierarchical Structure of Microsoft Fabric; Working with the Admin Portal; Monitoring Microsoft Fabric; Monitor Hub; Capacity Metrics App; Microsoft Purview Hub; Admin Monitoring Workspace; Summary
16. Securing Microsoft Fabric
Secure Data Access in Microsoft Fabric; Workspace-Level Access Control; Item-Level Access Control; Row-Level Security; Object-Level and Column-Level Security; Folder-Level Access Control; Shortcuts Security Model; Common Security Scenarios; Data Discovery and Trust; OneLake Catalog; Endorsement; Tags; Sensitivity Labels; Summary
17. CI/CD in Microsoft Fabric
CI/CD Workflow Options; Git Integration; Deployment Pipelines; Recommended Practices for Lifecycle Management; Automating CI/CD Workflows with Fabric REST APIs; Summary
18. Fabric Decision Guide: When to Choose What
How to Pick the Right Option; Choosing an Analytical Engine; Data Volume; Supported Data Types; Supported Programming Languages; Supported Data Ingestion and Data Access Methods; Access Control; OneLake Interoperability; Scenario-Based Decision Guide; Mirrored Azure SQL Database Versus SQL Database; Scenario 1: Web Application with Operational Data; Scenario 2: Big Data Containing Sensitive Information; SQL Database in Fabric Versus Fabric Warehouse; Scenario 1: Aggregating Big Data for Analytical Reports; Scenario 2: Near Real-Time Operational Reporting with Enforced Database; Enforced Database Constraints; Direct Lake Versus Import Mode for Semantic Models; Scenario 1: Self-Service BI with Power Query; Scenario 2: Near Real Time Reporting Requirement; Scenario 3: Resource-Consuming Data Refresh Process; Scenario 4: Using DAX Calculated Tables/Columns; Scenario 5: Using T-SQL Views; Scenario 6: RLS/OLS Enforced in the Warehouse/SQL Analytics Endpoint of the Lakehouse; All Roads Lead to OneLake—but Which One Is the Right Road?; Dataflow Versus Notebook Versus Pipeline Versus Mirroring Versus Shortcut; Scenario 1: Ingesting Data As Is from On-Premises Data Source; Scenario 2: Customizing the Data-Writing Process; Scenario 3: Transforming Data for the Serving Layer; To V-Order or Not to V-Order?; Summary
Index
About the Authors

Content preview from Fundamentals of Microsoft Fabric

Chapter 3. All Roads Lead to OneLake

One of Fabric’s key characteristics is that it is lake-centric. All of its data is stored in a data lake—OneLake. This chapter will walk you through the basics of data lakes and the specifics of OneLake.

Overview of Data Lakes

A data lake is a centralized repository that allows for the storage of structured, semi-structured, and unstructured data at any scale. Unlike traditional data warehouses that store data in predefined schemas, data lakes are designed to hold vast amounts of raw data in its native format until it is needed. This flexibility supports diverse data types including text, images, videos, and social media streams, making data lakes an integral part of modern big data architectures.

The primary purpose of a data lake is to provide a scalable and cost-effective solution for storing large volumes of data. This data can be processed and analyzed to extract valuable insights, facilitate real-time analytics, and support data science and machine learning applications. The structure of a data lake allows businesses to store all their data in one place, enabling comprehensive analysis and integration across different data sources.
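The contrast with predefined schemas can be illustrated with a minimal sketch. It assumes a local temporary folder standing in for a data lake; the helper names `land_raw` and `read_orders`, the file names, and the sample records are hypothetical and are not taken from the book or from any OneLake API. Data is written in its native format, and structure is applied only when the data is read.

```python
import json
import tempfile
from pathlib import Path


def land_raw(lake_root: Path, name: str, payload: bytes) -> Path:
    """Store a payload unchanged, in its native format (hypothetical helper)."""
    target = lake_root / "raw" / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(payload)
    return target


def read_orders(path: Path) -> list[dict]:
    """Apply a schema only at read time: pick fields and cast types."""
    rows = []
    for line in path.read_text(encoding="utf-8").splitlines():
        record = json.loads(line)
        rows.append({"order_id": int(record["id"]), "amount": float(record["amount"])})
    return rows


# A temporary directory plays the role of the lake in this sketch.
lake = Path(tempfile.mkdtemp())

# Semi-structured data lands as-is; extra fields such as "note" are kept on disk.
orders_file = land_raw(
    lake,
    "orders.jsonl",
    b'{"id": "1", "amount": "19.90", "note": "gift"}\n{"id": "2", "amount": "5"}\n',
)

# Unstructured data can sit alongside it in the same store.
land_raw(lake, "logo.png", b"\x89PNG placeholder bytes")

orders = read_orders(orders_file)
total = sum(row["amount"] for row in orders)
```

Nothing about the stored files forces the `order_id`/`amount` shape; a different consumer could read the same raw file with a different schema, which is the flexibility the paragraphs above describe.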

Evolution of Data Storage Solutions

Data storage solutions have evolved significantly over the years, reflecting the growing complexity and scale of data management needs.

Figure 3-1 shows how data storage solutions have developed over time.

Figure 3-1. The evolution of data storage systems

The journey ...


The Definitive Guide to Microsoft Fabric

Publisher Resources

ISBN: 9781098172916
