Implementing a Lakehouse with Microsoft Fabric (DP-601)
This course is intended for data professionals who are familiar with data modeling, extraction, and analytics, and who want to learn about lakehouse architecture, the Microsoft Fabric platform, and how to enable end-to-end analytics with these technologies.
Description
The course covers the following modules:
- Introduction to end-to-end analytics using Microsoft Fabric
- Get started with lakehouses in Microsoft Fabric
- Use Apache Spark in Microsoft Fabric
- Work with Delta Lake tables in Microsoft Fabric
- Ingest Data with Dataflows Gen2 in Microsoft Fabric
- Use Data Factory pipelines in Microsoft Fabric
- Organize a Fabric lakehouse using medallion architecture design
Course Outline
Module 1: Introduction to end-to-end analytics using Microsoft Fabric
- Describe end-to-end analytics in Microsoft Fabric
Module 2: Get started with lakehouses in Microsoft Fabric
- Describe core features and capabilities of lakehouses in Microsoft Fabric
- Create a lakehouse
- Ingest data into files and tables in a lakehouse
- Query lakehouse tables with SQL
Module 3: Use Apache Spark in Microsoft Fabric
- Configure Spark in a Microsoft Fabric workspace
- Identify suitable scenarios for Spark notebooks and Spark jobs
- Use Spark dataframes to analyze and transform data
- Use Spark SQL to query data in tables and views
- Visualize data in a Spark notebook
Module 4: Work with Delta Lake tables in Microsoft Fabric
- Understand Delta Lake and delta tables in Microsoft Fabric
- Create and manage delta tables using Spark
- Use Spark to query and transform data in delta tables
- Use delta tables with Spark structured streaming
Module 5: Ingest Data with Dataflows Gen2 in Microsoft Fabric
- Describe Dataflow capabilities in Microsoft Fabric
- Create Dataflow solutions to ingest and transform data
- Include a Dataflow in a pipeline
Module 6: Use Data Factory pipelines in Microsoft Fabric
- Describe pipeline capabilities in Microsoft Fabric
- Use the Copy Data activity in a pipeline
- Create pipelines based on predefined templates
- Run and monitor pipelines
Module 7: Organize a Fabric lakehouse using medallion architecture design
- Describe the principles of the medallion architecture in data management
- Apply the medallion architecture framework within the Microsoft Fabric environment
- Analyze data stored in the lakehouse using Direct Lake in Power BI
- Describe best practices for securing and governing data stored in the medallion architecture
Prerequisites
You should be familiar with basic data concepts and terminology.