Video description
Learn to ingest, process, and export data in Azure Data Lake Storage Service Gen1 and Gen2 using Databricks and HDInsight
About This Video
- Discover Microsoft Azure Data Lake
- Learn to use Azure Databricks and HDInsight to process data in ADLS
- Explore data lifecycle and architecture around Data Lake
In Detail
Azure Data Lake Storage (ADLS) Gen2 is a cloud-based repository for both structured and unstructured data. You could use it to store everything from documents to images to social media streams. One of the most effective approaches to big data processing is to store your data in ADLS and then process it on Azure Databricks using Spark, an in-memory processing engine that is often faster than Hadoop MapReduce.
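As a rough illustration of the store-in-ADLS, process-with-Spark pattern, the sketch below builds the `abfss://` URI that Spark uses to address a file in ADLS Gen2. It is a minimal sketch, not material from the course: the helper `abfss_uri`, the account name `mydatalake`, the container `raw`, the file path, and the `region` column are all hypothetical.

```python
def abfss_uri(account: str, container: str, path: str = "") -> str:
    """Build an ABFS (TLS) URI for a file or folder in ADLS Gen2.

    Hypothetical helper for illustration; it only formats a string and
    does not contact Azure.
    """
    if not account or not container:
        raise ValueError("account and container are required")
    # Drop any leading slash so the URI has exactly one after the host.
    clean_path = path.lstrip("/")
    return f"abfss://{container}@{account}.dfs.core.windows.net/{clean_path}"


# Hypothetical storage account, container, and file names.
uri = abfss_uri("mydatalake", "raw", "/sales/2022/orders.csv")
print(uri)

# In a Databricks notebook, where a `spark` session is already provided and
# access to the storage account has been configured, the file could then be
# read and aggregated along these lines (the `region` column is assumed):
#   df = spark.read.option("header", "true").csv(uri)
#   df.groupBy("region").count().show()
```

The Spark calls are left as comments because they need a running cluster and configured credentials, which this listing does not describe.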
This is a comprehensive hands-on course for anyone interested in Azure's big data analytics services. Working through examples, you will learn to import data into ADLS, access it securely, and analyze it using Azure Databricks and Azure HDInsight. You will also learn how to monitor and optimize your Data Lake storage. The course provides an end-to-end demonstration to give you a clear understanding of Data Lake.
By the end of this course, you will know how to ingest, process, and export data using Databricks and HDInsight. You will have a solid understanding of Microsoft Azure Data Lake Storage Service (Gen1 and Gen2) and its features and properties, which will help you in your professional work.
Audience
This course is for anyone interested in Azure's big data analytics services, including Microsoft Azure data engineers, database and BI developers, database administrators, data analysts, and those in similar roles.
A basic understanding of data warehouses and databases in general will help you follow the course.
Table of contents
- Chapter 1 : Course Introduction
- Chapter 2 : Introduction to Azure Cloud Computing
- Chapter 3 : Introduction to Azure Data Lake
- Chapter 4 : Data Ingestion
- Chapter 5 : Data Flow Around Data Lake
- Chapter 6 : Azure Data Lake Processing Through Databricks
- Chapter 7 : Azure Data Lake Processing Through HDInsight
  - Demo Overview
  - Create Azure Data Lake Storage Gen2 (Source) and SQL Server (Destination)
  - What is Managed Identity
  - Add Managed Identity to Gen2 and Database Accounts
  - Create HDInsight Interactive Query Cluster
  - Ambari Overview and UI
  - Ingest Dataset into Data Lake Storage
  - Data Extraction with Hive
  - Data Transformation with Hive
  - Data Export Using Sqoop
  - Summary
- Chapter 8 : Security Layers in Data Lake
- Chapter 9 : Data Lake Monitoring and Optimization
- Chapter 10 : Practice Tests and Bonus
Product information
- Title: Microsoft Azure Data Lake Storage Service (Gen1 and Gen2)
- Author(s):
- Release date: February 2022
- Publisher(s): Packt Publishing
- ISBN: 9781803236407