Building a Modern Data Platform with Snowflake
A guide to getting off the ground
Data is so critical to modern business that it is often called ‘The New Oil’. However, as those involved with the oil industry know, oil has actual value only after the pipelining, refining, and delivery mechanisms are built. Data is no different: without that infrastructure, it can become a company liability rather than an asset. This course will teach you how to get a developer-friendly, highly scalable data platform off the ground and help you turn your data back into an asset.
Snowflake is a modern data warehouse that is built for cloud-scale workloads. For the enterprise, it’s robust, highly scalable, and secure. For the startup, it’s extremely cost-efficient, easily managed, and will scale effortlessly as your needs dictate. Like any system, there are many aspects to consider while setting it up for long-term success. This course will teach you to do just that.
What you'll learn and how you can apply it
- Why choose Snowflake?
- How to provision a Snowflake cluster.
- How to isolate compute resources and establish a clean separation of concerns.
- How to create databases, schemas, and tables.
- How to stage, load, and unload data.
- How to keep your data secure with network policies, role-based access control, data encryption, and MFA.
- How to explore and query data via Snowflake’s query UI.
- Industry best practices for structuring data warehouses.
This training course is for you because...
- You’ve been tasked to build an analytics platform and want to set it up for success.
- You lead an analytics team and are evaluating Snowflake as your data warehousing solution.
- You’re a software engineer, data engineer, or data scientist looking to leverage a massively parallel cloud data warehouse.
- Your existing cloud data warehousing solution isn’t scaling as promised and you’re evaluating Snowflake as an alternative.
Prerequisites
- A working knowledge of SQL.
- A working knowledge of analytics architecture.
- A need for a performant, highly scalable, cost-effective analytics solution.
You can download and configure SnowSQL (Snowflake's command-line interface): https://docs.snowflake.com/en/user-guide/snowsql-install-config.html
- (Book) Sams Teach Yourself SQL in 10 Minutes a Day, 5th Edition by Ben Forta https://learning.oreilly.com/library/view/sams-teach-yourself/9780135182925/
- (Book) Designing Data-Intensive Applications by Martin Kleppmann https://learning.oreilly.com/library/view/designing-data-intensive-applications/9781491903063/
- Foundations for Architecting Data Solutions by Ted Malaska, Jonathan Seidman https://learning.oreilly.com/library/view/foundations-for-architecting/9781492038733/
- The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition by Ralph Kimball, Margy Ross https://learning.oreilly.com/library/view/the-data-warehouse/9781118530801/
- Building a Scalable Data Warehouse with Data Vault 2.0 by Daniel Linstedt https://learning.oreilly.com/library/view/building-a-scalable/9780128026489/
About your instructor
Jacob is the lead data engineer at Cargurus, an industry-leading automotive marketplace. He currently leads high-volume data pipelining and warehousing efforts and has also played a lead role in building out the current data platform with Snowflake. Prior to Cargurus, Jacob built out the analytics and data pipelining stack at Wanderu and did similar work at Safari Books Online/O’Reilly before that. He’s helped numerous startups and businesses modernize their data operations along the way.
Schedule
The timeframes are only estimates and may vary according to how the class is progressing.
Segment 1: Why Snowflake? (30 minutes)
- Developer productivity and happiness
Exercise/Activity I - Audience Discussion, Setting the Stage (15 minutes)
- What would you do if your engineers could spend their time building, instead of doing database administration?
- What do you expect in an analytics platform?
- (If applicable) What are some ways your analytics stack has struggled to scale?
Break (15 minutes)
Snowflake Database Fundamentals (45 minutes)
- Users and roles
- Databases, schemas, tables, oh my!
- Internal and external stages
- Loading data (COPY INTO ... FROM, etc.)
- Unloading data
Exercise/Activity II - Hands-On Snowflake (15 minutes)
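As a rough illustration of the fundamentals listed above (databases, schemas, tables, stages, loading, and unloading), the sketch below builds the kind of Snowflake SQL statements such a hands-on session might involve. It is not taken from the course material: the function name, every object name (`demo_db`, `raw`, `events`, `events_stage`), the table columns, and the local file path are hypothetical placeholders. The code only assembles statement strings; running them would require a Snowflake session, for example through SnowSQL.

```python
def fundamentals_statements(db="demo_db", schema="raw",
                            table="events", stage="events_stage"):
    """Return Snowflake SQL statements, in order, for a basic load/unload round trip.

    All object names and the file path are hypothetical placeholders.
    """
    fq_schema = f"{db}.{schema}"
    fq_table = f"{fq_schema}.{table}"
    fq_stage = f"{fq_schema}.{stage}"
    return [
        f"CREATE DATABASE IF NOT EXISTS {db}",
        f"CREATE SCHEMA IF NOT EXISTS {fq_schema}",
        f"CREATE TABLE IF NOT EXISTS {fq_table} (id INTEGER, payload VARCHAR)",
        # A named internal stage that holds files inside Snowflake.
        f"CREATE STAGE IF NOT EXISTS {fq_stage}",
        # PUT uploads a local file to the internal stage (run from a client
        # such as SnowSQL).
        f"PUT file:///tmp/events.csv @{fq_stage}",
        # Loading: COPY INTO <table> reads staged files into the table.
        f"COPY INTO {fq_table} FROM @{fq_stage} "
        f"FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)",
        # Unloading: COPY INTO <location> writes table rows back to the stage.
        f"COPY INTO @{fq_stage}/unload/ FROM {fq_table} "
        f"FILE_FORMAT = (TYPE = 'CSV')",
    ]


if __name__ == "__main__":
    for statement in fundamentals_statements():
        print(statement + ";")
```

Printing the statements gives a script that could be pasted into a worksheet or a SnowSQL session after replacing the placeholder names.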
Break (15 minutes)
Snowflake Security (15 minutes)
- Discuss the compliance validations (SOC 1 and SOC 2, HIPAA, PCI DSS)
- Access control (RBAC, DAC)
- Network security (institute a network policy and discuss private links)
- Data at rest (discuss encryption and rekeying)
- MFA (and SCIM)
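To make the access-control and network-policy topics above more concrete, here is a minimal sketch that assembles role-based grants and a network policy as Snowflake SQL strings. It is an illustration under assumptions, not the course's own setup: the role, user, warehouse, database, schema, and policy names are hypothetical, and the CIDR range is a reserved documentation range.

```python
def security_statements(role="analyst", user="jane_doe",
                        warehouse="analytics_wh", db="demo_db",
                        schema="modeled", policy="office_only",
                        allowed_cidrs=("203.0.113.0/24",)):
    """Return Snowflake SQL for a read-only role plus a network policy.

    Every name and the CIDR range are hypothetical placeholders.
    """
    ip_list = ", ".join(f"'{cidr}'" for cidr in allowed_cidrs)
    return [
        # Role-based access control: privileges go to a role, and the role
        # goes to users.
        f"CREATE ROLE IF NOT EXISTS {role}",
        f"GRANT USAGE ON WAREHOUSE {warehouse} TO ROLE {role}",
        f"GRANT USAGE ON DATABASE {db} TO ROLE {role}",
        f"GRANT USAGE ON SCHEMA {db}.{schema} TO ROLE {role}",
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {db}.{schema} TO ROLE {role}",
        f"GRANT ROLE {role} TO USER {user}",
        # Network policy: only the listed address ranges may connect.
        f"CREATE NETWORK POLICY {policy} ALLOWED_IP_LIST = ({ip_list})",
        # Activating a policy account-wide can lock out users outside the
        # allowed ranges, so check the list before running this one.
        f"ALTER ACCOUNT SET NETWORK_POLICY = {policy}",
    ]
```

Keeping grants on roles rather than on individual users means access can be changed in one place as people join or leave a team.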
Snowflake UI (15 minutes)
- Walk through new Snowflake UI
- Data discovery
- Sharable filters
- Sharable worksheets
Industry Best Practices / ‘Where to go from here’ (15 minutes)
- ELT vs ETL
- ‘Raw’ vs ‘modeled’ separation
- Production-ready system architecture
- Roles, policies, access control for sensitive data
- Data modeling and visualization
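One possible reading of the ELT and ‘raw’ vs ‘modeled’ topics above, combined with isolated compute, is sketched below: separate warehouses for loading and for transformation, separate databases for raw and modeled data, and a transformation that runs inside Snowflake after the data has landed. This is an assumption-laden sketch rather than the architecture the course teaches: all warehouse, database, table, and column names are hypothetical, and the source table `raw_db.public.events` is assumed to exist already.

```python
def warehouse_ddl(name, size="XSMALL", auto_suspend_seconds=60):
    """Build a CREATE WAREHOUSE statement that suspends itself when idle."""
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {name} "
        f"WITH WAREHOUSE_SIZE = '{size}' "
        f"AUTO_SUSPEND = {auto_suspend_seconds} AUTO_RESUME = TRUE"
    )


def elt_layout_statements():
    """Return Snowflake SQL for a raw/modeled layout with isolated compute.

    All object names are hypothetical placeholders.
    """
    return [
        # Separate warehouses keep loading and transformation workloads
        # from competing for the same compute.
        warehouse_ddl("loading_wh"),
        warehouse_ddl("transform_wh", size="SMALL"),
        # Raw data lands untouched; modeled data is derived from it.
        "CREATE DATABASE IF NOT EXISTS raw_db",
        "CREATE DATABASE IF NOT EXISTS modeled_db",
        # ELT: the transform step runs inside the warehouse, after loading.
        "USE WAREHOUSE transform_wh",
        "CREATE OR REPLACE TABLE modeled_db.public.daily_events AS "
        "SELECT event_date, COUNT(*) AS event_count "
        "FROM raw_db.public.events GROUP BY event_date",
    ]
```

Because the raw tables are never modified, a modeled table like this one can be rebuilt at any time when the transformation logic changes.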