Fundamentals of Data Engineering: Plan and Build Robust Data Systems (Chinese edition: 数据工程之道:设计和构建健壮的数据系统)

by Joe Reis, Matt Housley
February 2024
Intermediate to advanced
370 pages
7h
Chinese
China Machine Press
Content preview from Fundamentals of Data Engineering: Plan and Build Robust Data Systems
1.2 Data Engineering Skills and Activities
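A data engineer's skill set encompasses the "undercurrents" of data engineering: security, data management, DataOps, data architecture, and software engineering. It requires understanding how to evaluate data tools and how they fit together across the data engineering lifecycle. It is also important to know how data is generated in source systems, and how analysts and data scientists consume it and create value from it once it has been processed and curated. Finally, a data engineer juggles many complex moving parts and must constantly optimize along the axes of cost, agility, scalability, simplicity, reuse, and interoperability (see Figure 1-7). We cover these topics in more detail in the chapters that follow.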
Figure 1-7: The balancing act of data engineering (axes: cost, agility, scalability, simplicity, reuse, interoperability)
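As we discussed, until recently a data engineer was expected to know and understand how to use a small handful of powerful, monolithic technologies (Hadoop, Spark, Teradata, Hive, and others) to build a data solution. Using these technologies often required deep knowledge of software engineering, networking, distributed computing, storage, or other low-level details. The engineer's work was devoted to cluster administration and maintenance, managing overhead, and writing pipeline and transformation jobs, among other tasks.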
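Today, the data tooling landscape is far less complicated to manage and deploy. Modern data tools considerably abstract and simplify workflows. As a result, data engineers now focus on balancing the simplest, most cost-effective, best-of-breed services that deliver value to the business. Data engineers are also expected to create agile data architectures that evolve as new trends emerge.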
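What does a data engineer not do? A data engineer typically does not directly build ML models, create reports or dashboards, perform data analysis, build key performance indicators (KPIs), or develop software applications. A data engineer should nonetheless understand these areas well in order to better serve stakeholders. ...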

Publisher Resources

ISBN: 9787111745273