Foreword by Rick Sears
Data has become a central part of building modern software applications and growing modern data-driven organizations. Data engineers, data administrators, data analysts, and data scientists are among the individuals in these organizations who want to make more use of their data. Many of these data practitioners choose to build their data-driven applications on Amazon Web Services (AWS), often choosing to store their data in a data lake based on Amazon Simple Storage Service (S3).
These customers may want to change and manipulate their data over time while continuing to use it as it changes, and therefore build their applications on transactional data lake technologies. Apache Iceberg is a key technology used by AWS customers building transactional data lakes because it is fast, efficient, and reliable at scale. It also offers simple integrations with popular data processing frameworks running on AWS, such as Apache Spark, Apache Flink, Apache Hive, Presto, Trino, Dremio, and more, and is supported by AWS services such as Amazon EMR, Amazon Redshift, Amazon Athena, AWS Glue, and others.
Apache Iceberg: The Definitive Guide focuses on practical applications and scenarios useful for data practitioners using Apache Iceberg. Its hands-on exercises include using Iceberg with key AWS technologies, such as Amazon EMR and AWS Glue, which support Iceberg-specific optimizations that make it simple to build and scale applications ...