AWS Cookbook

by John Culkin, Mike Zazon

December 2021

Intermediate to advanced

358 pages

7h 41m

English

O'Reilly Media, Inc.

Foreword
Preface
  Who This Book Is For
  What You Will Learn
  The Recipes
  What You Will Need
  Getting Started
  Conventions Used in This Book
  Using Code Examples
  O’Reilly Online Learning
  How to Contact Us
  Acknowledgments
1. Security
  1.0. Introduction
  1.1. Creating and Assuming an IAM Role for Developer Access
  1.2. Generating a Least Privilege IAM Policy Based on Access Patterns
  1.3. Enforcing IAM User Password Policies in Your AWS Account
  1.4. Testing IAM Policies with the IAM Policy Simulator
  1.5. Delegating IAM Administrative Capabilities Using Permissions Boundaries
  1.6. Connecting to EC2 Instances Using AWS SSM Session Manager
  1.7. Encrypting EBS Volumes Using KMS Keys
  1.8. Storing, Encrypting, and Accessing Passwords Using Secrets Manager
  1.9. Blocking Public Access for an S3 Bucket
  1.10. Serving Web Content Securely from S3 with CloudFront
2. Networking
  2.0. Introduction
  2.1. Defining Your Private Virtual Network in the Cloud by Creating an Amazon VPC
  2.2. Creating a Network Tier with Subnets and a Route Table in a VPC
  2.3. Connecting Your VPC to the Internet Using an Internet Gateway
  2.4. Using a NAT Gateway for Outbound Internet Access from Private Subnets
  2.5. Granting Dynamic Access by Referencing Security Groups
  2.6. Using VPC Reachability Analyzer to Verify and Troubleshoot Network Paths
  2.7. Redirecting HTTP Traffic to HTTPS with an Application Load Balancer
  2.8. Simplifying Management of CIDRs in Security Groups with Prefix Lists
  2.9. Controlling Network Access to S3 from Your VPC Using VPC Endpoints
  2.10. Enabling Transitive Cross-VPC Connections Using Transit Gateway
  2.11. Peering Two VPCs Together for Inter-VPC Network Communication
3. Storage
  3.0. Introduction
  3.1. Using S3 Lifecycle Policies to Reduce Storage Costs
  3.2. Using S3 Intelligent-Tiering Archive Policies to Automatically Archive S3 Objects
  3.3. Replicating S3 Buckets to Meet Recovery Point Objectives
  3.4. Observing S3 Storage and Access Metrics Using Storage Lens
  3.5. Configuring Application-Specific Access to S3 Buckets with S3 Access Points
  3.6. Using Amazon S3 Bucket Keys with KMS to Encrypt Objects
  3.7. Creating and Restoring EC2 Backups to Another Region Using AWS Backup
  3.8. Restoring a File from an EBS Snapshot
  3.9. Replicating Data Between EFS and S3 with DataSync
4. Databases
  4.0. Introduction
  4.1. Creating an Amazon Aurora Serverless PostgreSQL Database
  4.2. Using IAM Authentication with an RDS Database
  4.3. Leveraging RDS Proxy for Database Connections from Lambda
  4.4. Encrypting the Storage of an Existing Amazon RDS for MySQL Database
  4.5. Automating Password Rotation for RDS Databases
  4.6. Autoscaling DynamoDB Table Provisioned Capacity
  4.7. Migrating Databases to Amazon RDS Using AWS DMS
  4.8. Enabling REST Access to Aurora Serverless Using RDS Data API
5. Serverless
  5.0. Introduction
  5.1. Configuring an ALB to Invoke a Lambda Function
  5.2. Packaging Libraries with Lambda Layers
  5.3. Invoking Lambda Functions on a Schedule
  5.4. Configuring a Lambda Function to Access an EFS File System
  5.5. Running Trusted Code in Lambda Using AWS Signer
  5.6. Packaging Lambda Code in a Container Image
  5.7. Automating CSV Import into DynamoDB from S3 with Lambda
  5.8. Reducing Lambda Startup Times with Provisioned Concurrency
  5.9. Accessing VPC Resources with Lambda
6. Containers
  6.0. Introduction
  6.1. Building, Tagging, and Pushing a Container Image to Amazon ECR
  6.2. Scanning Images for Security Vulnerabilities on Push to Amazon ECR
  6.3. Deploying a Container Using Amazon Lightsail
  6.4. Deploying Containers Using AWS Copilot
  6.5. Updating Containers with Blue/Green Deployments
  6.6. Autoscaling Container Workloads on Amazon ECS
  6.7. Launching a Fargate Container Task in Response to an Event
  6.8. Capturing Logs from Containers Running on Amazon ECS
7. Big Data
  7.0. Introduction
  7.1. Using a Kinesis Stream for Ingestion of Streaming Data
  7.2. Streaming Data to Amazon S3 Using Amazon Kinesis Data Firehose
  7.3. Automatically Discovering Metadata with AWS Glue Crawlers
  7.4. Querying Files on S3 Using Amazon Athena
  7.5. Transforming Data with AWS Glue DataBrew
8. AI/ML
  8.0. Introduction
  8.1. Transcribing a Podcast
  8.2. Converting Text to Speech
  8.3. Computer Vision Analysis of Form Data
  8.4. Redacting PII from Text Using Comprehend
  8.5. Detecting Text in a Video
  8.6. Physician Dictation Analysis Using Amazon Transcribe Medical and Comprehend Medical
  8.7. Determining Location of Text in an Image

9. Account Management
  9.0. Introduction
  9.1. Using EC2 Global View for Account Resource Analysis
  9.2. Modifying Tags for Many Resources at One Time with Tag Editor
  9.3. Enabling CloudTrail Logging for Your AWS Account
  9.4. Setting Up Email Alerts for Root Login
  9.5. Setting Up Multi-Factor Authentication for a Root User
  9.6. Setting Up AWS Organizations and AWS Single Sign-On
Fast Fixes
Index
About the Authors

Content preview from AWS Cookbook

Chapter 7. Big Data

7.0 Introduction

Data is sometimes referred to as “the new gold.” Many companies are leveraging data in new and exciting ways every day as available data science tools continue to improve. You can now mine troves of historical data quickly for insights and patterns by using modern analytics tools. You might not yet know the queries and analysis you need to run against the data, but tomorrow you might be faced with a challenge that could be supported by historical data analysis using new and emerging techniques. With the advent of cheaper data storage, many organizations and individuals opt to keep data rather than discard it so that they can run historical analysis to gain business insights, discover trends, train AI/ML models, and be ready to implement future technologies that can use the data.

In addition to the growing amount of data you collect over time, you are also collecting a wider variety of data types and structures at an ever-increasing velocity. Imagine that you deploy IoT devices to collect sensor data; as you deploy more of them over time, you need a scalable way to capture and store the data. That data can be structured, semistructured, or unstructured, with schemas that might be difficult to predict as new data sources are ingested. You need tools that can transform and analyze this diverse data.

An informative and succinct AWS re:Invent 2020 presentation by Francis Jayakumar, “An Introduction to Data Lakes and ...

Publisher Resources

ISBN: 9781492092599
