Chapter 2. Data Storage and Ingestion

To figure out how to set up an ML model to solve a problem, you have to start thinking about patterns in your data's structure. In this chapter, we'll look at some general patterns in storage, data formats, and data ingestion. Typically, once you understand your business problem and frame it as a data science problem, you have to think about how to get the data into a format or structure that your model training process can use. Data ingestion during training is fundamentally a data transformation pipeline. Without this transformation, you won't be able to deliver and serve the model in an enterprise-driven or use-case-driven setting; it would remain nothing more than an exploration tool, unable to scale to large amounts of data.

This chapter will show you how to design a data ingestion pipeline for two common data structures: tables and images. You will learn how to make the pipeline scalable by using TensorFlow’s APIs.
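As a preview, here is a minimal sketch of what such a pipeline can look like for each data structure, using the tf.data and Keras utility APIs. The file name train.csv, the label column name, and the images/ directory layout are assumptions made for illustration, not part of any particular dataset:

```python
import tensorflow as tf

# Hypothetical layout: a CSV file with a "label" column, and an
# images/ directory with one subdirectory per class.

# Tabular data: stream batches of rows directly from a CSV file.
csv_ds = tf.data.experimental.make_csv_dataset(
    "train.csv",          # assumed file name
    batch_size=32,
    label_name="label",   # assumed label column
    num_epochs=1,
)

# Image data: stream batches of decoded, resized images.
img_ds = tf.keras.utils.image_dataset_from_directory(
    "images/",            # assumed directory layout
    batch_size=32,
    image_size=(224, 224),
)

# Each dataset yields (features, labels) batches that a model
# can consume directly, without loading everything into memory.
for features, labels in csv_ds.take(1):
    print(labels.shape)   # (32,)
```

Both APIs return a tf.data.Dataset, which is what makes the pipeline scalable: batching, shuffling, and prefetching are handled by the same abstraction regardless of the underlying data structure.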

Data streaming is the means by which data is ingested in small batches for model training. Data streaming in Python is not a new concept, but grasping it is fundamental to understanding how the more advanced APIs in TensorFlow work, so this chapter will start with Python generators (see the sketch below). Then we'll look at how tabular data is stored, including how to indicate and track features and labels. We'll then move to designing your data structure, and finish by discussing how to ingest data into your ...
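To make the generator idea concrete, here is a minimal sketch: a plain Python generator that yields one example at a time, wrapped into a tf.data.Dataset so TensorFlow can batch and stream it. The data here is synthetic, invented purely for illustration:

```python
import numpy as np
import tensorflow as tf

def example_generator(n_examples=100):
    """A plain Python generator: yields one (features, label) pair
    at a time instead of materializing the whole dataset in memory."""
    for _ in range(n_examples):
        features = np.random.rand(4).astype("float32")  # synthetic features
        label = np.random.randint(0, 2)                 # synthetic binary label
        yield features, label

# Wrap the generator so TensorFlow can stream, batch, and prefetch it.
dataset = tf.data.Dataset.from_generator(
    example_generator,
    output_signature=(
        tf.TensorSpec(shape=(4,), dtype=tf.float32),
        tf.TensorSpec(shape=(), dtype=tf.int64),
    ),
)

# The model never sees the whole dataset at once, only small batches.
for features, labels in dataset.batch(16).take(1):
    print(features.shape, labels.shape)  # (16, 4) (16,)
```

This is exactly the pattern the higher-level TensorFlow ingestion APIs build on: a source that produces elements one at a time, composed with batching and prefetching operations.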
