High Performance MySQL, 2nd Edition

ISBN: 9780596101718

by Jeremy D. Zawodny, Derek J. Balling, Baron Schwartz, Peter Zaitsev, Arjen Lentz, Vadim Tkachenko

June 2008

Intermediate to advanced

708 pages

25h 18m

English

O'Reilly Media, Inc.

High Performance MySQL, 2nd Edition
Foreword
Preface
How This Book Is Organized · A Broad Overview · Building a Solid Foundation · Tuning Your Application · Scaling Upward After Making Changes · Making Your Application Reliable · Miscellaneous Useful Topics · Software Versions and Availability · Conventions Used in This Book · Using Code Examples · Safari® Books Online · How to Contact Us · Acknowledgments for the Second Edition · From Baron · From Peter · From Vadim · From Arjen · Acknowledgments for the First Edition · From Jeremy · From Derek
1. MySQL Architecture
MySQL’s Logical Architecture · Connection Management and Security · Optimization and Execution · Concurrency Control · Read/Write Locks · Lock Granularity · Table locks · Row locks · Transactions · Isolation Levels · Deadlocks · Transaction Logging · Transactions in MySQL · AUTOCOMMIT · Mixing storage engines in transactions · Implicit and explicit locking · Multiversion Concurrency Control · MySQL’s Storage Engines · The MyISAM Engine · Storage · MyISAM features · Compressed MyISAM tables · The MyISAM Merge Engine · The InnoDB Engine · The Memory Engine · The Archive Engine · The CSV Engine · The Federated Engine · The Blackhole Engine · The NDB Cluster Engine · The Falcon Engine · The solidDB Engine · The PBXT (Primebase XT) Engine · The Maria Storage Engine · Other Storage Engines · Selecting the Right Engine · Considerations · Practical Examples · Logging · Read-only or read-mostly tables · Order processing · Stock quotes · Bulletin boards and threaded discussion forums · CD-ROM applications · Storage Engine Summary · Table Conversions · ALTER TABLE · Dump and import · CREATE and SELECT
2. Finding Bottlenecks: Benchmarking and Profiling
Why Benchmark? · Benchmarking Strategies · What to Measure · Benchmarking Tactics · Designing and Planning a Benchmark · Getting Accurate Results · Running the Benchmark and Analyzing Results · Benchmarking Tools · Full-Stack Tools · Single-Component Tools · Benchmarking Examples · http_load · sysbench · The sysbench CPU benchmark · The sysbench file I/O benchmark · The sysbench OLTP benchmark · Other sysbench features · dbt2 TPC-C on the Database Test Suite · MySQL Benchmark Suite · Profiling · Profiling an Application · How and what to measure · A PHP profiling example · MySQL Profiling · Logging queries · Finer control over logging · How to read the slow query log · Log analysis tools · Profiling a MySQL Server · Profiling Queries with SHOW STATUS · SHOW PROFILE · Other Ways to Profile MySQL · When You Can’t Add Profiling Code · Operating System Profiling · Troubleshooting MySQL Connections and Processes · Advanced Profiling and Troubleshooting
3. Schema Optimization and Indexing
Choosing Optimal Data Types · Whole Numbers · Real Numbers · String Types · VARCHAR and CHAR types · BLOB and TEXT types · Using ENUM instead of a string type · Date and Time Types · Bit-Packed Data Types · Choosing Identifiers · Special Types of Data · Indexing Basics · Types of Indexes · B-Tree indexes · Hash indexes · Spatial (R-Tree) indexes · Full-text indexes · Indexing Strategies for High Performance · Isolate the Column · Prefix Indexes and Index Selectivity · Clustered Indexes · Comparison of InnoDB and MyISAM data layout · Inserting rows in primary key order with InnoDB · Covering Indexes · Using Index Scans for Sorts · Packed (Prefix-Compressed) Indexes · Redundant and Duplicate Indexes · Indexes and Locking · An Indexing Case Study · Supporting Many Kinds of Filtering · Avoiding Multiple Range Conditions · Optimizing Sorts · Index and Table Maintenance · Finding and Repairing Table Corruption · Updating Index Statistics · Reducing Index and Data Fragmentation · Normalization and Denormalization · Pros and Cons of a Normalized Schema · Pros and Cons of a Denormalized Schema · A Mixture of Normalized and Denormalized · Cache and Summary Tables · Counter tables · Speeding Up ALTER TABLE · Modifying Only the .frm File · Building MyISAM Indexes Quickly · Notes on Storage Engines · The MyISAM Storage Engine · The Memory Storage Engine · The InnoDB Storage Engine
4. Query Performance Optimization
Slow Query Basics: Optimize Data Access · Are You Asking the Database for Data You Don’t Need? · Is MySQL Examining Too Much Data? · Execution time · Rows examined and rows returned · Rows examined and access types · Ways to Restructure Queries · Complex Queries Versus Many Queries · Chopping Up a Query · Join Decomposition · Query Execution Basics · The MySQL Client/Server Protocol · Query states · The Query Cache · The Query Optimization Process · The parser and the preprocessor · The query optimizer · Table and index statistics · MySQL’s join execution strategy · The execution plan · The join optimizer · Sort optimizations · The Query Execution Engine · Returning Results to the Client · Limitations of the MySQL Query Optimizer · Correlated Subqueries · When a correlated subquery is good · UNION limitations · Index merge optimizations · Equality propagation · Parallel execution · Hash joins · Loose index scans · MIN() and MAX() · SELECT and UPDATE on the same table · Optimizing Specific Types of Queries · Optimizing COUNT() Queries · What COUNT() does · Myths about MyISAM · Simple optimizations · More complex optimizations · Optimizing JOIN Queries · Optimizing Subqueries · Optimizing GROUP BY and DISTINCT · Optimizing GROUP BY WITH ROLLUP · Optimizing LIMIT and OFFSET · Optimizing SQL_CALC_FOUND_ROWS · Optimizing UNION · Query Optimizer Hints · User-Defined Variables · Be Careful with MySQL Upgrades
5. Advanced MySQL Features
The MySQL Query Cache · How MySQL Checks for a Cache Hit · How the Cache Uses Memory · When the Query Cache Is Helpful · How to Tune and Maintain the Query Cache · Reducing fragmentation · Improving query cache usage · InnoDB and the Query Cache · General Query Cache Optimizations · Alternatives to the Query Cache · Storing Code Inside MySQL · Stored Procedures and Functions · Triggers · Events · Preserving Comments in Stored Code · Cursors · Prepared Statements · Prepared Statement Optimization · The SQL Interface to Prepared Statements · Limitations of Prepared Statements · User-Defined Functions · Views · Updatable Views · Performance Implications of Views · Limitations of Views · Character Sets and Collations · How MySQL Uses Character Sets · Defaults for creating objects · Settings for client/server communication · How MySQL compares values · Special-case behaviors · Choosing a Character Set and Collation · How Character Sets and Collations Affect Queries · Full-Text Searching · Natural-Language Full-Text Searches · Boolean Full-Text Searches · Full-Text Changes in MySQL 5.1 and Beyond · Full-Text Tradeoffs and Workarounds · Full-Text Tuning and Optimization · Foreign Key Constraints · Merge Tables and Partitioning · Merge Tables · Merge table performance impacts · Merge table strengths · Partitioned Tables · Why partitioning works · Partitioning examples · Partitioned table limitations · Optimizing queries against partitioned tables · Distributed (XA) Transactions · Internal XA Transactions · External XA Transactions
6. Optimizing Server Settings
Configuration Basics · Syntax, Scope, and Dynamism · Side Effects of Setting Variables · Getting Started · General Tuning · Tuning Memory Usage · How much memory can MySQL use? · Per-connection memory needs · Reserving memory for the operating system · Allocating memory for caches · The MyISAM Key Cache · The MyISAM key block size · The InnoDB Buffer Pool · The Thread Cache · The Table Cache · The InnoDB Data Dictionary · Tuning MySQL’s I/O Behavior · MyISAM I/O Tuning · InnoDB I/O Tuning · The InnoDB transaction log · How InnoDB opens and flushes log and data files · The InnoDB tablespace · The doublewrite buffer · Other I/O tuning options · Tuning MySQL Concurrency · MyISAM Concurrency Tuning · InnoDB Concurrency Tuning · Workload-Based Tuning · Optimizing for BLOB and TEXT Workloads · Optimizing for Filesorts · Inspecting MySQL Server Status Variables · Tuning Per-Connection Settings
7. Operating System and Hardware Optimization
What Limits MySQL’s Performance? · How to Select CPUs for MySQL · Which Is Better: Fast CPUs or Many CPUs? · CPU Architecture · Scaling to Many CPUs and Cores · Balancing Memory and Disk Resources · Random Versus Sequential I/O · Caching, Reads, and Writes · What’s Your Working Set? · The working set and the cache unit · Finding an Effective Memory-to-Disk Ratio · Choosing Hard Disks · Choosing Hardware for a Slave · RAID Performance Optimization · RAID Failure, Recovery, and Monitoring · Balancing Hardware RAID and Software RAID · RAID Configuration and Caching · The RAID stripe chunk size · The RAID cache · Storage Area Networks and Network-Attached Storage · Storage Area Networks · Network-Attached Storage · Using Multiple Disk Volumes · Network Configuration · Choosing an Operating System · Choosing a Filesystem · Threading · Swapping · Operating System Status · How to Read vmstat Output · How to Read iostat Output · A CPU-Bound Machine · An I/O-Bound Machine · A Swapping Machine · An Idle Machine

8. Replication
Replication Overview · Problems Solved by Replication · How Replication Works · Setting Up Replication · Creating Replication Accounts · Configuring the Master and Slave · Starting the Slave · Initializing a Slave from Another Server · Recommended Replication Configuration · Replication Under the Hood · Statement-Based Replication · Row-Based Replication · Replication Files · Sending Replication Events to Other Slaves · Replication Filters · Replication Topologies · Master and Multiple Slaves · Master-Master in Active-Active Mode · Master-Master in Active-Passive Mode · Master-Master with Slaves · Ring · Master, Distribution Master, and Slaves · Tree or Pyramid · Custom Replication Solutions · Selective replication · Separating functions · Data archiving · Using slaves for full-text searches · Read-only slaves · Emulating multimaster replication · Creating a log server · Replication and Capacity Planning · Why Replication Doesn’t Help Scale Writes · Plan to Underutilize · Replication Administration and Maintenance · Monitoring Replication · Measuring Slave Lag · Determining Whether Slaves Are Consistent with the Master · Resyncing a Slave from the Master · Changing Masters · Planned promotions · Unplanned promotions · Locating the desired log positions · Switching Roles in a Master-Master Configuration · Replication Problems and Solutions · Errors Caused by Data Corruption or Loss · Using Nontransactional Tables · Mixing Transactional and Nontransactional Tables · Nondeterministic Statements · Different Storage Engines on the Master and Slave · Data Changes on the Slave · Nonunique Server IDs · Undefined Server IDs · Dependencies on Nonreplicated Data · Missing Temporary Tables · Not Replicating All Updates · Lock Contention Caused by InnoDB Locking Selects · Writing to Both Masters in Master-Master Replication · Excessive Replication Lag · Don’t duplicate the expensive part of writes · Do writes in parallel outside of replication · Prime the cache for the slave thread · Oversized Packets from the Master · Limited Replication Bandwidth · No Disk Space · Replication Limitations · How Fast Is Replication? · The Future of MySQL Replication
9. Scaling and High Availability
Terminology · Scaling MySQL · Planning for Scalability · Buying Time Before Scaling · Scaling Up · Scaling Out · Functional partitioning · Data sharding · Choosing a partitioning key · Querying across shards · Allocating data, shards, and nodes · Fixed allocation · Dynamic allocation · Explicit allocation · Rebalancing shards · Generating globally unique IDs · Tools for sharding · Scaling Back · Keeping active data separate · Scaling by Clustering · Clustering · Federation · Load Balancing · Connecting Directly · Splitting reads and writes in replication · Changing the application configuration · Changing DNS names · Moving IP addresses · Introducing a Middleman · Load balancers · Load-balancing algorithms · Adding and removing servers in the pool · Load Balancing with a Master and Multiple Slaves · High Availability · Planning for High Availability · Adding Redundancy · Shared-storage architectures · Replicated-disk architectures · Synchronous MySQL replication · Failover and Failback · Promoting a slave or switching roles · Virtual IP addresses or IP takeover · The MySQL Master-Master Replication Manager · Middleman solutions · Handling failover in the application
10. Application-Level Optimization
Application Performance Overview · Find the Source of the Problem · Look for Common Problems · Web Server Issues · Finding the Optimal Concurrency · Caching · Caching Below the Application · Application-Level Caching · Cache Control Policies · Cache Object Hierarchies · Pregenerating Content · Extending MySQL · Alternatives to MySQL
11. Backup and Recovery
Overview · Terminology · It’s All About Recovery · Topics We Won’t Cover · The Big Picture · Why Backups? · Considerations and Tradeoffs · What Can You Afford to Lose? · Online or Offline Backups? · Logical or Raw Backups? · Logical backups · Raw backups · What to Back Up · Incremental backups · Storage Engines and Consistency · Data consistency · File consistency · Replication · Managing and Backing Up Binary Logs · The Binary Log Format · Purging Old Binary Logs Safely · Backing Up Data · Making a Logical Backup · SQL dumps · Delimited file backups · Parallel dump and restore · Filesystem Snapshots · How LVM snapshots work · Prerequisites and configuration · Creating, mounting, and removing an LVM snapshot · LVM snapshots for online backups · Lock-free InnoDB backups with LVM snapshots · Planning for LVM backups · Other uses and alternatives · Recovering from a Backup · Limiting Access to MySQL · Restoring Raw Files · Starting MySQL after restoring raw files · Restoring Logical Backups · Loading SQL files · Loading delimited files · Point-in-Time Recovery · More Advanced Recovery Techniques · Delayed replication for fast recovery · Recovering with a log server · InnoDB Recovery · Causes of InnoDB corruption · How to recover corrupted InnoDB data · Backup and Recovery Speed · Backup Tools · mysqldump · mysqlhotcopy · InnoDB Hot Backup · mk-parallel-dump · mylvmbackup · Zmanda Recovery Manager · Installing and testing ZRM · R1Soft · MySQL Online Backup · Comparison of Backup Tools · Scripting Backups
12. Security
Terminology · Account Basics · Privileges · The Grant Tables · How MySQL Checks Privileges · Adding, Removing, and Viewing Grants · Setting Up MySQL Privileges · Privilege Changes in MySQL 4.1 · Privilege Changes in MySQL 5.0 · Stored routines · Triggers · Views · Privileges on the INFORMATION_SCHEMA tables · Privileges and Performance · Common Problems and Solutions · Connecting through localhost versus 127.0.0.1 · Using temporary tables safely · Disallowing passwordless access · Disabling anonymous users · Remember to quote hostnames separately · Don’t reuse usernames · Granting SELECT allows SHOW CREATE TABLE · Don’t grant privileges on the mysql database · Don’t grant the SUPER privilege freely · Granting privileges on wildcarded databases · Revoking specific privileges · Users can connect even after REVOKE · When you can’t grant or revoke a privilege · Invisible privileges · Obsolete privileges · Operating System Security · Guidelines · Network Security · Localhost-Only Connections · Firewalling · No default route · MySQL in a DMZ · Connection Encryption and Tunneling · Virtual private networks · SSL in MySQL · SSH tunneling · TCP Wrappers · Automatic Host Blocking · Data Encryption · Hashing Passwords · Encrypted Filesystems · Application-Level Encryption · Design issues · Encrypting and decrypting inside MySQL · Source Code Modification · MySQL in a chrooted Environment
13. MySQL Server Status
System Variables · SHOW STATUS · Thread and Connection Statistics · Binary Logging Status · Command Counters · Temporary Files and Tables · Handler Operations · MyISAM Key Buffer · File Descriptors · Query Cache · SELECT Types · Sorts · Table Locking · Secure Sockets Layer (SSL) · InnoDB-Specific · Plug-in-Specific · Miscellaneous · SHOW INNODB STATUS · Header · SEMAPHORES · LATEST FOREIGN KEY ERROR · LATEST DETECTED DEADLOCK · TRANSACTIONS · FILE I/O · INSERT BUFFER AND ADAPTIVE HASH INDEX · LOG · BUFFER POOL AND MEMORY · ROW OPERATIONS · SHOW PROCESSLIST · SHOW MUTEX STATUS · Replication Status · INFORMATION_SCHEMA
14. Tools for High Performance
Interface Tools · MySQL Visual Tools · SQLyog · phpMyAdmin · Monitoring Tools · Noninteractive Monitoring Systems · Homegrown systems · Nagios · Alternatives to Nagios · MySQL Monitoring and Advisory Service · MONyog · RRDTool-based systems · Interactive Tools · innotop · Analysis Tools · HackMySQL Tools · Maatkit Analysis Tools · MySQL Utilities · MySQL Proxy · Dormando’s Proxy for MySQL · Maatkit Utilities · Sources of Further Information
A. Transferring Large Files
Copying Files · A Naive Example · A One-Step Method · Avoiding Encryption Overhead · Other Options · File Copy Benchmarks
B. Using EXPLAIN
Invoking EXPLAIN · Rewriting Non-SELECT Queries · The Columns in EXPLAIN · The id Column · The select_type Column · The table Column · Derived tables and unions · An example of complex SELECT types · The type Column · The possible_keys Column · The key Column · The key_len Column · The ref Column · The rows Column · The filtered Column · The Extra Column · Visual EXPLAIN
C. Using Sphinx with MySQL
Overview: A Typical Sphinx Search · Why Use Sphinx? · Efficient and Scalable Full-Text Searching · Applying WHERE Clauses Efficiently · Finding the Top Results in Order · Optimizing GROUP BY Queries · Generating Parallel Result Sets · Scaling · Aggregating Sharded Data · Architectural Overview · Installation Overview · Typical Partition Use · Special Features · Phrase Proximity Ranking · Support for Attributes · Filtering · The SphinxSE Pluggable Storage Engine · Advanced Performance Control · Practical Implementation Examples · Full-Text Searching on Mininova.org · Full-Text Searching on BoardReader.com · Optimizing Selects on Sahibinden.com · Optimizing GROUP BY on BoardReader.com · Optimizing Sharded JOIN Queries on Grouply.com · Conclusion
D. Debugging Locks
Lock Waits at the Server Level · Table Locks · Finding out who holds a lock · The Global Read Lock · Name Locks · User Locks · Lock Waits in Storage Engines · InnoDB Lock Waits · Toward more usable lock output · Falcon Lock Waits
Index
About the Authors
Colophon
Copyright

Content preview from High Performance MySQL, 2nd Edition

Chapter 1. MySQL Architecture

MySQL’s architecture is very different from that of other database servers, and makes it useful for a wide range of purposes. MySQL is not perfect, but it is flexible enough to work well in very demanding environments, such as web applications. At the same time, MySQL can power embedded applications, data warehouses, content indexing and delivery software, highly available redundant systems, online transaction processing (OLTP), and much more.

To get the most from MySQL, you need to understand its design so that you can work with it, not against it. MySQL is flexible in many ways. For example, you can configure it to run well on a wide range of hardware, and it supports a variety of data types. However, MySQL’s most unusual and important feature is its storage-engine architecture, whose design separates query processing and other server tasks from data storage and retrieval. In MySQL 5.1, you can even load storage engines as runtime plug-ins. This separation of concerns lets you choose, on a per-table basis, how your data is stored and what performance, features, and other characteristics you want.
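As a rough illustration of the per-table engine choice described above, the sketch below builds two `CREATE TABLE` statements that differ in their `ENGINE` clause. The helper `create_table_sql`, the table names, and the column definitions are hypothetical and not taken from the book. Only the `ENGINE=` table option and the engine names InnoDB and MyISAM are standard MySQL.

```python
# Minimal sketch: generate MySQL DDL that selects a storage engine per table.
# The helper, table names, and columns are illustrative assumptions.

def create_table_sql(table, columns, engine):
    """Return a CREATE TABLE statement using the given storage engine."""
    cols = ", ".join(f"{name} {sqltype}" for name, sqltype in columns)
    return f"CREATE TABLE {table} ({cols}) ENGINE={engine};"

# A transactional table, here assigned to InnoDB.
orders_sql = create_table_sql(
    "orders",
    [("id", "INT PRIMARY KEY"), ("total", "DECIMAL(10,2)")],
    "InnoDB",
)

# An append-mostly log table, here assigned to MyISAM.
access_log_sql = create_table_sql(
    "access_log",
    [("ts", "DATETIME"), ("url", "VARCHAR(255)")],
    "MyISAM",
)

print(orders_sql)
print(access_log_sql)
```

Both tables can live in the same database and be queried the same way; only the storage and retrieval layer beneath each one differs.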

This chapter provides a high-level overview of the MySQL server architecture, the major differences between the storage engines, and why those differences are important. We’ve tried to explain MySQL by simplifying the details and showing examples. This discussion will be useful for those new to database servers as well as readers who are experts with other ...
