Errata

Database Internals

Errata for Database Internals

Submit your own errata for this product.

The errata list is a list of errors and their corrections that were found after the product was released. If the error was corrected in a later version or reprint the date of the correction will be displayed in the column titled "Date Corrected".

The following errata were submitted by our customers and approved as valid errors by the author or editor.

Color key: Serious technical mistake Minor technical mistake Language or formatting error Typo Question Note Update

Version Location Description Submitted By Date submitted Date corrected
PDF
Page xix
Penultimate paragraph

"reviewd" should be "reviewed".

Oleksandr Petrov
 
Nov 08, 2019  Jan 28, 2020
1
2

First: I read Safari book online and there is no page and the paragraph is like 67 or smth which make required fields `page` and `location` stupid.

Second: In the first chapter there is `Row-Oriented Data Layout` section with last sentence `The two pioneer open source column-oriented stores are [MonetDB and C-Store (C-Store is an open source predecessor to Vertica).`
This sentence is a bit confusing since there is the whole section about column oriented stores following.

Note from the Author or Editor:
Thank you, moved this paragraph to the general section to make it more clear.

Max  Aug 09, 2019  Sep 12, 2019
1
78

First chapter the section `Index Files`.
In the sentence `If the order of data records follows the search key order, this index it is called clustered (better known as clustering).` the pronoun `it` after index is redundant.

Max  Aug 09, 2019  Sep 12, 2019
1
99

Chapter 1 footnote 1: `Disk, memory access latency, and many other numbers have been represented and visualized over the years for comparison (see https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html) ... `

The link contains closing bracket, which leads to `https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html)` resource not found

Max  Aug 09, 2019  Sep 12, 2019
Printed
Page 14
4th paragraph

Sentence:
"Values belonging to the same row are stored closely together"
should be:
"Values belonging to the same column are stored closely together"

Note from the Author or Editor:
Column-Oriented Data Layout, p. 10. “Values belonging to the same row are stored closely together” should read as “Values belonging to the same column are stored closely together:”

Daniel Iwan  Aug 04, 2020  Sep 04, 2020
, Printed, PDF, ePub, Mobi, , Other Digital Version
Page 19
4th paragraph

The explanation of Figure 1-5 doesn't match the figure. It seems same as the explanation of Figure 1-6 on the next page.

Note from the Author or Editor:
Text before the Figure 1-5 should be:

* a) An index-organized table, where data records are stored directly in the index file.
* b) An index file storing the offsets and a separate file storing data records.

Liang He  Nov 24, 2019  Jan 28, 2020
Printed
Page 19
2nd complete paragraph

[MOLINA08] is referenced, but is not in the list of sources at the end of the book.

Note from the Author or Editor:
Some of the references appear to be not sorted alphabetically. Even though all of them are present, they are hard to find in a printed version. All references that were present in the wrong order are added below in alphabetical order:

[ALHOUMAILY10] Yousef J. Al-Houmaily. 2010. Atomic commit protocols, their integration, and their optimisations in distributed database systems. Int. J. Intell. Inf. Database Syst. 4, 4 (September 2010), 373-412.

[BAUDET19] "State Machine Replication in the Libra Blockchain". Mathieu Baudet, Avery Ching, Andrey Chursin, George Danezis, François Garillot, Zekun Li, Dahlia Malkhi, Oded Naor, Dmitri Perelman, Alberto Sonnino. 2019

[BUCHMAN18] "The latest gossip on BFT consensus". Ethan Buchman, Jae Kwon, Zarko Milosevic. 2018.

[CESATI05] "Understanding the Linux Kernel, 3rd Edition". Marco Cesati, Daniel P. Bovet. 2005.

[CLOCK06] "Implementing a Better Cache Replacement Algorithm in Apache Derby". Gokul Soundararajan. 2006.

[CORMODE11] Cormode, Graham and S. Muthukrishnan. “Approximating Data with the Count-Min Data Structure.” (2011).

[DECHEV10] Damian Dechev, Peter Pirkelbauer, and Bjarne Stroustrup. 2010. Understanding and Effectively Preventing the ABA Problem in Descriptor-Based Lock-Free Designs. In Proceedings of the 2010 13th IEEE International Symposium on Objec/Component/Service-Oriented Real-Time Distributed Computing (ISORC '10). IEEE Computer Society, Washington, DC, USA, 185-192.

[DOWNEY12] "Be Careful with Sloppy Quorums". Jim Downey. 2012.

[FOWLER11] "The LMAX Architecture". Martin Fowler. 2011

[FREILING11] Felix C. Freiling, Rachid Guerraoui, and Petr Kuznetsov. 2011. The failure detector abstraction. ACM Comput. Surv. 43, 2, Article 9 (February 2011), 40 pages.

[GIAMPAOLO98] Dominic Giampaolo. 1998. Practical File System Design with the be File System (1st ed.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.

[GILAD17] Yossi Gilad, Rotem Hemo, Silvio Micali, Georgios Vlachos, and Nickolai Zeldovich. 2017. Algorand: Scaling Byzantine Agreements for Cryptocurrencies. In Proceedings of the 26th Symposium on Operating Systems Principles, Shanghai , China, October 28-31, 2017. 51–68.

[GRAY05] Gray, Jim; van Ingen, Catharine (December 2005). "Empirical Measurements of Disk Failure Rates and Error Rates" (PDF). Microsoft Research Technical Report MSR-TR-2005-166. Retrieved 4 March 2013.

[HYPARVIEW07] "HyParView: a membership protocol for reliable gossip-based broadcast". Joao Leitao, Jose Pereira, and Luis Rodrigues. 2007.

[KERRISK10] "The Linux Programming Interface". Michael Kerrisk. 2010. No Starch Press.

[KINGSBURY18_2] "Strong consistency models". Kyle Kingsbury. 2018.

[KLEPPMANN15] "Please stop calling databases CP or AP". Martin Kleppmann. 2015.

[KOOPMAN15] "Selection of Cyclic Redundancy Code and Checksum Algorithms to Ensure Critical Data Integrity". Philip Koopman, Kevin R. Driscoll, Brendan Hall. 2015.

[KORDAFSHARI05] M. S. Kordafshari, M. Gholipour, M. Mosakhani, A. T. Haghighat, and M. Dehghan. 2005. Modified bully election algorithm in distributed systems. 2005

[MERKLE87] Ralph C. Merkle. 1987. A Digital Signature Based on a Conventional Encryption Function. In A Conference on the Theory and Applications of Cryptographic Techniques on Advances in Cryptology (CRYPTO '87), Carl Pomerance (Ed.). Springer-Verlag, London, UK, UK, 369-378.

[MOLINA92] H. Garcia-Molina and K. Salem. 1992. Main Memory Database Systems: An Overview. IEEE Trans. on Knowl. and Data Eng. 4, 6 (December 1992), 509-516.

[NICHOLS66] The Past Participle of "Overflow": "Overflowed" or "Overflown". Ann Eljenholm Nichols. American Speech Vol. 41, No. 1 (Feb., 1966), pp. 52-55

[STONE98] "Performance of checksums and CRCs over real data". J. Stone, M. Greenwald, C. Partridge and J. Hughes, in IEEE/ACM Transactions on Networking, vol. 6, no. 5, pp. 529-543, Oct. 1998.

[SYSTEMR81] Donald D. Chamberlin, Morton M. Astrahan, Michael W. Blasgen, James N. Gray, W. Frank King, Bruce G. Lindsay, Raymond Lorie, James W. Mehl, Thomas G. Price, Franco Putzolu, Patricia Griffiths Selinger, Mario Schkolnick, Donald R. Slutz, Irving L. Traiger, Bradford W. Wade, and Robert A. Yost. 1981. A history and evaluation of System R. Commun. ACM 24, 10 (October 1981), 632-646.

[VINOSKI08] S. Vinoski, "Convenience Over Correctness" in IEEE Internet Computing, vol. 12, no. 04, pp. 89-92, 2008.

[ZHAO15] "Fast Paxos Made Easy: Theory and Implementation". Wenbing Zhao. 2015.

Ben Krug  Jan 02, 2020  Jan 28, 2020
PDF
Page 21
Figure 1-6 (b)

Figure 1-6 (b) Use of Primary Index as an indirection layer by Secondary Index is inconsistent with Secondary Index Offsets in Figure 1-6 (a).

Assuming that both primary and secondary indices are zero-based, in Figure 1-6 (b) Entry 1 of secondary index points to Entry 0 of primary index. This is inconsistent with Figure 1-6 (a), where Entry 1 of secondary index points to Data Record 3, which maps back to Entry 3 of primary index.

Similarly, in Figure 1-6 (b) Entry 2 of secondary index points to Entry 3 of primary index. This is inconsistent with Figure 1-6 (a), where Entry 2 of secondary index points to Data Record 0, which maps back to Entry 0 of primary index.

Thus, it appears that in Figure 1-6 (b), primary index indirection for secondary index entries 1 and 2 have been interchanged.

Note from the Author or Editor:
Assuming primary and secondary index nodes in example b) are 0-indexed, node 1 of secondary index should point to node 3 of primary index, and node 2 of secondary index should point to node 0 of primary index. In other words, entries 1 and 2 in a secondary index were interchanged.

Swapnil Gandhi  Jun 20, 2020  Sep 04, 2020
, Printed, PDF, ePub, Mobi, , Other Digital Version
Page 39
3rd paragraph

In B-Tree node splitting, the split point should only be put into the newly created sibling node for leaf node splits, not nonleaf node splits, yet this paragraph says "All elements after the split point (including split point in the case of nonleaf node split) are transferred to the newly created sibling node, ..." I believe that this should be changed to "All elements after the split point (including the split point in the case of leaf node split) are transferred to the newly created sibling node, ..." Namely, "nonleaf node split" --> "leaf node split".

Note from the Author or Editor:
B-Tree Node Splits, p. 39. Sentence on elements that should be transferred during split contains a typo, and should read as (correction is highlighted): “All elements after the split point (including split point in the case of _leaf node_ split) are transferred to the newly created sibling node.”

Clark Zinzow  Aug 14, 2020  Sep 04, 2020
Printed
Page 40
Figure 2-12

In Figure 2-12, the key 10 is a separation key in both the parent node and the depicted child node. The key range that corresponds to the child node is k >= 10. However, there is a child node pointer (the leftmost pointer in the child node) whose key range is k < 10, which would violate the k >= 10 constraint defined by the parent node for this subtree. The subtree pointed to by the leftmost pointer in the child node would necessarily have to be empty, which wouldn't be a well-defined state for a B-Tree.

It may be more well-defined (or at least less confusing) to change the separation key 10 in the parent to 8. That way, the child could still have a separation key 10 with a corresponding k < 10 pointer to a nonempty subtree (could contain the key 9) without violating any key range constraints defined by the parent node's separation keys.

Note from the Author or Editor:
This is right: during the child split, midpoint is promoted, so having the same item repeated twice on two nonleaf levels is indeed incorrect.

Clark Zinzow  Aug 14, 2020  Sep 04, 2020
42
42

Chapter 2, section `Tree Balancing`.
In the sentence `If the tree is not balanced, worst-case complexity goes down to O(N) ...`
Isn't complexity increasing in worst-case scenario? I think it should be `complexity goes up to O(N)`.

Note from the Author or Editor:
Should be "goes up"

Max  Aug 09, 2019  Sep 12, 2019
42
42

Chapter 2, section `Hard Disk Drives`:
there is references `[XIA17] [KANNAN18]` which are not linked to bibliography (#223 and #123 respectively).

Note from the Author or Editor:
Should be linked to bibliography.

Max  Aug 09, 2019  Sep 12, 2019
42
42

Chapter 2, section `B-Tree Hierarchy`
Sentence `Balancing operations (namely, splits and merges_)` the underscore is a typo.

Note from the Author or Editor:
Underscore should be removed.

Max  Aug 09, 2019  Sep 12, 2019
42
42

Chapter 5, section `READERS-WRITER LOCK`
Sentence `In Figure 5-8 (b), writer 1 holds an exclusive lock on the object, while another writer and two readers have to wait.`
there is three readers on the pic.

Note from the Author or Editor:
Should be "writer and three readers".

Max  Aug 11, 2019  Sep 12, 2019
42
42

Chapter 6, section `Implementing Copy-on-Write: LMDB`
The first two sentences of the first paragraph are essentially the same:
`One of the storage engines using copy-on-write is LMDB. One of the storage engines using copy-on-write is the Lightning Memory-Mapped Database (LMDB), which is a key-value store used by the OpenLDAP project. `

Note from the Author or Editor:
Thank you for spotting it and submitting! Fixed now.

Max  Aug 12, 2019  Sep 12, 2019
42
42

Chapter 6, section `WiredTiger`.
Sentence `Dirty pages have have an update buffer in addition to that.` have `have` duplication

Note from the Author or Editor:
Thank you for submitting it! Fixed now.

Max  Aug 12, 2019  Sep 12, 2019
42
7

Chapter 7, paragraph 8:
`Pages are fixed ins size, and some free space is reserved for the future writes.` typo ins, should be in.

Note from the Author or Editor:
Thank you, this one was fixed during copyedit.

Max  Aug 13, 2019  Sep 12, 2019
42
42

Chapter 7, section `Merge-Iteration`
Sentence `Let’s follow though one example step by step.` should probably have `through` instead of `though`.

Note from the Author or Editor:
Thank you for submitting it. Fixed now.

Max  Aug 13, 2019  Sep 12, 2019
42
42

Chapter 8, section `Clocks and Time`, third paragraph.
Sentence `Besides the the fact that clock synchronization in a distributed system is hard ...` "the" is repeated twice

Note from the Author or Editor:
"the the" should be "the", but this is already fixed in the latest version of the draft.

Max  Aug 17, 2019  Sep 12, 2019
42
42

Chapter 11, section `Linearizability`
In sentence `There is some indeterminism in linearizability, as there may exist than one way in which the events can be ordered` "more" is missing. It seems it should be "as there may exist more than one"...

Note from the Author or Editor:
"exist then" should be "exist more than". But it's already fixed in the latest draft.

Max  Aug 19, 2019  Sep 12, 2019
42
42

Chapter 11, section `Eventual Consistency`
Sentence `Formally, it states that if no additional updates are to the data item, eventually all accesses return the latest written value` is missing a verb. Probably it should be `... if no additional updates are done to the data ...`

Note from the Author or Editor:
"are to the data item" should be "are performed against the data item"

Max  Aug 20, 2019  Sep 12, 2019
42
42

Chapter 14, section `Flexible Paxos`
In sentence `Since the second phase is usually more common than the first one, Q2 can contain only N/2 acceptors, as long as Q1 is adjusted to be correspondingly larger (Q1 = N - Q2 + 1.` there is no closing brace ) at the end

Max  Aug 21, 2019  Sep 12, 2019
42
42

Chapter 14, section `Raft`
In sentence `It may happen that different participants disagree on which term is current, since they can find out about the new term at different times, or be have missed the leader election, one or multiple terms.` the "be have missed" should be "have missed"

Max  Aug 21, 2019  Sep 12, 2019
PDF
Page 96
3rd paragraph

"commts" should be "commits"

Oleksandr Petrov
 
Nov 08, 2019  Jan 28, 2020
, Printed, PDF, ePub, Mobi, , Other Digital Version
Page 99
the last paragraph on this page

When explaining timestamp ordering, it says:

> Whether or not transaction operations are allowed to be executed is determined by whether or not any transaction with an *earlier* timestamp has already been committed.

From my understanding of timestamp ordering, the word "earlier" should be corrected as "later"

Note from the Author or Editor:
"earlier" should be "later"

Eric Fu  Dec 01, 2019  Jan 28, 2020
PDF
Page 171
the code following the 2nd paragraph

The variable should be named 'x' instead of 'i'

Note from the Author or Editor:
Thank you for reporting!


Introduction and Overview, p. 171. In code snippet, variable “i” should be “x”. Full code snippet should be as follows:

int x = 1;
x += 2;
x *= 2;

Eric Fu  Dec 07, 2019  Jan 28, 2020
PDF
Page 190
The 3rd paragraph under section "System Synchrony"

> Moreover, designing an efficient synchronous algorithm is not always
achievable

The word 'synchronous' should be 'asynchronous'

Note from the Author or Editor:
System Synchrony, p. 190. “designing an efficient synchronous algorithm is not always achievable” should read as “designing an efficient asynchronous algorithm is not always achievable”.

Eric Fu  Dec 22, 2019  Jan 28, 2020
PDF
Page 274
Figure 13-9

In step c of figure 13-9, 'Primary at Account1' should not appear here

Note from the Author or Editor:
Distributed Transactions with Percolator, p. 274, Figure 13-9. In section c) of the figure, “Primary at Account 1” should not be visible.

Alternatively, under assumption that an immutable store is used, “Primary” mark should be added to TS3 under Account 1. Both implementations would be correct given the resolution algorithm.

Eric Fu  Jan 15, 2020  Jan 29, 2020
Printed
Page 349
write-ahead log (WAL)

The first error is a typo for the page number of the first usage of "write-ahead log (WAL)". The first time "write-ahead log" is referenced is on the very first line of page 84, not page 83. The index erroneously marks the reference as page 83.

The second error is a missing reference of when "write-ahead log (WAL)" is used again. It is also referenced on page 132, in the third paragraph of the section titled "LSM Tree Structure". The index does not mention page 132.

Sachandhan Ganesh  Dec 20, 2019  Jan 28, 2020