Data structures and algorithms are among the most important inventions of the last 50 years, and they are fundamental tools software engineers need to know. But in my opinion, most of the books on these topics are too theoretical, too big, and too “bottom up”:
Mathematical analysis of algorithms is based on simplifying assumptions that limit its usefulness in practice. Many presentations of this topic gloss over the simplifications and focus on the math. In this book I present the most practical subset of this material and omit or de-emphasize the rest.
Most books on these topics are at least 500 pages, and some are more than 1,000. By focusing on the topics I think are most useful for software engineers, I kept this book under 150 pages.
Many data structures books focus on how data structures work (the implementations), with less about how to use them (the interfaces). In this book, I go “top down”, starting with the interfaces. Readers learn to use the structures in the Java Collections Framework before getting into the details of how they work.
Finally, some books present this material out of context and without motivation: it’s just one damn data structure after another! I try to liven it up by organizing the topics around an application—web search—that uses data structures extensively, and is an interesting and important topic in its own right.
I have made difficult decisions about what to leave out, but I have made some compromises. I include a few topics that most readers will never use, but that they might be expected to know, possibly in a technical interview. For these topics, I present both the conventional wisdom as well as my reasons to be skeptical.
This book also presents basic aspects of software engineering practice, including version control and unit testing. Most chapters include an exercise that allows readers to apply what they have learned. Each exercise provides automated tests that check the solution. And for most exercises, I present my solution at the beginning of the next chapter.
This book is intended for college students in computer science and related fields, as well as professional software engineers, people training in software engineering, and people preparing for technical interviews.
Before you start this book, you should know Java pretty well; in particular, you should know how to define a new class that extends an existing class or implements an
interface. If your Java is rusty, here are two books you might start with:
Downey and Mayfield, Think Java (O’Reilly Media, 2016), which is intended for people who have never programmed before
Sierra and Bates, Head First Java (O’Reilly Media, 2005), which is appropriate for people who already know another programming language
If you are not familiar with interfaces in Java, you might want to work through the tutorial called “What Is an Interface?” at http://thinkdast.com/interface.
In the context of Java, it also refers to a language feature, similar to a class, that specifies a set of methods. To help avoid confusion, I’ll use “interface” in the normal typeface for the general idea of an interface, and
interface in the code typeface for the Java language feature.
You should also be familiar with type parameters and generic types. For example, you should know how create an object with a type parameter, like
ArrayList<Integer>. If not, you can read about type parameters at http://thinkdast.com/types.
You should be familiar with the Java Collections Framework (JCF), which you can read about at http://thinkdast.com/collections. In particular, you should know about the
List interface and the classes
Ideally you should be familiar with Apache Ant, which is an automated build tool for Java. You can read more about Ant at http://thinkdast.com/anttut.
And you should be familiar with
JUnit, which is a unit testing framework for Java. You can read more about it at http://thinkdast.com/junit.
The code for this book is in a Git repository at http://thinkdast.com/repo.
You can create a copy of the repository on GitHub by pressing the Fork button. If you don’t already have a GitHub account, you’ll need to create one. After forking, you’ll have your own repository on GitHub that you can use to keep track of code you write. Then you can clone the repository, which downloads a copy of the files to your computer.
Alternatively, you could clone the repository without forking. If you choose this option, you don’t need a GitHub account, but you won’t be able to save your changes on GitHub.
If you don’t want to use Git at all, you can download the code in a ZIP archive using the Download button on the GitHub page, or this link: http://thinkdast.com/zip.
After you clone the repository or unzip the ZIP file, you should have a directory called
ThinkDataStructures with a subdirectory called
The examples in this book were developed and tested using Java SE Development Kit 7. If you are using an older version, some examples will not work. If you are using a more recent version, they should all work.
The following typographical conventions are used in this book:
Indicates emphasis, keystrokes, menu options, URLs, and email addresses.
Used for new terms where they are defined.
Used for program listings, as well as within paragraphs to refer to filenames, file extensions, and program elements such as variable and function names, data types, statements, and keywords.
Constant width bold
Shows commands or other text that should be typed literally by the user.
Technology professionals, software developers, web designers, and business and creative professionals use Safari Books Online as their primary resource for research, problem solving, learning, and certification training.
Members have access to thousands of books, training videos, and prepublication manuscripts in one fully searchable database from publishers like O’Reilly Media, Prentice Hall Professional, Addison-Wesley Professional, Microsoft Press, Sams, Que, Peachpit Press, Focal Press, Cisco Press, John Wiley & Sons, Syngress, Morgan Kaufmann, IBM Redbooks, Packt, Adobe Press, FT Press, Apress, Manning, New Riders, McGraw-Hill, Jones & Bartlett, Course Technology, and hundreds more. For more information about Safari Books Online, please visit us online.
Please address comments and questions concerning this book to the publisher:
To comment or ask technical questions about this book, send email to firstname.lastname@example.org.
For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.
Find us on Facebook: http://facebook.com/oreilly
Follow us on Twitter: http://twitter.com/oreillymedia
Watch us on YouTube: http://www.youtube.com/oreillymedia
This book is an adapted version of a curriculum I wrote for the Flatiron School in New York City, which offers a variety of online classes related to programming and web development. They offer a class based on this material, which provides an online development environment, help from instructors and other students, and a certificate of completion. You can find more information at http://flatironschool.com.
At the Flatiron School, Joe Burgess, Ann John, and Charles Pletcher provided guidance, suggestions, and corrections from the initial specification all the way through implementation and testing. Thank you all!
I am very grateful to my technical reviewers, Barry Whitman, Patrick White, and Chris Mayfield, who made many helpful suggestions and caught many errors. Of course, any remaining errors are my fault, not theirs!
Thanks to the instructors and students in Data Structures and Algorithms at Olin College, who read this book and provided useful feedback.
Charles Roumeliotis copyedited the book for O’Reilly Media and made many improvements.
If you have comments or ideas about the text, please send them to email@example.com.