Beautiful Code

by Andy Oram, Greg Wilson

June 2007

Intermediate to advanced

618 pages

18h 15m

English

O'Reilly Media, Inc.

How This Book Is Organized | Conventions Used in This Book | Using Code Examples | How to Contact Us | Safari® Enabled
1.1. The Practice of Programming | 1.2. Implementation | 1.3. Discussion | 1.4. Alternatives | 1.5. Building on It | 1.6. Conclusion
2.1. Version Control and Tree Transformation | 2.2. Expressing Tree Differences | 2.3. The Delta Editor Interface | 2.4. But Is It Art? | 2.5. Abstraction As a Spectator Sport | 2.6. Conclusions
3.1. The Most Beautiful Code I Ever Wrote | 3.2. More and More with Less and Less | 3.3. Perspective | 3.3.1. A Bonus Analysis | 3.4. What Is Writing? | 3.5. Conclusion | 3.6. Acknowledgments
4.1. On Time | 4.2. Problem: Weblog Data | 4.2.1. Regular Expressions | 4.2.2. Putting Regular Expressions to Work | 4.2.3. Content-Addressable Storage | 4.2.4. Time to Optimize? | 4.3. Problem: Who Fetched What, When? | 4.3.1. Binary Search | 4.3.2. Binary Search Trade-offs | 4.3.3. Escaping the Loop | 4.4. Search in the Large | 4.4.1. Searching with Postings | 4.4.2. Ranking Results | 4.4.3. Searching the Web | 4.5. Conclusion
5.1. The Role of XML Validation | 5.2. The Problem | 5.3. Version 1: The Naïve Implementation | 5.4. Version 2: Imitating the BNF Grammar O(N) | 5.5. Version 3: First Optimization O(log N) | 5.6. Version 4: Second Optimization: Don’t Check Twice | 5.7. Version 5: Third Optimization O(1) | 5.8. Version 6: Fourth Optimization: Caching | 5.9. The Moral of the Story
6.1. An Acceptance Testing Framework in Three Classes | 6.2. The Challenge of Framework Design | 6.3. An Open Framework | 6.4. How Simple Can an HTML Parser Be? | 6.5. Conclusion

7.1. That Pesky Binary Search | 7.2. Introducing JUnit | 7.3. Nailing Binary Search | 7.3.1. Smoking Allowed (and Encouraged) | 7.3.2. Pushing the Boundaries | 7.3.3. Random Acts of Testing | 7.3.4. Performance Anxiety | 7.4. Conclusion
9.1. JavaScript | 9.2. Symbol Table | 9.3. Tokens | 9.4. Precedence | 9.5. Expressions | 9.6. Infix Operators | 9.7. Prefix Operators | 9.8. Assignment Operators | 9.9. Constants | 9.10. Scope | 9.11. Statements | 9.12. Functions | 9.13. Array and Object Literals | 9.14. Things to Do and Think About
10.1. Basic Methods | 10.2. Divide and Conquer | 10.3. Other Methods | 10.4. Sum and Difference of Population Counts of Two Words | 10.5. Comparing the Population Counts of Two Words | 10.6. Counting the 1-Bits in an Array | 10.7. Applications
11.1. The Heart of the Start | 11.2. Untangling the Complexity of Secure Messaging | 11.3. Usability Is the Key | 11.4. The Foundation | 11.4.1. Design Goals and Decisions | 11.4.2. Basic System Design | 11.5. The Test Suite | 11.6. The Functioning Prototype | 11.7. Clean Up, Plug In, Rock On… | 11.7.1. Revamping the Mail Store | 11.7.2. Persistence of Decryption | 11.8. Hacking in the Himalayas | 11.8.1. Securing the Code | 11.8.2. Auditing Crypt::GPG | 11.9. The Invisible Hand Moves | 11.10. Speed Does Matter | 11.11. Communications Privacy for Individual Rights | 11.12. Hacking the Civilization
12.1. BioPerl and the Bio::Graphics Module | 12.1.1. Example of Bio::Graphics Output | 12.1.2. Bio::Graphics Requirements | 12.2. The Bio::Graphics Design Process | 12.2.1. Designing How the Developer Interacts with the Module | 12.2.2. Setting Options | 12.2.3. Choosing Object Classes | 12.2.4. Option Processing | 12.2.5. Code Example | 12.2.6. Dynamic Options | 12.3. Extending Bio::Graphics | 12.3.1. Supporting Web Developers | 12.3.2. Supporting Publication-Quality Images | 12.3.3. Adding New Glyphs | 12.4. Conclusions and Lessons Learned
13.1. The User Interface of the Gene Sorter | 13.2. Maintaining a Dialog with the User over the Web | 13.3. A Little Polymorphism Can Go a Long Way | 13.4. Filtering Down to Just the Relevant Genes | 13.5. Theory of Beautiful Code in the Large | 13.6. Conclusion
14.1. The Effects of Computer Architectures on Matrix Algorithms | 14.2. A Decompositional Approach | 14.3. A Simple Version | 14.4. LINPACK’s DGEFA Subroutine | 14.5. LAPACK DGETRF | 14.6. Recursive LU | 14.7. ScaLAPACK PDGETRF | 14.8. Multithreading for Multi-Core Systems | 14.9. A Word About the Error Analysis and Operation Count | 14.10. Future Directions for Research | 14.11. Further Reading
15.1. My Idea of Beautiful Code | 15.2. Introducing the CERN Library | 15.3. Outer Beauty | 15.4. Inner Beauty | 15.4.1. Beauty in Brevity and Simplicity | 15.4.2. Beauty in Frugality | 15.4.3. Beauty in Flow | 15.5. Conclusion
16.1. Humble Beginnings | 16.2. Reduced to Even Smaller Bits | 16.3. Scaling Up to Thousands of Devices | 16.4. Small Objects Loosely Joined
17.1. From Code to Pointers | 17.2. From Function Arguments to Argument Pointers | 17.3. From Filesystems to Filesystem Layers | 17.4. From Code to a Domain-Specific Language | 17.5. Multiplexing and Demultiplexing | 17.6. Layers Forever?
18.1. Inside the Dictionary | 18.2. Special Accommodations | 18.2.1. A Special-Case Optimization for Small Hashes | 18.2.2. When Special-Casing Is Worth the Overhead | 18.2.2.1. The Java implementation: another special-case optimization | 18.2.2.2. The C implementation: selecting the storage function dynamically | 18.3. Collisions | 18.4. Resizing | 18.4.1. Determining the New Table Size | 18.4.2. A Memory Trade-Off That’s Worth It: The Free List | 18.5. Iterations and Dynamic Changes | 18.6. Conclusion | 18.7. Acknowledgments
19.1. Key Challenges in N-Dimensional Array Operations | 19.2. Memory Models for an N-Dimensional Array | 19.3. NumPy Iterator Origins | 19.4. Iterator Design | 19.4.1. Iterator Progression | 19.4.2. Iterator Termination | 19.4.3. Iterator Setup | 19.4.4. Iterator Counter Tracking | 19.4.5. Iterator Structure | 19.5. Iterator Interface | 19.6. Iterator Use | 19.6.1. Iteration Over All But One Dimension | 19.6.2. Multiple Iterations | 19.6.3. Anecdotes | 19.7. Conclusion
20.1. The Mission and the Collaborative Information Portal | 20.2. Mission Needs | 20.3. System Architecture | 20.4. Case Study: The Streamer Service | 20.4.1. Functionality | 20.4.2. Service Architecture | 20.5. Reliability | 20.5.1. Logging | 20.5.2. Monitoring | 20.6. Robustness | 20.6.1. Dynamic Reconfiguration | 20.6.2. Hot Swapping | 20.7. Conclusion
21.1. General Goals of ERP | 21.2. ERP5 | 21.3. The Underlying Zope Platform | 21.4. ERP5 Project Concepts | 21.5. Coding the ERP5 Project | 21.6. Conclusion | 21.6.1. Acknowledgments
23.1. A Motivating Example | 23.2. The MapReduce Programming Model | 23.3. Other MapReduce Examples | 23.4. A Distributed MapReduce Implementation | 23.4.1. Execution Overview | 23.5. Extensions to the Model | 23.6. Conclusion | 23.7. Further Reading | 23.8. Acknowledgments | 23.9. Appendix: Word Count Solution
24.1. A Simple Example: Bank Accounts | 24.1.1. Bank Accounts Using Locks | 24.1.2. Locks Are Bad | 24.2. Software Transactional Memory | 24.2.1. Side Effects and Input/Output in Haskell | 24.2.2. Transactions in Haskell | 24.2.3. Implementing Transactional Memory | 24.2.4. Blocking and Choice | 24.2.5. Summary of Basic STM Operations | 24.3. The Santa Claus Problem | 24.3.1. Reindeer and Elves | 24.3.2. Gates and Groups | 24.3.3. The Main Program | 24.3.4. Implementing Santa | 24.3.5. Compiling and Running the Program | 24.4. Reflections on Haskell | 24.5. Conclusion | 24.6. Acknowledgments
25.1. Brief Introduction to syntax-case | 25.2. Expansion Algorithm | 25.2.1. Representations | 25.2.2. Producing Expander Output | 25.2.3. Stripping Syntax Objects | 25.2.4. Syntax Errors | 25.2.5. Structural Predicates | 25.2.6. Creating Wraps | 25.2.7. Manipulating Environments | 25.2.8. Identifier Resolution | 25.2.9. The Expander | 25.2.10. Core Transformers | 25.2.11. Parsing and Constructing Syntax Objects | 25.2.12. Comparing Identifiers | 25.2.13. Conversions | 25.2.14. Starting Expansion | 25.3. Example | 25.4. Conclusion
26.1. Sample Application: Logging Service | 26.2. Object-Oriented Design of the Logging Server Framework | 26.2.1. Understanding the Commonalities | 26.2.2. Accommodating Variation | 26.2.3. Tying It All Together | 26.3. Implementing Sequential Logging Servers | 26.3.1. An Iterative Logging Server | 26.3.2. A Reactive Logging Server | 26.3.3. Evaluating the Sequential Logging Server Solutions | 26.4. Implementing Concurrent Logging Servers | 26.4.1. A Thread-per-Connection Logging Server | 26.4.2. A Process-per-Connection Logging Server | 26.4.3. Evaluating the Concurrent Logging Server Solutions | 26.5. Conclusion
27.1. Project Background | 27.2. Exposing Services to External Clients | 27.2.1. Define the Service Interface | 27.3. Routing the Service Using the Factory Pattern | 27.4. Exchanging Data Using E-Business Protocols | 27.4.1. Parsing the XML Using XPath | 27.4.2. Assembling the XML Response | 27.5. Conclusion
28.1. Debugging a Debugger | 28.2. A Systematic Process | 28.3. A Search Problem | 28.4. Finding the Failure Cause Automatically | 28.5. Delta Debugging | 28.6. Minimizing Input | 28.7. Hunting the Defect | 28.8. A Prototype Problem | 28.9. Conclusion | 28.10. Acknowledgments | 28.11. Further Reading
30.1. Basic Design Model | 30.2. Input Interface | 30.2.1. The Tree | 30.2.2. The Long Click | 30.2.3. Dynamic Tree Repopulation | 30.2.4. Simple Typing | 30.2.5. Prediction: Word Completion and Next Word | 30.2.6. Templates and Replace | 30.2.7. The Cache Implementation | 30.2.8. Common Words and Favorites | 30.2.9. Retracing Paths | 30.2.10. The Typing Buffer, Editing, and Scrolling | 30.2.11. The Clipboard | 30.2.12. Searching | 30.2.13. Macros | 30.3. Efficiency of the User Interface | 30.4. Download | 30.5. Future Directions
31.1. Producing Spoken Output | 31.2. Speech-Enabling Emacs | 31.2.1. A Simple First-Cut Implementation | 31.2.2. Iterating on the First-Cut Implementation | 31.2.3. A Brief advice Tutorial | 31.2.4. Generating Rich Auditory Output | 31.2.4.1. Audio formatting using voice-lock | 31.2.4.2. Augmenting Emacs to create aural display lists | 31.2.4.3. Audio-formatted output from aural display lists | 31.2.5. Using Aural CSS (ACSS) for Styling Speech Output | 31.2.6. Adding Auditory Icons | 31.2.7. Producing Auditory Icons While Speaking Content | 31.2.8. The Calendar: Enhancing Spoken Output with Context-Sensitive Semantics | 31.3. Painless Access to Online Information | 31.3.1. Basic HTML with Emacs W3 and Aural CSS | 31.3.2. The emacspeak-websearch Module for Task-Oriented Search | 31.3.3. The Web Command Line and URL Templates | 31.3.4. The Advent of Feed Readers | 31.4. Summary | 31.4.1. Managing Code Complexity Over Time | 31.4.2. Conclusion | 31.5. Acknowledgments
32.1. On Being “Bookish” | 32.2. Alike Looking Alike | 32.3. The Perils of Indentation | 32.4. Navigating Code | 32.5. The Tools We Use | 32.6. DiffMerge’s Checkered Past | 32.7. Conclusion | 32.8. Acknowledgments | 32.9. Further Reading
33.1. The Nonroyal Road | 33.2. Warning to Parenthophobes | 33.3. Three in a Row | 33.4. The Slippery Slope | 33.5. The Triangle Inequality | 33.6. Meandering On | 33.7. “Duh!”—I Mean “Aha!” | 33.8. Conclusion | 33.9. Further Reading

Content preview from Beautiful Code

Chapter 4. Finding Things

Tim Bray

Computers can compute, but that’s not what people use them for, mostly. Mostly, computers store and retrieve information. Retrieve implies find, and in the time since the advent of the Web, search has become a dominant application for people using computers.

As data volumes continue to grow—both absolutely, and relative to the number of people or computers or anything, really—search becomes an increasingly large part of the life of the programmer as well. A few applications lack the need to locate the right morsel in some information store, but very few.

The subject of search is one of the largest in computer science, and thus I won’t try to survey all of it or discuss the mechanics; in fact, I’ll only consider one simple search technique in depth. Instead, I’ll focus on the trade-offs that go into selecting search techniques, which can be subtle.

On Time

You really can’t talk about search without talking about time. There are two different flavors of time that apply to problems of search. The first is the time it takes the search to run, which is experienced by the user who may well be staring at a message saying something like “Loading…”. The second is the time invested by the programmer who builds the search function, and by the programmer’s management and customers waiting to use the program.

Problem: Weblog Data

Let’s look at a sample problem to get a feel for how a search works in real life. I have a directory containing logfiles from my weblog ...

Publisher Resources

ISBN: 9780596510046
