 Published on O'Reilly (http://www.oreilly.com/)
 http://www.oreillynet.com/pub/a/oreilly/news/udell_0301.html


Interview with Jon Udell about Safari: O'Reilly Books Online

by Bruce Stewart
03/20/2001

Jon Udell is an independent consultant and freelance writer, and the author of Practical Internet Groupware. He was Byte Magazine's executive editor for new media, and the architect of the original Byte.com Web site. Jon is also spearheading O'Reilly & Associates' new venture into electronic publishing, called Safari. We recently asked Jon to update us on this exciting new technology.

Stewart: Let's start at the beginning. What is Safari?

Udell: It's a Web site, safari.oreilly.com, where you can explore the O'Reilly collection, order printed books, and sign up to use the books online.

Stewart: Explain how Safari differs from the eBook model.

Udell: In the eBook model, typically, you buy a book, own it forever, and use it offline. In Safari, you subscribe to an information service. You rent the books, and you use them online. But you're not committed to any fixed selection of books. Think of your subscription like bandwidth. You can pay a little for a skinny pipe into Safari, or more for a fat pipe. What runs through the pipe is up to you.

Stewart: What motivated O'Reilly to pursue Safari?

Udell: In conversations with its customers, O'Reilly found they report a strong bimodal pattern to their use of technical information. People at work, in the heat of problem-solving, need to search for quick answers to tactical problems. Then they'll go home and spend hours reading a book cover to cover to enlarge their technical worldview. Books are ideal for this latter purpose, and until there's some radical advance in display technology I doubt that will change. But books in printed form don't support intensive, goal-directed research as effectively as they can in electronic form. Safari aims to change that.

Stewart: What functionality is available to the general browsing public? Will Safari eventually replace O'Reilly's online catalog?

Udell: Safari actually serves two different, though overlapping, audiences. For subscribers, it's a tool for searching and reading the full contents of O'Reilly books. For the general public, it's a tool for finding and previewing books. The UI (user interface) is the same, except that we truncate the sections for the simple preview function. In search mode, the elements include a summary of the hits, the first paragraph, and--uniquely to Safari--all the index terms that refer to that search. So you can get a very clear idea about whether you'd like to buy, or subscribe to, any Safari book. In some cases, the preview mode is rich enough to answer basic research questions directly. For example, Safari's search results report full sentences surrounding hits. I've noticed lately that if I'm trying to remember the parameters to a Perl function, I can find the answer in the preview mode without even logging in to Safari.

In its nonsubscriber mode, Safari overlaps with the online catalog at oreilly.com. Over time, as more and eventually all O'Reilly books become available in Safari, it will replace the catalog functions of oreilly.com. Until then the two will coexist, and will be connected in various ways. As a first step, I wrote a wrapper for oreilly.com's Atomz search engine that consolidates its results into a Safari-like presentation. The next steps will be to put Safari "pawprint" icons onto the oreilly.com catalog page, to indicate the books available in Safari, and to redirect the book-searching component of oreilly.com into Safari for dramatically more complete results.

To do that search integration, by the way, we'll offer an open XML interface to Safari's search. The first users of that interface will be O'Reilly and O'Reilly Network, but it'll be available to anyone. I would hope that sites like Amazon will want to use such interfaces. It's a bit odd, I think, that Amazon can tell you what people have said about a book, but they can't show you much of what's actually in it.

Stewart: Safari's subscription model includes the ability to swap titles. How does that work?

Udell: Let's say you sign up for the five-book plan, at $9.95 per month. Every 30 days, you can exercise your swap option. That means you can drop some, or all, of your books, to make room for other books that you'd rather use. If you exercise your swap option aggressively--and of course, you're not required to--you could end up using 60 books in a year's time, though never more than five at once.
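
The arithmetic behind that figure can be checked with a short Python sketch. It assumes about twelve 30-day periods in a year, each holding at most a plan's worth of titles; the function name and the default of 12 are illustrative and do not come from Safari itself.

```python
def max_titles_per_year(slots: int, periods_per_year: int = 12) -> int:
    """Upper bound on distinct titles if every slot is swapped at every opportunity.

    Assumption: roughly twelve 30-day periods per year, each holding
    at most `slots` titles. This is a back-of-envelope bound only.
    """
    return slots * periods_per_year


# Five-book plan, swapped as aggressively as possible.
five_book_plan = max_titles_per_year(5)
print(five_book_plan)
```

Under these assumptions the five-book plan tops out at 5 x 12 = 60 titles, which matches the number quoted above.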

It's a pretty radical scheme! We'll see how it plays out. Will people tune in and then drop out? Clearly O'Reilly has to keep filling the pipeline with new and useful stuff. But, of course, it has to do that in any case.

There's an on-demand aspect to Safari that's worth mentioning too. You don't have to wait for your swap date to roll around in order to bump up your level, say from 10 books to 20. Suppose you're not a Java whiz and suddenly you find yourself on a Java project. You can ratchet up your Safari subscription, and cram a bunch of Java content. A few months later, when the crunch is over, you can drop back down to your normal ten-book level.

Here's a tip for managing your Safari subscription. When you sign up, say for a ten-book subscription, you don't have to fill it right away. You might want to leave a few holes for books you don't yet know that you'll need. That way, you can satisfy an unexpected information need on demand without having to upgrade to the next level. And it's instant gratification. If it's 2 A.M. and you suddenly realize that you need that tenth book, you'll have it a few seconds later.
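
One way to picture these rules is as a small model of a subscriber's bookshelf. The Python sketch below is purely illustrative: the class, method names, and error messages are hypothetical, and only the 30-day swap interval, the instant fill of an empty slot, and the on-demand level change are taken from the description above.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

SWAP_INTERVAL = timedelta(days=30)  # from the interview: a swap option every 30 days


@dataclass
class Bookshelf:
    """Hypothetical model of a subscription; not Safari's actual data model."""
    capacity: int
    last_swap: date
    titles: set[str] = field(default_factory=set)

    def free_slots(self) -> int:
        return self.capacity - len(self.titles)

    def add(self, title: str) -> None:
        # Filling an empty slot does not depend on the swap date.
        if title not in self.titles and self.free_slots() <= 0:
            raise ValueError("shelf is full: swap a title or change level")
        self.titles.add(title)

    def swap(self, drop: set[str], add: set[str], today: date) -> None:
        # Some or all titles may be dropped, but only once the interval has passed.
        if today - self.last_swap < SWAP_INTERVAL:
            raise ValueError("swap option not yet available")
        if not drop <= self.titles:
            raise ValueError("cannot drop a title that is not on the shelf")
        updated = (self.titles - drop) | add
        if len(updated) > self.capacity:
            raise ValueError("too many titles for this level")
        self.titles = updated
        self.last_swap = today

    def change_level(self, new_capacity: int) -> None:
        # Level changes take effect at once, independent of the swap date.
        if new_capacity < len(self.titles):
            raise ValueError("drop titles before moving to a smaller level")
        self.capacity = new_capacity
```

In this model, a ten-book shelf holding eight titles has two free slots, so a ninth or tenth title can be added at any moment without waiting for a swap date or changing level.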

Stewart: Tim O'Reilly has described Safari as a front end for the content of books that truly harnesses the power of the Web. I understand Safari can help users find information across O'Reilly's entire catalog--will this concept of a "meta-book" change the way people use technical literature?

Udell: I've already talked about the bimodal pattern--tactical research versus strategic reading--and how Safari optimizes the content for tactical research. Beyond this, Safari opens a two-way channel between the producers and the consumers of the content. And it can connect the consumers to one another. You can annotate Safari sections, and you can opt to share these annotations with the Safari community. Will user-contributed content become a significant part of the Safari experience? We hope so, and we're going to do the experiment and find out.

I do think that Safari is bound to influence the way books are conceived and written. A book has to stand alone, but every book also plays a role in the broader information architecture of the O'Reilly publishing program. Safari helps bring that broader view into focus. The relationship between the book and the entire collection becomes much more apparent to the reader. For the author and editor, Safari will be a power tool that helps them maximize the synergy of the collection as a whole. That kind of synergy is what has always excited me most about electronic publishing.

Stewart: Obviously copyright laws weren't designed with systems like Safari in mind. How will O'Reilly define "fair use" for the digital versions of the books available online?

Udell: If you subscribe to Safari you'll read the terms of service. It's pretty straightforward. You can read stuff, and you can quote limited amounts of it. You're not entitled to use a Web spider to capture large amounts of stuff for offline use or distribution, and accounts can be terminated if these abuses occur.

That said, Tim O'Reilly is taking a rather bold stance with Safari. We don't force you to use a proprietary viewer, which is how "digital rights management" is typically enforced in the eBooks space. You just use your regular Web browser. Nor do we try to cripple the browser by, for example, requiring JavaScript and using it to disable the print and save functions. Tim doesn't want to have an adversarial relationship with customers. He wants them to be able to use Safari books in the easy and natural ways they expect to. He is, frankly, taking a risk in making content on Safari as easily available as it is. To an extraordinary degree, he is trusting customers to do the right thing.

One of the things that bothers Tim about current digital rights technology is the inability to model real-world fair use. I can, for example, lend you my hard copy of Advanced Perl Programming. We'd like to be able to do that in Safari, too. If I were to lend you my copy of that electronic book, it would become unavailable to me until you returned it. This feature isn't in the first version of Safari, but we'd love to add it.
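
The lending rule described here is easy to state in code. The following Python sketch is a hypothetical model only (the feature is not in the first version of Safari, and the class and method names are invented for illustration): a copy has exactly one reader at a time, and lending it moves that right to the borrower until it is returned.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LendableCopy:
    """Hypothetical model of one subscriber's copy of an electronic book."""
    title: str
    owner: str
    borrower: Optional[str] = None

    def lend(self, to: str) -> None:
        # A copy that is already out on loan cannot be lent again.
        if self.borrower is not None:
            raise ValueError("copy is already on loan")
        self.borrower = to

    def give_back(self) -> None:
        self.borrower = None

    def can_read(self, user: str) -> bool:
        # While on loan, only the borrower may read; otherwise only the owner.
        current_reader = self.borrower if self.borrower is not None else self.owner
        return user == current_reader
```

The design choice is that access is exclusive, as with a printed book: lending does not create a second copy.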

Stewart: What has your role been in the Safari project?

Udell: I helped O'Reilly evaluate several prospective development partners, and introduced them to Bureau van Dijk (BvD), the company that O'Reilly finally chose to work with. Then I prototyped the Safari UI, and worked with BvD and O'Reilly to elaborate on the UI and the business model. I also worked with them to define the architecture that connects the book server, hosted by BvD, to the shopping cart and the back-end business system hosted by O'Reilly.

Stewart: Why was Bureau van Dijk chosen to help develop this application? Has that been a good relationship?

Udell: I had worked with BvD in the past. We collaborated on the BYTE Magazine CD-ROM some years ago. They're a great team with deep roots in electronic publishing--they've been there from the beginning. And they're always up for a challenge that forces them to learn something new. When we started this project a year ago, I had some doubts about the maturity of XSLT. But BvD was keen to push forward on that front, and now we're glad that they did.

Stewart: Describe the technology that's being used to make Safari work.

Udell: Safari books are stored internally in XML, using a lightweight version of the DocBook format. The XML is kept in an SQL database. It's transformed for HTML delivery using XSLT, and this transformation is cached for reuse. There's another on-the-fly XML-to-HTML transformation that happens when you search, in order to highlight the hits. The search engine is BvD's own, not a commercial one.
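
The two rendering paths, one cached and one computed per search, can be sketched in Python. This is not BvD's implementation: the standard library has no XSLT processor, so a tiny hand-written element mapping stands in for the XSLT stylesheet, a dictionary stands in for the SQL database, and the section ID, tag mapping, and function names are all assumptions made for illustration.

```python
import html
import re
import xml.etree.ElementTree as ET
from functools import lru_cache

# Hypothetical stand-in for the database table holding each section's XML.
SECTIONS = {
    "ch01-s01": "<sect1><title>Hashes</title>"
                "<para>A hash maps keys to values.</para></sect1>",
}

# Stand-in for an XSLT stylesheet: DocBook-like element -> HTML element.
TAG_MAP = {"title": "h2", "para": "p"}


def _transform(xml_text, render_text):
    """Map each known child element to HTML, rendering its text with render_text."""
    root = ET.fromstring(xml_text)
    return "".join(
        f"<{TAG_MAP[el.tag]}>{render_text(el.text or '')}</{TAG_MAP[el.tag]}>"
        for el in root
        if el.tag in TAG_MAP
    )


@lru_cache(maxsize=None)
def render_section(section_id: str) -> str:
    """Plain reading view: transformed once, then served from the cache."""
    return _transform(SECTIONS[section_id], html.escape)


def _highlight(text: str, term: str) -> str:
    if not term:
        return html.escape(text)
    pieces = re.split(f"({re.escape(term)})", text, flags=re.IGNORECASE)
    # With one capture group, the odd-indexed pieces are the matches.
    return "".join(
        f"<b>{html.escape(p)}</b>" if i % 2 else html.escape(p)
        for i, p in enumerate(pieces)
    )


def render_with_hits(section_id: str, term: str) -> str:
    """Search view: transformed on the fly so hits can be highlighted."""
    return _transform(SECTIONS[section_id], lambda t: _highlight(t, term))
```

The reading view depends only on the section, so it is safe to cache. The search view depends on the query as well, which is why it is computed per request in this sketch.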

The book server is, in turn, connected to O'Reilly's business system by way of an XML-RPC interface. That's how Safari tells the back-end that, for example, you've upgraded your subscription, and it has to adjust your monthly billing. This Web-enabling of the O'Reilly back-end wasn't just for Safari. The new oreilly.com shopping cart uses it too. For O'Reilly's IT team, this was a major step forward; previously, real-time information about print book inventory wasn't available to Web buyers.
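
To make the XML-RPC step concrete, here is a minimal Python sketch of what such a call might look like on the wire, using the standard library's xmlrpc.client module. The method name and parameter names are hypothetical; the interview does not describe the real interface.

```python
import xmlrpc.client


def build_level_change_call(customer_id: str, new_level: int) -> str:
    """Serialize a subscription change as an XML-RPC <methodCall> document.

    The method name "subscription.changeLevel" and both parameter names
    are invented for illustration.
    """
    return xmlrpc.client.dumps(
        ({"customer_id": customer_id, "new_level": new_level},),
        methodname="subscription.changeLevel",
    )


payload = build_level_change_call("C1001", 20)

# The receiving side can recover the method name and parameters from the XML.
params, method = xmlrpc.client.loads(payload)
print(method, params[0]["new_level"])
```

Because the request is plain XML over HTTP, the two ends do not need to share a platform or language, which is the property that lets the book server and the back-end business system be built independently.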

Stewart: O'Reilly is known as an advocate for the Open Source Movement, and is the leader in open source documentation. Why not stick to open source tools for this project?

Udell: We'd rather have used open source technology for Safari. But, as it turns out, both the O'Reilly shopping cart and the BvD book server are built on Windows 2000 and Internet Information Server 5. This understandably surprises people. In fact, there's no Microsoft conspiracy going on here. These were two independent platform choices, made by two different development teams, for two different sets of reasons.

In the case of the shopping cart, the module that makes it possible to Web-enable O'Reilly's legacy back-end system happens to be available only for Windows, so that dictated the choice. There's also open source technology in the mix, by the way. The oreilly.com site, still the primary interface to the shopping cart, runs on Apache and Perl. The shopping cart uses a MySQL database to do some behind-the-scenes lookups on data that's staged from the legacy system. On the whole, O'Reilly's technology choices are as eclectic as its content offerings are.

In the case of the book server, O'Reilly wound up partnering with BvD, who happens to use Microsoft technology. It's worth noting that other prospective partners did, too. And, in fact, one prospective partner who used a Unix/Java platform when O'Reilly evaluated them later switched to a Microsoft platform. So there's an object lesson here about the risk of basing business decisions--like the choice of an application service provider--on platform ideology.

What matters, in the era of Web Services, isn't how those services are built. Rather, it's that they exist, are of high quality, and support the relevant interfaces needed by the consumers of those services.

Stewart: There is a widely held perception that publishing books electronically should be much cheaper than putting out traditional bound-paper versions. Is that true, and how will author compensation work with Safari?

Udell: There's a common misperception that, because there are no printing and shipping charges, ebooks should be less expensive than print books. Yet, these functions account for only about 15 percent of a book's cost. And publishers still have to convert the formats for electronic display and pay the distributors and retailers their share. So I doubt we'll see significant savings in ebooks in the near future.

Because the Safari service is a reference tool that's updated frequently (Tim calls it a "content dial tone"), subscribers get a lot of value for a small monthly fee. It's affordable, it makes better use of Web technology than ebooks, and it's flexible.

Authors' royalties are the same for Safari as they are for print books. Books in the subscription are weighted (relative to each other) according to the list price. Then we calculate royalties based on the number of books in the subscription each month.
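
The weighting step can be illustrated with a short Python sketch. The interview does not give the actual formula, so this is only one plausible reading: a month's royalty pool for a subscription is divided among its books in proportion to list price. The function name, the idea of a single pool, and the sample prices are all assumptions.

```python
def split_royalty_pool(list_prices: dict[str, float], pool: float) -> dict[str, float]:
    """Divide `pool` among titles in proportion to their list prices.

    Assumption: list-price weighting means a simple proportional share.
    The real calculation may differ.
    """
    total = sum(list_prices.values())
    if total <= 0:
        raise ValueError("list prices must sum to a positive amount")
    return {title: pool * price / total for title, price in list_prices.items()}


# Two hypothetical titles with made-up list prices, sharing a made-up pool of 5.00.
shares = split_royalty_pool({"title-a": 30.0, "title-b": 20.0}, pool=5.0)
print(shares)
```

With these made-up numbers, the higher-priced title receives three fifths of the pool and the other title two fifths.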

Stewart: What was the biggest technical challenge in implementing Safari?

Udell: Search and navigation. It all comes down to search and navigation, and it's just really hard to get the UI right. I'm still not completely satisfied, to be honest, and I expect we'll continue to tweak things as we go forward. But I think we've got the right framework in place. A lot of editorial work goes into the structuring of that content, and we try to leverage that work as much as possible. So, for example, when you search, you're not sent off to some separate, unrelated search-results page. You stay right in the Safari UI, with the same category/book/chapter/section views and controls. Search is only a filter, a way to focus on a subset of the content. I've already mentioned that we show index terms that refer to sections. The search engine is also sensitive to index terms, and thus leverages the work of human indexers who add so much value to the printed books.
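
The idea of search as a filter, rather than a separate results page, can be sketched in Python. Everything here is hypothetical, including the class, the function names, and the substring matching rule, which is far simpler than a real search engine. The point is only that the same tree-building code serves both the full collection and a search result, and that index terms count as matches alongside body text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    path: tuple[str, str, str, str]  # (category, book, chapter, section)
    text: str
    index_terms: frozenset[str] = frozenset()


def search(sections: list[Section], query: str) -> list[Section]:
    """Keep sections whose body text or index terms mention the query."""
    q = query.lower()
    return [
        s for s in sections
        if q in s.text.lower() or any(q in term.lower() for term in s.index_terms)
    ]


def as_tree(sections: list[Section]) -> dict:
    """Build the category/book/chapter/section view for any list of sections."""
    tree: dict = {}
    for s in sections:
        node = tree
        for part in s.path[:-1]:
            node = node.setdefault(part, {})
        node[s.path[-1]] = s
    return tree
```

Calling as_tree on the whole collection gives the browsing view, and calling it on the output of search gives the same view narrowed to the hits. A section whose indexer tagged it with a term is found even if that term never appears in its text.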

There were also the normal logistical challenges you'll find in any distributed project. BvD is in Brussels and New York, O'Reilly is in Sebastopol and Cambridge, and I'm in New Hampshire. There's email and the telephone, of course, and we've used a few kinds of issue-tracking software, but like everyone in this situation we find ourselves wishing for better collaborative tools.

Going forward, the major challenge is figuring out how to configure Safari for institutional use. For example, should users of a site-licensed version of Safari have to identify themselves to the system, or not? If they do, they can have personalized features. If they don't, their company is spared a huge administrative headache. There's no single right answer. The trade-offs are complex, and we're talking with prospective corporate users to sort out the right way to proceed.


Corporations interested in site licenses should contact organizations@oreilly.com for more information.


Bruce Stewart is a Web writer and editor for oreilly.com. Bruce's work has also appeared in the Industry Standard, ZDNet, and Web Tools.

Copyright © 2007 O'Reilly Media, Inc.