Welcome, dear reader, to our book on eXist. Whether you have purchased, begged, borrowed, or stolen this book, we hope that you find its contents of great use when applied to solving your information management problems.
While it’s true that eXist has been around for some years now—in fact, for longer than many of the now popular NoSQL platforms—eXist has continued to innovate and evolve. eXist, while stable and widely used for many years, has now hit a milestone in its history where it can be considered “battle-worn”—a veteran, if you like (or as we like to say in software engineering, “mature”). We have considered writing a book on eXist for the past few years, but we now know that the time is right to share our knowledge with the world. Welcome eXist 2.0.
Perhaps we should first answer this question with another question: Who is eXist for?
eXist aims to meet the requirements of a wide user base, and therefore is probably the most feature-rich product in its class. eXist has been engineered over the years to meet the needs of users ranging from humanities students and professors undertaking interesting linguistic projects, to large international publishers working with millions of documents, to developers wishing to rapidly create document- and data-driven web applications, and most cases in between.
This book aims to meet the needs of a wide audience: from tinkerers, students, professors, and information managers right up to software engineers. This book assumes that you wish to learn and use eXist; if not, you may have bought the wrong book! No familiarity with eXist is assumed; we start with the basics and progress to more complicated topics. This book does not set out to teach XML, XPath, XQuery, XSLT, XForms, or any of the other XML technologies. While of course you may gain an understanding of them from this book, there are other books and online resources available that focus on these topics as their raison d'être. We assume that you have a working knowledge of, or access to learning resources for, XML technologies.
As always, beginners should start at the beginning, while those who already have some experience with eXist may find new insights in Chapters 4 to 6 onward. We hope you will find the book an excellent reference resource.
Should you be looking for books on XML technologies, in our experience and from the feedback of colleagues and beginners we have met, it is a good idea to have a copy of XQuery by Priscilla Walmsley (O’Reilly) at hand, as XQuery is the predominant language used for working with eXist. For further useful resources, see “Additional Resources”.
The following typographical conventions are used in this book:
Italic
Indicates new terms, URLs, email addresses, file- and pathnames, database collections, and file extensions.
Constant width
Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, module names, data types, environment variables, statements, and keywords. Also used for commands and command-line output, database user and group names, and permission modes.
Constant width italic
Shows text that should be replaced with user-supplied values or by values determined by context.
While $EXIST_HOME typically follows the Unix-like syntactical expression of an environment variable, it is used throughout the book to refer to the location where you have installed eXist, whether that be on a Windows/Linux/Mac or any other type of system. The corresponding expression for referencing the equivalent environment variable on Windows platforms would be %EXIST_HOME%.
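As a small illustration of this convention on a Unix-like shell, the sketch below falls back to a default when the variable is not already set. The path /opt/exist is purely a hypothetical placeholder of ours, not a location that eXist requires; substitute wherever you installed eXist.

```shell
# Use the existing EXIST_HOME if one is set; otherwise fall back to a
# hypothetical default. Replace /opt/exist with your own install location.
EXIST_HOME="${EXIST_HOME:-/opt/exist}"
export EXIST_HOME

echo "Using eXist installation at: $EXIST_HOME"
```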
This element signifies a tip or suggestion.
This element signifies a general note.
This element indicates a warning or caution.
A library module does not have a query body and must start with a module declaration. Again, simply put, this means that an XQuery processor cannot directly evaluate a library module; rather, the library module must be directly or indirectly imported into a main module.
As a result, there has been a proliferation of different filename extensions used for XQuery files, including .xq, .xql, .xqm, .xqy, .xqws, and .xquery. Each XQuery implementation vendor, and even individual XQuery developers, seem to have their own ideas about XQuery file naming. Some projects differentiate between main and library modules by using two different file extensions, but which two is entirely inconsistent across projects. Other projects opt to use a single file extension and apply it to both main and library modules. This proliferation of different file extensions can be disorienting and leads to confusion when you’re approaching an existing code base.
eXist recognizes and supports XQuery files with any of the aforementioned file extensions, and will load and store them correctly into its database as XQuery. However, we believe that such an accumulation of different file extensions for what is effectively one or two (main and library) types of file is ridiculous and raises the barrier to truly reusable and portable XQuery code within projects, between projects, and across XQuery implementations.
This book takes the strong opinion that the following XQuery file extension convention should be used by at least all users of eXist, if not all XQuery developers:
This convention is justified by the following points:
The ability to differentiate between main modules and library modules at the file level proves very useful within a large project. Especially if you are new to the project, you can easily and quickly locate the main entry points of the application.
This is not yet another new convention (standard); this is already the convention in at least one other project outside of eXist.
It is backward compatible with various approaches that have been adopted by eXist community members in the past.
This book is here to help you get your job done. In general, you may use the code in this book in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing a CD-ROM of examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.
We appreciate, but do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “eXist by Erik Siegel and Adam Retter (O’Reilly). Copyright 2015 Erik Siegel and Adam Retter, 978-1-449-33710-0.”
If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at email@example.com.
Many of the code examples provided in the book and example programs that are discussed in the book are publicly available from GitHub at https://github.com/eXist-book, where we currently provide two repositories:
This encompasses all of the code that accompanies the book (i.e., XQuery, XSL-FO, XSLT, XForms, XML, Java, and Python), except for the examples discussed in Chapter 3.
For convenience, build scripts are included so that the majority of examples can be compiled into an EXPath Package file (see “Packaging”) that can be easily deployed into eXist, and the Java projects can be compiled into JAR files for use with eXist or from the command line.
This is provided as a reference for the tutorials set out in Chapter 3. It is deliberately kept separate from the other code examples, as we felt that you would benefit more from following the tutorials and entering the code manually while considering each line of code that you are writing.
This repository is structured as an eXist backup. To restore the backup, see “Backup and Restore”.
To get a copy of the source code from either of our two GitHub repositories, you should ideally have Git installed. If you do not wish to install Git, you can also download a ZIP or compressed TAR file of the source code from the GitHub repositories. However, using Git is recommended, as it will allow you to easily update the source code in the future, should we make any corrections or additions.
Assuming that you have Git installed (if you are on a Windows platform, we will assume that you are using Git Shell), from your Unix/Linux/Mac terminal (or your Windows Git Shell), you can run the following to clone (make a copy of) our repositories:
$ git clone https://github.com/eXist-book/book-code
$ git clone https://github.com/eXist-book/using-exist-101
You now have a clone of each repository. In the future, should you wish to pull in any updates we have made, you can simply run:
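As a sketch, assuming both clones sit side by side in the current directory under their default names, updating them could look like the following. The helper name update_clones is our own invention for illustration; it simply runs git pull inside each folder that is actually a Git clone and skips the rest.

```shell
# Pull the latest changes into each named clone.
# Folders that are not Git clones are reported and skipped.
update_clones() {
  for repo in "$@"; do
    if [ -d "$repo/.git" ]; then
      git -C "$repo" pull
    else
      echo "skipping $repo: no Git clone found"
    fi
  done
}

# Assumes the default folder names produced by the clone commands above.
update_clones book-code using-exist-101
```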
Now let’s look at how you build and deploy the code from the book-code repository.
The book-code repository contains the following top-level folders:
This folder contains the build configuration that is inherited by each project.
This folder contains the build configuration that is inherited by each of the Java projects.
This folder contains subfolders for each chapter of the book where example code is provided.
This folder contains the build configuration for building an EXPath package.
We use the Apache Maven build tool for compiling all of the projects that accompany the book. Therefore, to make the most of the example code that goes along with the book, you will also need to download and install Maven. Maven, like eXist, requires Java; if you do not already have Java installed you can download either Java 6 or 7 from http://java.oracle.com. Each pom.xml file that you see in the code is a Maven project file that describes how to build the code and resolves any dependencies that are required.
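Before building, it may help to confirm that the tools mentioned so far are reachable from your terminal. The following is only a minimal sketch: the helper name check_tool is hypothetical, and it reports whether each command is on your PATH without checking which version is installed.

```shell
# Report whether a command is available on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

# Git, Java, and Maven are the tools this section relies on.
for tool in git java mvn; do
  check_tool "$tool"
done
```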
If you wish to build all of the code projects that accompany the book in one step, you can simply run the following commands from your terminal (or Git Shell on Windows):
If you wish to build just the EXPath package of the example XQuery, XSLT, XForms, and XML code that accompanies the book, you can simply enter the xml-examples-xar subfolder and run mvn package. To achieve this, we have used the excellent EXPath package Maven plug-in written by Claudius Teodorescu, which allows us to easily create a XAR file from a manifest (see the file xml-examples-xar/expath-pkg.assembly.xml) that describes the EXPath package.
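The build steps described above can be sketched as a small helper that runs mvn package in whichever folder it is given, after checking that the folder holds a pom.xml. The function name build_project is ours, for illustration only, and we are assuming that the repository was cloned into book-code in the current directory and that its top-level folder has a pom.xml that builds everything.

```shell
# Run "mvn package" in the given project folder.
# Returns 1 if the folder has no pom.xml or Maven is not installed.
build_project() {
  dir="$1"
  if [ ! -f "$dir/pom.xml" ]; then
    echo "no pom.xml found in $dir" >&2
    return 1
  fi
  if ! command -v mvn >/dev/null 2>&1; then
    echo "mvn not found; please install Apache Maven first" >&2
    return 1
  fi
  (cd "$dir" && mvn package)
}

# Example calls (paths assume the default clone folder name):
#   build_project book-code                    # attempt to build everything
#   build_project book-code/xml-examples-xar   # build just the EXPath package
```

Running Maven in a subshell keeps your terminal in its original folder once the build finishes.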
The result of the Maven build process is the file exist-book-1.0.xar in the target subfolder of xml-examples-xar. You can then deploy the package by either copying it to $EXIST_HOME/autodeploy, or using the dashboard app as follows:
Open up the eXist dashboard in your web browser, log in as admin, and click on the Package
Click on the upload application icon (in the top left of the screen; it looks like a stack of disks).
Browse to and select the exist-book-1.0.xar file and press the button.
After installation, the sample code is available as another tile in the dashboard. It runs as a simple application that allows you quick access to running the examples.
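If you prefer the first route, copying the package to $EXIST_HOME/autodeploy, it could be scripted roughly as follows. This is a sketch under assumptions: the helper name deploy_xar is hypothetical, EXIST_HOME must already point at your installation, and the autodeploy folder is expected to exist there.

```shell
# Copy a XAR package into the autodeploy folder of an eXist installation.
# Returns 1 if EXIST_HOME is unset, or the package or folder is missing.
deploy_xar() {
  xar="$1"
  if [ -z "${EXIST_HOME:-}" ]; then
    echo "EXIST_HOME is not set" >&2
    return 1
  fi
  if [ ! -f "$xar" ]; then
    echo "package not found: $xar" >&2
    return 1
  fi
  if [ ! -d "$EXIST_HOME/autodeploy" ]; then
    echo "no autodeploy folder in $EXIST_HOME" >&2
    return 1
  fi
  cp "$xar" "$EXIST_HOME/autodeploy/"
}

# Example call (path assumes the default clone folder name):
#   deploy_xar book-code/xml-examples-xar/target/exist-book-1.0.xar
```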
The Java examples that accompany the book will also be built if you build everything, and the resultant artifacts will be placed into the target subfolders of each project. Each Java project example is discussed in detail in the relevant chapter later in the book. You can also compile the Java projects individually by running mvn package in the folder of each Java project. For example, if you wanted to build just the REST Server client examples, you would run:
Each Java example is designed to both educate and potentially serve as a skeleton for your own Java projects. By simply changing the artifactId of the project’s pom.xml file and including any additional required dependencies, you have a very quick mechanism to start building your own projects.
A ZIP or fat JAR file assembly is also created for many of the Java project examples, and can be found in the appropriate target subfolder. A fat JAR file assembly is simply a JAR file that also contains all of the dependencies of the project, giving you a single file artifact. So, for example, when you are compiling the restserver-client examples, the following assemblies are created:
Technology professionals, software developers, web designers, and business and creative professionals use Safari Books Online as their primary resource for research, problem solving, learning, and certification training.
Members have access to thousands of books, training videos, and prepublication manuscripts in one fully searchable database from publishers like O’Reilly Media, Prentice Hall Professional, Addison-Wesley Professional, Microsoft Press, Sams, Que, Peachpit Press, Focal Press, Cisco Press, John Wiley & Sons, Syngress, Morgan Kaufmann, IBM Redbooks, Packt, Adobe Press, FT Press, Apress, Manning, New Riders, McGraw-Hill, Jones & Bartlett, Course Technology, and hundreds more. For more information about Safari Books Online, please visit us online.
Please address comments and questions concerning this book to the publisher:
We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at http://bit.ly/eXist.
To comment or ask technical questions about this book, send email to: firstname.lastname@example.org
For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.
Find us on Facebook: http://facebook.com/oreilly
Follow us on Twitter: http://twitter.com/oreillymedia
Watch us on YouTube: http://www.youtube.com/oreillymedia
The authors would like to especially thank Dan McCreary, Chris Wallace, and Dannes Wessels for reviewing their work.
In addition, they would like to offer their thanks to Ron Van den Branden, Martin Holmes, Casey Jordan, Kurt Cagle, Paul Kelly, Tobi Krebs, Boris Lehečka, Wolfgang Meier, Chris Misztur, Dave Pawson, Jens Østergaard Petersen, Phill Ramey, Dmitriy Shabanov, Luis Tavera, Claudius Teodorescu, Chris Tomlinson, Joern Turner, David Voňka, Priscilla Walmsley, Michael Westbay, Joe Wicentowski, and Lars Windauer for their support and feedback.