The subject of databases is a large and complex one, spanning many different concepts of structure, form, and expected use. There is also a multitude of different ways to access and manipulate the data stored within these databases.
This book describes and explains an interface called the Perl Database Interface, or DBI, which provides a unified interface for accessing data stored within many of these diverse database systems. The DBI allows you to write Perl code that accesses data without needing to worry about database- or platform-specific issues or proprietary interfaces.
We also take a look at non-DBI ways of storing, retrieving, and manipulating data with Perl, as there are occasions when the use of a database might be considered overkill but some form of structured data storage is required.
To begin, we shall discuss some of the more common uses of database systems in business today and the place that Perl and DBI take within these frameworks.
In today’s computing climate, databases are everywhere. In previous years, they tended to be used almost exclusively in the realm of mainframe-processing environments. Nowadays, with pizza-box sized machines more powerful than room-sized machines of ten years ago, high-performance database processing is available to anyone.
In addition to cheaper and more powerful computer hardware, smaller database packages have become available, such as Microsoft Access and mSQL. These packages give all computer users the ability to use powerful database technology in their everyday lives.
The corporate workplace has also seen a dramatic decentralization in database resources, with radical downsizing operations in some companies leading to their centralized mainframe database systems being replaced with a mixture of smaller databases distributed across workstations and PCs. The result is that developers and users are often responsible for the administration and maintenance of their own databases and datasets.
This trend towards mixing and matching database technology has some important downsides. Having replaced a centralized database with a cluster of workstations and multiple database types, companies are now faced with hiring skilled administration staff or training their existing administration staff for new skills. In addition, administrators now need to learn how to glue different databases together.
It is in this climate that a new order of software engineering has evolved, namely database-independent programming interfaces. If administration staff have had problems with downsized database technology, developers may have been hit even harder.
A centralized mainframe environment implies that database software is written in a standard language, perhaps COBOL or C, and runs only on one machine. However, a distributed environment may support multiple databases on different operating systems and processors, with each development team choosing their preferred development environment (such as Visual Basic, PowerBuilder, Oracle Pro*C, Informix E/SQL, C++ code with ODBC—the list is almost endless). Therefore, the task of coordinating and porting software has rapidly gone from being relatively straightforward to extremely difficult.
Database-independent programming interfaces help these poor, beleaguered developers by giving them a single, unified interface with which they can program. This shields developers from having to know which database type they are working with, and allows software written for one database type to be ported far more easily to another database. For example, software originally written for a mainframe database will often run with little modification on Oracle databases. Software written for Informix will generally work on Oracle with little modification. And software written for Microsoft Access will usually run with little modification on Sybase databases.
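To make the idea concrete, here is a minimal sketch of what such an interface looks like from the programmer's side, using the DBI. It assumes the DBI module and the DBD::SQLite driver are installed; the `staff` table and the `load_and_list` helper are hypothetical names invented for this illustration. The point to notice is that only the data source name passed to `connect` identifies the database product.

```perl
use strict;
use warnings;
use DBI;

# Hypothetical helper: everything after the connect() call is plain DBI
# and does not name a particular database product.
sub load_and_list {
    my ($dsn, $user, $password) = @_;

    # RaiseError makes DBI die on failure instead of returning undef.
    my $dbh = DBI->connect($dsn, $user, $password,
                           { RaiseError => 1, AutoCommit => 1 });

    # Hypothetical table used only for this illustration.
    $dbh->do("CREATE TABLE staff (id INTEGER, name VARCHAR(40))");

    # Placeholders (?) keep the SQL text the same whatever the values are.
    my $insert = $dbh->prepare("INSERT INTO staff (id, name) VALUES (?, ?)");
    $insert->execute(1, "Alice");
    $insert->execute(2, "Bob");

    # Fetch the first column of every row as an array reference.
    my $names = $dbh->selectcol_arrayref("SELECT name FROM staff ORDER BY id");

    $dbh->disconnect;
    return @$names;
}

# Assumption: an in-memory SQLite database stands in for a real server.
my @names = load_and_list("dbi:SQLite:dbname=:memory:", "", "");
print join(", ", @names), "\n";
```

In principle, pointing the same helper at another database means passing a different data source name and credentials. In practice, differences in SQL dialects can still require small changes, which is why the text says "little modification" rather than none.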
If you couple this database-independent programming interface with a programming language such as Perl, which is operating-system neutral, you are faced with the prospect of having a single code-base once again. This is just like in the old days, but with one major difference—you are now fully harnessing the power of the distributed database environment.
Database-independent programming interfaces help not only development staff. Administrators can also use them to write database-monitoring and administration software quickly and portably, increasing their own efficiency and the efficiency of the systems and databases they are responsible for monitoring. This should result in better-tuned systems with higher availability, freeing up the administration staff to proactively maintain the systems they are responsible for.
Another aspect of today’s corporate database lifestyle revolves around the idea of data warehousing, that is, creating and building vast repositories of archived information that can be scanned, or mined, for information separately from online databases. Powerful high-level languages with database-independent programming interfaces (such as Perl) are becoming more prominent in the construction and maintenance of data warehouses. This is due not only to their ability to transfer data from database to database seamlessly, but also to their ability to scan, order, convert, and process this information efficiently.
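As a rough illustration of moving data from one database to another, the following sketch copies rows between two DBI connections. It assumes the DBI module and the DBD::SQLite driver are installed, and it uses two separate in-memory SQLite databases as stand-ins for an online database and a warehouse. The `staff` table and the `copy_staff` helper are hypothetical names chosen for this example.

```perl
use strict;
use warnings;
use DBI;

# Hypothetical helper: read every row from the source handle and insert
# it through the target handle. Neither statement names a database product.
sub copy_staff {
    my ($source, $target) = @_;

    my $select = $source->prepare("SELECT id, name FROM staff ORDER BY id");
    my $insert = $target->prepare("INSERT INTO staff (id, name) VALUES (?, ?)");

    $select->execute;
    my $copied = 0;
    while (my @row = $select->fetchrow_array) {
        $insert->execute(@row);
        $copied++;
    }
    return $copied;
}

# Assumption: each connection to :memory: is its own independent database.
my %attr   = (RaiseError => 1, AutoCommit => 1);
my $source = DBI->connect("dbi:SQLite:dbname=:memory:", "", "", \%attr);
my $target = DBI->connect("dbi:SQLite:dbname=:memory:", "", "", \%attr);

# Create the same hypothetical table on both sides.
$_->do("CREATE TABLE staff (id INTEGER, name VARCHAR(40))") for $source, $target;
$source->do("INSERT INTO staff (id, name) VALUES (1, 'Alice')");
$source->do("INSERT INTO staff (id, name) VALUES (2, 'Bob')");

my $copied = copy_staff($source, $target);
print "Copied $copied rows\n";
```

A real warehouse load would usually batch the inserts inside a transaction and convert data types between the two systems; this sketch leaves both out to keep the flow of data visible.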
In summary, databases are becoming more and more prominent in the corporate landscape, and powerful interfaces are required to stop these resources from flying apart and becoming disparate fragments of localized data. This gluing process can be aided by the use of database-independent programming interfaces, such as the DBI, especially when used in conjunction with efficient high-level data-processing languages such as Perl.