Data Source Names

When connecting to a database via the DBI, you need to tell the DBI where to find the database to connect to. For example, the database driver might require a database name, or a physical machine name upon which the database resides. This information is termed a data source name, and of all the aspects of DBI, this is possibly the most difficult to standardize due to the sheer number and diversity of connection syntaxes.

The DBI requires the data source name to start with the characters dbi:, much like a URL begins with http:, and then the name of the driver, followed by another colon—for example, dbi:Oracle:. Any text that follows is passed to the driver’s own connect() method to interpret as it sees fit. Most drivers expect either a simple database name or, more often, a set of one or more name/value pairs separated with semicolons. Some common examples are listed later in this section.

For example, mSQL requires the hostname, database name, and potentially, the TCP/IP port number for connecting to the database server. However, Oracle may require only a single word that is an alias to a more complicated connection identifier that is stored in separate Oracle configuration files.

DBI offers two useful methods for querying which data sources are available to you for each driver you have installed on your system.

Firstly, you can get a list of all the available drivers installed on your machine by using the DBI->available_drivers() method. This returns a list with each element containing the data source prefix of an installed driver,[34] such as dbi:Informix:.

Secondly, you can invoke the DBI->data_sources() method against one or more of the drivers returned by the DBI->available_drivers() method to enumerate which data sources are known to the driver.[35] Calling the data_sources() method will actually load the specified driver and validate that it is completely and correctly installed. Because DBI dies if it can’t load and initialize a driver, this method should be called inside an eval{} block if you need to catch that error.

The following script lists all the drivers and data sources for each driver on your system:

#!/usr/bin/perl -w
#
# ch04/listdsns: Enumerates all data sources and all installed drivers
#
use DBI;

### Probe DBI for the installed drivers
my @drivers = DBI->available_drivers();

die "No drivers found!\n" unless @drivers; # should never happen

### Iterate through the drivers and list the data sources for each one
foreach my $driver ( @drivers ) {
    print "Driver: $driver\n";
    my @dataSources = DBI->data_sources( $driver );
    foreach my $dataSource ( @dataSources ) {
        print "\tData Source is $dataSource\n";
    }
    print "\n";
}

exit;

The output from this script on my machine looks like:

Driver: ADO

Driver: CSV
    Data source is DBI:CSV:f_dir=megaliths
    Data source is DBI:CSV:f_dir=pictish_stones

Driver: ExampleP
    Data Source is dbi:ExampleP:dir=.

Driver: File
    Data Source is DBI:File:f_dir=megaliths
    Data Source is DBI:File:f_dir=pictish_stones

Driver: ODBC

Driver: Proxy

Driver: XBase
    Data Source is dbi:XBase:.

which tells us that we have the standard drivers DBD::Proxy, DBD::ADO, DBD::File, and DBD::ExampleP installed, as well as DBD::ODBC, DBD::XBase, and DBD::CSV.

While this may be interesting in theory, in practice you rarely need to use these methods. Most applications are written to use one data source name, either hardcoded into the application or passed in as a parameter in some way.

When specifying a data source name for a database, the text following the driver prefix should be of the form that is appropriate for the particular database that you wish to connect to. This is very database-specific, but the following table shows some examples.[36]

Database

Example Connection Syntax

mSQL

dbi:mSQL:database:hostname:port_number

For example, to connect to a database called archaeo located on a machine called fowliswester.arcana.co.uk running on port number 1114, the following $data_source argument would be used:

dbi:mSQL:fowliswester.arcana.co.uk:archaeo:1114

Oracle

dbi:Oracle:connection_descriptor

Oracle has a slightly less cut-and-dried way of specifying connection identifiers due to the many different ways in which the Oracle database software can actually handle connections.

To break this nightmarish topic down into bite-sized chunks, Oracle may use two different types of connection. For local connections, Oracle uses a single item of information as the connection descriptor, either the name of the database or an alias to the database as specified in the Oracle configuration files. For a network-based connection, Oracle usually needs to know the alias of the connection descriptor as specified in the Oracle configuration files, or, if you are feeling suitably masochistic, you can specify the whole connection descriptor ... but, believe me, it isn’t pretty.

For example, a simple Oracle $data_source value might be:

dbi:Oracle:archaeo

CSV

dbi:CSV:f_dir=/datafiles

The DBD::CSV module treats a group of comma-separated value files in a common directory as a database. The data source for this driver can contain a parameter f_dir that specifies the directory in which the files are located.

In the case of the $data_source argument, an empty or undefined value will result in the environment variable DBI_DSN being checked for a valid value. If this environment variable is not defined, or does not contain a valid value, the DBI will call die().



[34] The actual definition of ``installed driver'' is a little loose. The DBI simply searches the directories in @INC looking for any DBD subdirectories that contain .pm files. Those are assumed to be drivers. It does not verify that the modules are completely and correctly installed. In practice, this process is fast and works well.

[35] Note that not necessarily every data source that is reachable via the driver is returned. Similarly, the inclusion of a data source does not imply that it is actually currently available for connection.

[36] An excellent example of an application that figures out data source names at runtime is dbish, discussed more fully in Chapter 8.

Get Programming the Perl DBI now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.