12 File Input and Output
Getting data into and out of files is one of the fundamental necessities of a computing
platform. There are many facilities for doing so in both R and MATLAB.
For reading text data from files, R has the functions read.table, scan, and readLines,
while MATLAB has load, fgetl, fscanf, textread, textscan, and importdata. Often,
more than one of these functions can be used to accomplish a given task. Which one is
best is partly personal preference, and may also depend on the type of variable you want to
store the data in. The following examples of some common tasks may help to suggest some
possible approaches. Note that these commands will all read and write files in the current
working directory (see Section 7.1), unless you specify a path to a file in another directory.
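As a rough illustration of how the three R functions named above differ in what they return, here is a minimal sketch. The file name and its contents are made up for the example, and a temporary file is used instead of the working directory so that nothing existing is disturbed.

```r
# Create a small example file in a temporary location (hypothetical data).
path <- tempfile(fileext = ".txt")
writeLines(c("1 2 3", "4 5 6"), path)

# read.table returns a data frame, with one row per line of the file.
df <- read.table(path)

# scan returns a single flat vector containing all of the values.
v <- scan(path, quiet = TRUE)

# readLines returns a character vector, with one element per line.
lines <- readLines(path)

# Remove the example file.
unlink(path)
```

Which result is most convenient depends on the kind of variable you want to end up with: a data frame, a numeric vector, or raw lines of text to be parsed later.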
12.1 Opening files
Both R and MATLAB have various functions which operate on files via what are called
file descriptors.¹ A file descriptor is a “handle” which refers to an open file, and generally
appears to the user as an integer. In MATLAB, as in the C programming language, the
file descriptors 0, 1, and 2 are generally reserved for standard input, standard output,
and standard error, the standard input/output channels for programs. In R, these three
connections are accessed via the function calls stdin(), stdout(), and stderr().
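A short sketch of the standard connections in R follows. The messages are arbitrary, and the use of cat() and as.integer() here is simply one way to show the idea, not the only one.

```r
# cat() writes to standard output by default; file = stdout() makes that explicit.
cat("a message on standard output\n", file = stdout())

# Diagnostic messages can be sent to standard error instead.
cat("a message on standard error\n", file = stderr())

# The connections are objects rather than bare integers, but converting
# them to integers shows the underlying numbering (0, 1, 2), which
# matches the C convention.
ids <- c(as.integer(stdin()), as.integer(stdout()), as.integer(stderr()))
```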
When you open a file and create a file descriptor, you can specify whether you want
to open the file for reading, writing, appending, or a combination of reading with
writing/appending. This is done by providing a permissions mode string. The string can be
“r,” “w,” or “a” for reading, writing, and appending, respectively. Using “r+,” “w+,” and
“a+” specifies reading and writing without overwrite, reading and writing with overwrite
(discarding the existing contents of the file), and reading and appending, respectively.
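The effect of the basic mode strings can be sketched in R as follows. The scratch file and its contents are hypothetical, and close() is used to release each connection before the file is reopened in a different mode.

```r
path <- tempfile(fileext = ".txt")   # hypothetical scratch file

# "w": open for writing; any existing contents are discarded.
fid <- file(path, "w")
writeLines("first line", fid)
close(fid)

# "a": open for appending; new output goes after the existing contents.
fid <- file(path, "a")
writeLines("second line", fid)
close(fid)

# "r": open for reading only.
fid <- file(path, "r")
contents <- readLines(fid)
close(fid)

# Reopening with "w" empties the file, even if nothing is written.
fid <- file(path, "w")
close(fid)
n_after <- length(readLines(path))

unlink(path)
```

After the append, contents holds both lines; after the final "w" open, the file has no lines left.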
In R, you can open a file for reading via the command fid = file('filename', 'r').
You can test for success by calling isOpen(fid), which returns TRUE if the file was
successfully opened. In MATLAB, you can use fid = fopen('filename', 'r'). It will
return a positive integer upon success, and -1 on failure. In MATLAB, the 'r' string is
optional, as that is the default.²
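The R call described above, and the behavior footnote 2 describes when the mode string is omitted, can be sketched like this. The scratch file and its three values are assumptions made for the illustration.

```r
path <- tempfile(fileext = ".txt")   # hypothetical scratch file
writeLines(c("10", "20", "30"), path)

# With an explicit mode, the connection is opened right away and stays
# open, so consecutive reads advance through the file.
fid <- file(path, "r")
opened <- isOpen(fid)
x1 <- scan(fid, n = 1, quiet = TRUE)
x2 <- scan(fid, n = 1, quiet = TRUE)
close(fid)

# Without a mode, the connection is created but not opened; each read
# opens it, reads from the start of the file, and closes it again.
fid2 <- file(path)
opened2 <- isOpen(fid2)
y1 <- scan(fid2, n = 1, quiet = TRUE)
y2 <- scan(fid2, n = 1, quiet = TRUE)
close(fid2)

unlink(path)
```

In the first case x1 and x2 hold the first two values; in the second, y1 and y2 both hold the first value.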
Also in both platforms, be aware that if you open a file
for writing (using either 'w' or 'w+'), the file will be overwritten, i.e., any existing data
in it will be erased without warning! When you are done accessing a file, you should
¹ R uses what are called connections rather than simple file descriptors, but for most practical uses, they
are equivalent.
² Technically the 'r' string is optional in R as well. But note that if you omit it, actually opening the file
will be deferred until you try to access the file (i.e., read from it). It will be closed again immediately after
reading, and the current position for reading will be reset. That is, fid=file('foo.txt'); isOpen(fid);
x1=scan(fid,n=1); x2=scan(fid,n=1) will show that fid is not open, and will then read the first value
from the file twice, storing it in x1 and x2. Including the 'r' in the call to file will cause it to read the first
two values on the consecutive calls to scan.