File Input and Output
Getting data into and out of files is one of the fundamental necessities of a computing
platform. There are many facilities for doing so in both R and MATLAB.
For reading text data from files, R has the functions read.table, scan, and readLines,
while MATLAB has load, fgetl, fscanf, textread, textscan, and importdata. Often,
more than one of these functions can be used to accomplish a given task. Which one is
best is partly personal preference, and may also depend on the type of variable you want to
store the data in. The following examples of some common tasks may help to suggest some
possible approaches. Note that these commands will all read and write files in the current
working directory (see Section 7.1), unless you specify a path to a file in another directory.
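As a rough illustration of how the three R functions named above differ in what they return, the following sketch reads the same small file three ways. The file contents are made up for this example, and it uses a temporary file (via tempfile) rather than the current working directory so that it can be run anywhere.

```r
# Create a small sample file (hypothetical data, written to a temporary path).
path <- tempfile(fileext = ".txt")
writeLines(c("1 2 3", "4 5 6"), path)

# read.table returns a data frame with one row per line of the file.
df <- read.table(path)

# scan returns all the values as a single numeric vector.
v <- scan(path, quiet = TRUE)

# readLines returns each line of the file as a character string.
lines <- readLines(path)

# Remove the temporary file.
unlink(path)
```

Which form is most convenient depends on whether you want a data frame, a vector, or raw text to parse yourself.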
12.1 Opening files
Both R and MATLAB have various functions which operate on files via what are called
file descriptors. A file descriptor is a “handle” which refers to an open file, and generally
appears to the user as an integer. In MATLAB, as in the C programming language, the
file descriptors 0, 1, and 2 are generally reserved for standard input, standard output,
and standard error, the standard input/output channels for programs. In R, these three
connections are accessed via the function calls stdin(), stdout(), and stderr().
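For example, in R you might direct output to one of the standard connections by passing it as the file argument of cat. The helper name report below is hypothetical, introduced only for this sketch.

```r
# Hypothetical helper: write a message to standard output, or to
# standard error when is_error is TRUE.
report <- function(msg, is_error = FALSE) {
  con <- if (is_error) stderr() else stdout()
  cat(msg, "\n", sep = "", file = con)
}

report("computation finished")               # goes to standard output
report("input looks suspicious", TRUE)       # goes to standard error
```

Sending diagnostics to stderr() keeps them separate from the regular output if that output is later redirected to a file.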
When you open a file and create a file descriptor, you can specify whether you want
to open the file for reading, writing, appending, or a combination of reading with
writing/appending. This is done by providing a permissions mode string. The string can be
“r”, “w”, or “a” for reading, writing, and appending, respectively. Using “r+”, “w+”, and
“a+” specifies reading and writing without overwrite, reading and writing with overwrite
(discarding the existing contents of the file), and reading and appending, respectively.
In R, you can open a file for reading via the command fid = file('filename', 'r').
You can test for success by calling isOpen(fid), which returns TRUE if the file was
successfully opened. In MATLAB, you can use fid = fopen('filename', 'r'). It will
return a positive integer upon success, and -1 on failure. In MATLAB, the 'r' string is
optional, as that is the default.
Also, on both platforms, be aware that if you open a file
for writing (using either 'w' or 'w+'), the file will be overwritten, i.e., any existing data
in it will be erased without warning! When you are done accessing a file, you should
close it.
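A minimal R sketch of the open, read, and close sequence might look like this. The file contents are invented for the example. One point the sketch relies on: when file is called with 'r' on a file that cannot be opened, R normally signals an error instead of returning an unopened connection, so the failure case is wrapped in tryCatch here.

```r
path <- tempfile(fileext = ".txt")
writeLines(c("alpha", "beta"), path)

# Open for reading and confirm that the connection is open.
fid <- file(path, "r")
was_open <- isOpen(fid)

# Successive reads continue from the current position in the file.
first_line  <- readLines(fid, n = 1)
second_line <- readLines(fid, n = 1)

# Close the connection when finished with it.
close(fid)
unlink(path)

# tempfile() returns a name that does not currently exist, so this open fails.
missing_path <- tempfile()
opened_missing <- tryCatch(
  suppressWarnings({
    bad <- file(missing_path, "r")
    close(bad)
    TRUE
  }),
  error = function(e) FALSE
)
```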
R uses what are called connections rather than simple file descriptors, but for most practical uses, they can be treated the same way.
Technically the 'r' string is optional in R as well. But note that if you omit it, actually opening the file
will be deferred until you try to access the file (i.e., read from it). It will be closed again immediately after
reading, and the current position for reading will be reset. That is, fid=file('foo.txt'); isOpen(fid);
x1=scan(fid,n=1); x2=scan(fid,n=1) will show that fid is not open, and will then read the first value
from the file twice, storing it in x1 and x2. Including the 'r' in the call to file will cause it to read the first
two values on the consecutive calls to scan.
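The behavior described in this note can be checked with a sketch like the one below. It substitutes a temporary file for foo.txt, and the three values in it are arbitrary.

```r
path <- tempfile(fileext = ".txt")
writeLines(c("10", "20", "30"), path)

# Without a mode string, the connection is created but not opened.
fid <- file(path)
open_before <- isOpen(fid)

# Each scan call opens the file, reads from the start, and closes it again,
# so both calls return the first value.
a1 <- scan(fid, n = 1, quiet = TRUE)
a2 <- scan(fid, n = 1, quiet = TRUE)
close(fid)

# With 'r', the connection stays open and the read position advances.
fid <- file(path, "r")
b1 <- scan(fid, n = 1, quiet = TRUE)
b2 <- scan(fid, n = 1, quiet = TRUE)
close(fid)

unlink(path)
```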