Producing temporary files is a common need in many applications. Whether you need to store something on disk to keep it out of memory until it is needed again, or you want to serve up a file but don’t need to keep it lurking around after your process has terminated, odds are you’ll run into this problem sooner or later.
It’s quite tempting to roll our own Tempfile support, which might look something like the following code:
File.open("/tmp/foo.txt", "w") do |file|
  file << some_data
end

# Then in some later code
File.foreach("/tmp/foo.txt") do |line|
  # do something with data
end

# Then finally
require "fileutils"
FileUtils.rm("/tmp/foo.txt")
This code works, but it has some drawbacks. The first is that it assumes that you’re on a *nix system with a /tmp directory. Secondly, we don’t do anything to avoid file collisions, so if another application is using /tmp/foo.txt, this will overwrite it. Finally, we need to explicitly remove the file, or risk leaving a bunch of trash around.
Luckily, Ruby ships with a standard library, tempfile, that helps us get around these issues. Using it, our example then looks like this:
require "tempfile"

temp = Tempfile.new("foo.txt")
temp << some_data

# then in some later code
temp.rewind
temp.each do |line|
  # do something with data
end

# Then finally
temp.close
Let’s take a look at what’s going on in a little more detail, to really get a sense of what the tempfile library is doing for us.
The code looks somewhat similar to our original example, as we’re still essentially working with an IO object. However, the approach is different. Tempfile opens up a file handle for us to a file that is stored in whatever your system’s tempdir is. We can inspect this value, and even change it if we need to. Here’s what it looks like on two of my systems:
>> Dir.tmpdir
=> "/var/folders/yH/yHvUeP-oFYamIyTmRPPoKE+++TI/-Tmp-"

>> Dir.tmpdir
=> "/tmp"
Usually, it’s best to go with whatever this value is, because it is where Ruby thinks your temp files should go. However, in the cases where we want to control this ourselves, it is simple to do so, as shown in the following:
temp = Tempfile.new("foo.txt", "path/to/my/tmpdir")
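As a minimal sketch of the same idea, the following creates a throwaway directory and asks Tempfile to place its file there. The "my_tmpdir" prefix and the variable names are made up for illustration, and Dir.mktmpdir is used only so that the example cleans up after itself.

```ruby
require "tempfile"
require "tmpdir"

default_dir     = Dir.tmpdir
custom_dir_path = nil

# Dir.mktmpdir creates a fresh directory and removes it when the block ends.
Dir.mktmpdir("my_tmpdir") do |dir|
  temp = Tempfile.new("foo.txt", dir)

  # The temp file lives in the directory we asked for, not the default one.
  custom_dir_path = File.dirname(temp.path)
  puts "System tempdir: #{default_dir}"
  puts "This temp file: #{temp.path}"

  temp.close! # remove the file before its directory is cleaned up
end
```

Pointing Tempfile at a directory you own can be handy when the system tempdir is on a small or slow volume, but the default is usually the safer choice.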
When you create a temporary file with Tempfile.new, you aren’t actually specifying an exact filename. Instead, the filename you specify is used as a base name that gets a unique identifier appended to it. This prevents one temp file from accidentally overwriting another. Here’s a trivial example that shows what’s going on under the hood:
>> a = Tempfile.new("foo.txt")
=> #<File:/tmp/foo.txt.2021.0>
>> b = Tempfile.new("foo.txt")
=> #<File:/tmp/foo.txt.2021.1>
>> a.path
=> "/tmp/foo.txt.2021.0"
>> b.path
=> "/tmp/foo.txt.2021.1"
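The same behavior can be checked in a script. This is only a sketch: the exact form of the unique identifier differs between Ruby versions, so the paths printed will not match the listing above character for character. The array form of the base name, which keeps the extension at the end of the generated name, is part of Tempfile but is not used elsewhere in this section.

```ruby
require "tempfile"

a = Tempfile.new("foo.txt")
b = Tempfile.new("foo.txt")

# Remember the paths now; they are no longer available after close!.
path_a = a.path
path_b = b.path

# Same base name, but two different files on disk.
puts path_a
puts path_b
puts path_a == path_b # false

# Passing [prefix, suffix] keeps the extension at the end of the name.
c = Tempfile.new(["foo", ".txt"])
path_c = c.path
puts File.extname(path_c) # ".txt"

[a, b, c].each(&:close!)
```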
Allowing Ruby to handle collision avoidance is generally a good thing, especially if you don’t normally care about the exact names of your temp files. Of course, we can always rename the file if we need to store it somewhere permanently.
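One way to keep a temp file permanently might look like the sketch below. The destination directory is a stand-in created with Dir.mktmpdir so the example leaves nothing behind, and "report.txt" is a made-up name. In real code the destination would be wherever you want the file to live.

```ruby
require "tempfile"
require "tmpdir"
require "fileutils"

saved_contents = nil

Dir.mktmpdir do |keep_dir| # stand-in for a permanent location
  temp = Tempfile.new("report.txt")
  temp << "line one\nline two\n"
  temp.close # flush and close the handle, but keep the file on disk

  # Move the file out of the tempdir under the exact name we want.
  destination = File.join(keep_dir, "report.txt")
  FileUtils.mv(temp.path, destination)

  saved_contents = File.read(destination)
  puts saved_contents
end
```

Closing the handle before moving the file makes sure all buffered data has been written and avoids trouble on platforms that refuse to move open files.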
Because we’re dealing with an object that delegates most of its functionality directly to File, we can use normal File methods, as shown in our example. For this reason, we can write to our file handle as expected:
temp << some_data
and read from it in a similar fashion:
# then in some later code
temp.rewind
temp.each do |line|
  # do something with data
end
Because we leave the file handle open, we need to rewind it to point to the beginning of the file rather than the end. Beyond that, the behavior is exactly the same as File#each.
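To see why the rewind matters, here is a small sketch that reads from the handle both before and after rewinding. The sample data and variable names are invented for the example.

```ruby
require "tempfile"

temp = Tempfile.new("foo.txt")
temp << "first\nsecond\n"

# The position is still at the end of what we wrote, so nothing is read.
lines_before_rewind = []
temp.each { |line| lines_before_rewind << line }

# After rewinding, iteration starts from the beginning of the file.
temp.rewind
lines_after_rewind = []
temp.each { |line| lines_after_rewind << line }

p lines_before_rewind # []
p lines_after_rewind  # ["first\n", "second\n"]

temp.close!
```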
Tempfile cleans up after itself. There are two main ways of unlinking a file; which one is correct depends on your needs. Simply closing the file handle is good enough, and it is what we use in our example:
temp.close
In this case, Ruby doesn’t remove the temporary file right away. Instead, it will keep it around until all references to temp have been garbage-collected. For this reason, if keeping lots of open file handles around is a problem for you, you can actually close your handles without fear of losing your temp file, as long as you keep a reference to it handy.
However, in other situations, you may want to purge the file as soon as it has been closed. The change to make this happen is trivial:
temp.close!
Finally, if you need to explicitly delete a file that has already been closed, you can just use the following:
temp.unlink
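The difference between these three calls can be observed by checking whether the file still exists on disk after each one. This is a sketch with invented base names ("kept" and "purged"), not a recommended cleanup routine.

```ruby
require "tempfile"

# close: the handle is closed, but the file stays on disk for now.
kept = Tempfile.new("kept")
kept_path = kept.path
kept.close
exists_after_close = File.exist?(kept_path)

# unlink: explicitly delete a file that has already been closed.
kept.unlink
exists_after_unlink = File.exist?(kept_path)

# close!: close the handle and delete the file in one step.
purged = Tempfile.new("purged")
purged_path = purged.path
purged.close!
exists_after_close_bang = File.exist?(purged_path)

puts exists_after_close      # true
puts exists_after_unlink     # false
puts exists_after_close_bang # false
```

Note that the paths are captured before the files are removed, because the path is no longer available from the Tempfile object once the file has been unlinked.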
In practice, you don’t need to think about this in most cases. Instead, tempfile works as you might expect, keeping your files around while you need them and cleaning up after itself when it needs to. If you forget to close a temporary file explicitly, it’ll be unlinked when the process exits. For these reasons, using the tempfile library is often a better choice than rolling your own solution.
There is more to be said about this very cool library, but what we’ve already discussed covers most of what you’ll need day to day, so now is a fine time to go over what’s been said and move on to the next thing.
We’ve gone over some of the tools Ruby provides for working with your filesystem in a platform-agnostic way, and we’re about to get into some more advanced strategies for managing, processing, and manipulating your files and their contents. However, before we do that, let’s review the key points about working with your filesystem and with temp files:
There are a whole slew of options for file management in Ruby, including FileUtils, Dir, and Pathname, with some overlap between them.
Pathname provides a high-level, modern Ruby interface to managing files and traversing your filesystem.
FileUtils provides a *nix-style API to file management tools, but works just fine on any system, making it quite useful for porting shell scripts to Ruby.
The tempfile standard library provides a convenient IO-like class for dealing with temp files in a system-independent way.
The tempfile library also helps make things easier through things like name collision avoidance, automatic file unlinking, and other niceties.
With these things in mind, we’ll see more of the techniques shown in this section later on in the chapter. But if you’re bored with the basics, now is the time to look at higher-level strategies for doing common I/O tasks.