The tempfile Standard Library

Producing temporary files is a common need in many applications. Whether you need to store something on disk to keep it out of memory until it is needed again, or you want to serve up a file but don’t need to keep it lurking around after your process has terminated, odds are you’ll run into this problem sooner or later.

It’s quite tempting to roll our own temp file support, which might look something like the following code:

File.open("/tmp/foo.txt", "w") do |file|
  file << some_data
end

# Then in some later code

File.foreach("/tmp/foo.txt") do |line|
  # do something with data
end

# Then finally
require "fileutils"
FileUtils.rm("/tmp/foo.txt")

This code works, but it has some drawbacks. First, it assumes that you’re on a *nix system with a /tmp directory. Second, it does nothing to avoid file collisions, so if another application is using /tmp/foo.txt, our code will overwrite that file. Finally, we need to explicitly remove the file, or risk leaving a bunch of trash around.

Luckily, Ruby’s standard library includes tempfile, which helps us get around these issues. Using it, our example then looks like this:

require "tempfile"
temp = Tempfile.new("foo.txt")
temp << some_data

# then in some later code
temp.rewind
temp.each do |line|
  # do something with data
end

# Then finally
temp.close

Let’s take a look at what’s going on in a little more detail, to really get a sense of what the tempfile library is doing for us.

Automatic Temporary Directory Handling

The code looks somewhat similar to our original example, as we’re still essentially working with an IO object. However, the approach is different. Tempfile opens up a file handle for us to a file that is stored in whatever your system’s tempdir is. We can inspect this value, and even change it if we need to. Here’s what it looks like on two of my systems:

>> Dir.tmpdir
=> "/var/folders/yH/yHvUeP-oFYamIyTmRPPoKE+++TI/-Tmp-"

>> Dir.tmpdir
=> "/tmp"

Usually, it’s best to go with whatever this value is, because it is where Ruby thinks your temp files should go. However, in the cases where we want to control this ourselves, it is simple to do so, as shown in the following:

temp = Tempfile.new("foo.txt", "path/to/my/tmpdir")
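To see the default and the explicit directory side by side, here is a minimal sketch. It uses Dir.mktmpdir to create a scratch directory that stands in for "path/to/my/tmpdir"; the helper name tempfile_directory and the "my-tmpdir" prefix are made up for illustration and are not part of the tempfile library.

```ruby
require "tempfile"
require "tmpdir"

# Hypothetical helper: create a temp file (optionally inside +dir+),
# report which directory it landed in, and remove the file again.
def tempfile_directory(dir = nil)
  temp = dir ? Tempfile.new("foo.txt", dir) : Tempfile.new("foo.txt")
  File.dirname(temp.path)
ensure
  temp.close! if temp
end

# With no directory given, the file lands in the system temp directory.
puts(tempfile_directory == Dir.tmpdir)

# With an explicit directory, the file lands there instead.
Dir.mktmpdir("my-tmpdir") do |dir|
  puts(tempfile_directory(dir) == dir)
end
```

Dir.mktmpdir removes the scratch directory when its block finishes, so the sketch leaves nothing behind.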

Collision Avoidance

When you create a temporary file with Tempfile.new, you aren’t actually specifying an exact filename. Instead, the filename you specify is used as a base name that gets a unique identifier appended to it. This prevents one temp file from accidentally overwriting another. Here’s a trivial example that shows what’s going on under the hood:

>> a = Tempfile.new("foo.txt")
=> #<File:/tmp/foo.txt.2021.0>
>> b = Tempfile.new("foo.txt")
=> #<File:/tmp/foo.txt.2021.1>
>> a.path
=> "/tmp/foo.txt.2021.0"
>> b.path
=> "/tmp/foo.txt.2021.1"

Allowing Ruby to handle collision avoidance is generally a good thing, especially if you don’t normally care about the exact names of your temp files. The exact format of the generated name varies between Ruby versions, so it’s best not to depend on it. Of course, we can always rename the file if we need to store it somewhere permanently.
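The following sketch checks both points under some assumptions: that several temp files sharing a base name get distinct paths, and that the contents can be kept somewhere permanent. The helper names temp_paths_for and keep_copy and the file name kept.txt are hypothetical. The sketch copies the file rather than renaming it, which is a swapped-in choice that leaves Tempfile’s own cleanup untouched.

```ruby
require "tempfile"
require "tmpdir"
require "fileutils"

# Hypothetical helper: create +count+ temp files that share one base name
# and return their paths, removing the files afterwards.
def temp_paths_for(basename, count)
  temps = Array.new(count) { Tempfile.new(basename) }
  temps.map(&:path)
ensure
  temps.each(&:close!) if temps
end

# Hypothetical helper: write +data+ to a temp file, then keep a copy at
# +destination+ before the temp file is removed.
def keep_copy(data, destination)
  temp = Tempfile.new("foo.txt")
  temp << data
  temp.flush                       # push buffered writes to disk first
  FileUtils.cp(temp.path, destination)
ensure
  temp.close! if temp
end

paths = temp_paths_for("foo.txt", 3)
puts(paths.uniq.size == paths.size)          # every path is distinct

Dir.mktmpdir do |dir|
  destination = File.join(dir, "kept.txt")   # hypothetical permanent location
  keep_copy("some data\n", destination)
  puts File.read(destination)
end
```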

Same Old I/O Operations

Because we’re dealing with an object that delegates most of its functionality directly to File, we can use normal File methods, as shown in our example. For this reason, we can write to our file handle as expected:

temp << some_data

and read from it in a similar fashion:

# then in some later code
temp.rewind
temp.each do |line|
  # do something with data
end

Because we leave the file handle open, we need to rewind it to point to the beginning of the file rather than the end. Beyond that, the behavior is exactly the same as File#each.
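Putting the write, rewind, and read steps together gives a small round trip. This is only a sketch: the helper name round_trip and the sample words are made up, and close! is used at the end so the example cleans up immediately.

```ruby
require "tempfile"

# Hypothetical helper: write each entry of +lines+ to a temp file,
# rewind, and read the lines back in order.
def round_trip(lines)
  temp = Tempfile.new("foo.txt")
  lines.each { |line| temp << "#{line}\n" }
  temp.rewind                      # move back to the start before reading
  temp.each_line.map(&:chomp)
ensure
  temp.close! if temp
end

p round_trip(%w[alpha beta gamma])
```

Leaving out the rewind call would start reading at the end of the file, so nothing would come back.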

Automatic Unlinking

Tempfile cleans up after itself. There are two main ways of unlinking a file; which one is correct depends on your needs. Often, simply closing the file handle is good enough, and that is what we use in our example:

temp.close

In this case, Ruby doesn’t remove the temporary file right away. Instead, it will keep it around until all references to temp have been garbage-collected. For this reason, if keeping lots of open file handles around is a problem for you, you can actually close your handles without fear of losing your temp file, as long as you keep a reference to it handy.
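A short sketch can confirm that a closed temp file is still on disk while we hold a reference to it. The helper name contents_after_close is hypothetical, and the sketch reads the file back through its path with File.read rather than through the closed handle.

```ruby
require "tempfile"

# Hypothetical helper: write +data+, close the handle, then check the disk.
# Returns [file_still_exists, contents_read_back].
def contents_after_close(data)
  temp = Tempfile.new("foo.txt")
  temp << data
  temp.close                          # closes the handle, keeps the file
  [File.exist?(temp.path), File.read(temp.path)]
ensure
  temp.close! if temp                 # now remove the file as well
end

p contents_after_close("still here\n")
```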

However, in other situations, you may want to purge the file as soon as it has been closed. The change to make this happen is trivial:

temp.close!

Finally, if you need to explicitly delete a file that has already been closed, you can just use the following:

temp.unlink
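The difference between these calls can be observed by checking the path on disk after each one. The following is a minimal sketch; both helper names are made up for illustration.

```ruby
require "tempfile"

# Hypothetical helper: returns true if close! removed the file right away.
def removed_by_close_bang?
  temp = Tempfile.new("foo.txt")
  path = temp.path                 # remember the path before closing
  temp.close!
  !File.exist?(path)
end

# Hypothetical helper: returns [exists_after_close, exists_after_unlink].
def close_then_unlink
  temp = Tempfile.new("foo.txt")
  path = temp.path
  temp.close
  after_close = File.exist?(path)
  temp.unlink                      # explicit removal of the closed file
  [after_close, File.exist?(path)]
end

p removed_by_close_bang?
p close_then_unlink
```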

In practice, you don’t need to think about this in most cases. Instead, tempfile works as you might expect, keeping your files around while you need them and cleaning up after itself when it needs to. If you forget to close a temporary file explicitly, it’ll be unlinked when the process exits. For these reasons, using the tempfile library is often a better choice than rolling your own solution.

There is more to be said about this very cool library, but what we’ve already discussed covers most of what you’ll need day to day, so now is a fine time to go over what’s been said and move on to the next thing.

We’ve gone over some of the tools Ruby provides for working with your filesystem in a platform-agnostic way, and we’re about to get into some more advanced strategies for managing, processing, and manipulating your files and their contents. However, before we do that, let’s review the key points about working with your filesystem and with temp files:

  • There are a whole slew of options for file management in Ruby, including FileUtils, Dir, and Pathname, with some overlap between them.

  • Pathname provides a high-level, modern Ruby interface to managing files and traversing your filesystem.

  • FileUtils provides a *nix-style API to file management tools, but works just fine on any system, making it quite useful for porting shell scripts to Ruby.

  • The tempfile standard library provides a convenient IO-like class for dealing with temp files in a system-independent way.

  • The tempfile library also makes things easier through name collision avoidance, automatic file unlinking, and other niceties.

With these things in mind, we’ll see more of the techniques shown in this section later on in the chapter. But if you’re bored with the basics, now is the time to look at higher-level strategies for doing common I/O tasks.
