7.23. Extract the Filename from a Windows Path
Problem
You have a string that holds a (syntactically) valid path to a file or folder on a Windows PC or network, and you want to extract the filename, if any, from the path. For example, you want to extract file.ext from c:\folder\file.ext.
Solution
[^\\/:*?"<>|\r\n]+$
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
Discussion
Extracting the filename from a string known to hold a valid path is trivial, even if you don’t know whether the path actually ends with a filename.
The filename always occurs at the end of the string. It can’t contain any colons or backslashes, so it cannot be confused with folders, drive letters, or network shares, which all use backslashes and/or colons.
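As a minimal sketch of how the solution might be applied, the Python snippet below runs the recipe's regex against a drive-letter path, a network share path, and a bare filename. The helper name extract_filename is hypothetical and is not part of the recipe.

```python
import re

# The recipe's regex. re.IGNORECASE mirrors the "Case insensitive" option;
# the character class holds no cased letters, so the flag has no effect here.
FILENAME = re.compile(r'[^\\/:*?"<>|\r\n]+$', re.IGNORECASE)

def extract_filename(path):
    """Hypothetical helper: return the trailing filename, or None if no match."""
    match = FILENAME.search(path)
    return match.group(0) if match else None

print(extract_filename(r"c:\folder\file.ext"))       # file.ext
print(extract_filename(r"\\server\share\file.ext"))  # file.ext
print(extract_filename(r"file.ext"))                 # file.ext
```

Because drive letters, folders, and share names are always followed by a colon or backslash, none of them can be part of the match at the end of the string.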
The anchor ‹$› matches at the end of the string (Recipe 2.5). The fact that the dollar also matches at embedded line breaks in Ruby doesn’t matter, because valid Windows paths don’t include line breaks. The negated character class ‹[^\\/:*?"<>|\r\n]+› (Recipe 2.3) matches the characters that can occur in filenames. Though the regex engine scans the string from left to right, the anchor at the end of the regex makes sure that only the last run of filename characters in the string will be matched, giving us our filename.
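The effect of the anchor can be illustrated with a short Python comparison, offered here as a sketch rather than as part of the recipe. Without the dollar, the character class matches every run of filename characters in the path; with it, only the final run survives. The variable names are chosen for illustration only.

```python
import re

path = r"c:\folder\file.ext"

# Without the anchor: every run of filename characters is a match.
all_runs = re.findall(r'[^\\/:*?"<>|\r\n]+', path)
print(all_runs)       # ['c', 'folder', 'file.ext']

# With the anchor: only the run that reaches the end of the string matches.
anchored_runs = re.findall(r'[^\\/:*?"<>|\r\n]+$', path)
print(anchored_runs)  # ['file.ext']
```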
If the string ends with a backslash, as it will for paths that don’t specify a filename, the regex won’t match at all. When it does match, it will match only the filename, so we don’t need to use any capturing ...