Name Mangling and Case

Back in the days of DOS and Windows 3.1, every filename was limited to eight uppercase characters, followed by a dot and three more uppercase characters. This was known as the 8.3 format, and was a huge nuisance. Windows 95/98, Windows NT, and Unix have since relaxed this restriction by allowing many more case-sensitive characters to make up a filename. Table 5-6 shows the current naming state of several popular operating systems.

Table 5-6. Operating System Filename Limitations

Operating System              File Naming Rules
DOS 6.22 or below             Eight characters followed by a dot followed by a three-letter extension (8.3 format); case insensitive
Windows for Workgroups 3.1    Eight characters followed by a dot followed by a three-letter extension (8.3 format); case insensitive
Windows 95/98                 127 characters; case sensitive
Windows NT                    127 characters; case sensitive
Unix                          255 characters; case sensitive

Samba still has to remain backwards compatible with network clients that store files only in the 8.3 format, such as Windows for Workgroups. If a user creates a file on a share called antidisestablishmentarianism.txt and the name were simply truncated to fit, a Windows for Workgroups client couldn’t tell it apart from another file in the same directory called antidiseast.txt, because both names begin with the same eight characters. Like Windows 95/98 and Windows NT, Samba has to employ a special method of translating a long filename to an 8.3 filename in such a way that similar filenames will not cause collisions. This is called name mangling, and Samba deals with this in a manner that is similar, but not identical, to Windows 95 and its successors.

The Samba Mangling Operation

Here is how Samba mangles a long filename into an 8.3 filename:

  • If the original filename does not begin with a dot, up to the first five alphanumeric characters that occur before the last dot (if there is one) are converted to uppercase. These characters are used as the first five characters of the 8.3 mangled filename.

  • If the original filename begins with a dot, the dot is removed and up to the first five alphanumeric characters that occur before the last dot (if there is one) are converted to uppercase. These characters are used as the first five characters of the 8.3 mangled filename.

  • These characters are immediately followed by a special mangling character: by default, a tilde (~), although Samba allows you to change this character.

  • The base of the long filename before the last dot is hashed into a two-character code; parts of the name after the last dot may be used if necessary. This two-character code is appended to the 8.3 filename after the mangling character.

  • The first three characters after the last dot (if there is one) of the original filename are converted to uppercase and appended onto the mangled name as the extension. If the original filename began with a dot, three underscores (___) are used as the extension instead.

Here are some examples:

virtuosity.dat                       VIRTU~F1.DAT
.htaccess                            HTACC~U0.___
hello.java                           HELLO~1F.JAV
team.config.txt                      TEAMC~04.TXT
antidisestablishmentarianism.txt     ANTID~E3.TXT
antidiseast.txt                      ANTID~9K.TXT

Using these rules will allow Windows for Workgroups to differentiate the two files on behalf of the poor individual who is forced to see the network through the eyes of that operating system. Note that the same long filename should always hash to the same mangled name with Samba; this doesn’t always happen with Windows. The downside of this approach is that there can still be collisions; however, the chances are greatly reduced.
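The steps above can be sketched in a few lines of Python. This is only an illustration of the rules as listed, not Samba's code: the text does not say which hash function Samba uses, so two_char_hash below is a stand-in of our own (a CRC-32 reduced to two base-36 characters), and the two-character codes it produces will not match the ones in the examples. The function names are likewise hypothetical.

```python
import string
import zlib

BASE36 = string.digits + string.ascii_uppercase


def two_char_hash(text):
    """Stand-in hash (an assumption, not Samba's): CRC-32 as two base-36 characters."""
    value = zlib.crc32(text.encode("utf-8")) % (36 * 36)
    return BASE36[value // 36] + BASE36[value % 36]


def mangle(long_name, mangling_char="~"):
    """Build an 8.3 name from a long filename, following the listed steps."""
    leading_dot = long_name.startswith(".")
    name = long_name[1:] if leading_dot else long_name

    # Split at the last dot, if there is one.
    base, dot, ext = name.rpartition(".")
    if not dot:
        base, ext = name, ""

    # Up to five alphanumeric characters from the base, uppercased.
    prefix = "".join(c for c in base if c.isalnum())[:5].upper()

    # Names that began with a dot get three underscores as the extension.
    extension = "___" if leading_dot else ext[:3].upper()

    mangled = prefix + mangling_char + two_char_hash(base)
    return mangled + "." + extension if extension else mangled


for example in ("virtuosity.dat", ".htaccess", "team.config.txt"):
    print(example, "->", mangle(example))
```

Because the code depends only on the name itself, the same long filename always produces the same mangled name, which is the property described above.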

You generally want to use the mangling configuration options with only the oldest clients. We recommend doing this without disrupting other clients by adding an include directive to the smb.conf file:

[global]
	include = /usr/local/samba/lib/smb.conf.%a

This resolves to smb.conf.WfWg when a Windows for Workgroups client attaches. Now you can create a file /usr/local/samba/lib/smb.conf.WfWg which might contain these options:

[global]
	case sensitive = no
	default case = upper
	preserve case = no
	short preserve case = no
	mangle case = yes
	mangled names = yes

If you are not using Windows for Workgroups 3.1, then you probably do not need to change any of these options from their defaults.

Representing and resolving filenames with Samba

Another item that we should point out is that there is a difference between how an operating system represents a file and how it resolves it. For example, if you’ve used Windows 95/98/NT, you have likely run across a file called README.TXT. The file can be represented by the operating system entirely in uppercase letters. However, if you open an MS-DOS prompt and enter the command edit readme.txt, the all-caps file is loaded into the editing program, even though you typed the name in lowercase letters!

This is because the Windows 95/98/NT family of operating systems resolves files in a case-insensitive manner, even though the files are represented in a case-sensitive manner. Unix-based operating systems, on the other hand, always resolve files in a case-sensitive manner; if you try to edit README.TXT with the command vi readme.txt, you will likely be editing the empty buffer of a new file.

Here is how Samba handles case: if preserve case is set to yes, Samba will always use the case provided by the client operating system for representing (not resolving) filenames. If it is set to no, it will use the case specified by the default case option. The same is true for short preserve case: if this option is set to yes, Samba will use the case provided by the client for representing 8.3 filenames; otherwise it will use the case specified by the default case option. Finally, Samba will always resolve filenames in its shares based on the value of the case sensitive option.
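To make the interaction between these options concrete, here is a minimal Python sketch of the representation rule only. It is our own model of the description above, not Samba's implementation: the stored_name function, its parameter names, and the rough regular expression used to recognize an 8.3 name are all assumptions made for illustration.

```python
import re

# Rough 8.3 test (an assumption): 1-8 characters, optionally a dot and 1-3 more.
SHORT_NAME = re.compile(r"[^.]{1,8}(\.[^.]{1,3})?")


def stored_name(client_name, preserve_case=True, short_preserve_case=True,
                default_case="lower"):
    """Return the name as it would be represented on the share."""
    is_short = SHORT_NAME.fullmatch(client_name) is not None

    # 8.3 names follow short preserve case; all other names follow preserve case.
    keep = short_preserve_case if is_short else preserve_case
    if keep:
        return client_name

    # Otherwise the default case option decides.
    return client_name.upper() if default_case == "upper" else client_name.lower()


print(stored_name("Quarterly Report.doc"))                       # kept as supplied
print(stored_name("Quarterly Report.doc", preserve_case=False))  # forced to lowercase
print(stored_name("README.TXT", short_preserve_case=False))      # 8.3 name, forced to lowercase
```

Resolution is a separate step, governed by the case sensitive option, and is not modeled here.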

Mangling Options

Samba allows you to give it more refined instructions on how it should perform name mangling, including those controlling the case sensitivity, the character inserted to form a mangled name, and the ability to manually map filenames from one format to another. These options are shown in Table 5-7.

Table 5-7. Name Mangling Options

Option                         Parameters                 Function                                                                           Default  Scope
case sensitive (casesignames)  boolean                    If yes, Samba will treat filenames as case-sensitive (Windows doesn’t).            no       Share
default case                   upper or lower             Case to assume as default (only used when preserve case is no).                    lower    Share
preserve case                  boolean                    If yes, keep the case the client supplied (i.e., do not convert to default case).  yes      Share
short preserve case            boolean                    If yes, preserve case of 8.3-format names that the client provides.                yes      Share
mangle case                    boolean                    Mangle a name if it is mixed case.                                                 no       Share
mangled names                  boolean                    Mangle long names into 8.3 DOS format.                                             yes      Share
mangling char                  string (single character)  Gives the mangling character.                                                      ~        Share
mangled stack                  numerical                  Number of mangled names to keep on the local mangling stack.                       50       Global
mangled map                    string (list of patterns)  Allows mapping of filenames from one format into another.                          None     Share

case sensitive

This share-level option, which has the obtuse synonym casesignames, specifies whether Samba should treat case as significant when resolving filenames in a specific share. The default value for this option is no, which is how Windows handles file resolution. If clients are using an operating system that takes advantage of case-sensitive filenames, you can set this configuration option to yes as shown here:

[accounting]
	case sensitive = yes

Otherwise, we recommend that you leave this option set to its default.
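The difference between the two settings can be illustrated with a small Python sketch. It assumes a simplified model in which resolution means looking a requested name up in a directory listing; the resolve function and its parameters are hypothetical and are not part of Samba.

```python
def resolve(requested, directory_listing, case_sensitive=False):
    """Return the directory entry that matches the requested name, or None."""
    # An exact match always wins.
    for entry in directory_listing:
        if entry == requested:
            return entry

    # With case sensitive = no, fall back to a comparison that ignores case.
    if not case_sensitive:
        wanted = requested.lower()
        for entry in directory_listing:
            if entry.lower() == wanted:
                return entry

    return None


listing = ["README.TXT", "notes.txt"]
print(resolve("readme.txt", listing))                       # finds README.TXT
print(resolve("readme.txt", listing, case_sensitive=True))  # no match
```

With the default setting, a request for readme.txt finds README.TXT, as a Windows client would expect; with case sensitive set to yes, the lookup behaves like the Unix vi example described earlier.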

default case

The default case option is used with preserve case. It specifies the case (upper or lower) that Samba will use when it creates a file on one of its shares on behalf of a client and preserve case is set to no. The default is lower; because preserve case defaults to yes, however, newly created files normally keep the mixed-case names given to them by the client. If you need to, you can override this option by specifying the following:

[global]
	default case = upper

If you specify this value and set preserve case to no, the names of newly created files will be translated into uppercase, and this cannot be overridden in a program. We recommend that you use the default value unless you are dealing with a Windows for Workgroups or other 8.3 client, in which case it should be upper.

preserve case

This option specifies whether a file created by Samba on behalf of the client is created with the case provided by the client operating system, or the case specified by the default case configuration option above. The default value is yes, which uses the case provided by the client operating system. If it is set to no, the value of the default case option is used.

Note that this option does not handle 8.3 file requests sent from the client; see the short preserve case option below. You may want to leave this option set to yes if applications that create files on the Samba server are sensitive to the case used when creating the file. If you want Samba to mimic the behavior of a Windows NT filesystem, for example, you can leave this option set to its default, yes.

short preserve case

This option specifies whether an 8.3 filename created by Samba on behalf of the client is created with the default case of the client operating system, or the case specified by the default case configuration option. The default value is yes, which uses the case provided by the client operating system. You can let Samba choose the case through the default case option by setting it as follows:

[global]
	short preserve case = no

If you want to force Samba to mimic the behavior of a Windows NT filesystem, you can leave this option set to its default, yes.

mangled names

This share-level option specifies whether Samba will mangle filenames for 8.3 clients in that share. If the option is set to no, Samba will not mangle the names and, depending on the client, they will either be invisible or appear truncated to those using 8.3 operating systems. The default value is yes. You can override it per share as follows:

[data]
	mangled names = no

mangle case

This option tells Samba whether it should mangle filenames that are not composed entirely of the case specified using the default case configuration option. The default for this option is no. If you set it to yes, you should be sure that all clients will be able to handle the mangled filenames that result. You can override it per share as follows:

[data]
	mangle case = yes

We recommend that you leave this option alone unless you have a well-justified need to change it.
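As a rough illustration of when a name becomes a candidate for mangling, consider the following Python sketch. It assumes that mangled names is set to yes and reuses a simple regular expression as a stand-in for a real 8.3 check; the needs_mangling function is hypothetical and only models the description given here.

```python
import re

# Rough 8.3 test (an assumption): 1-8 characters, optionally a dot and 1-3 more.
SHORT_NAME = re.compile(r"[^.]{1,8}(\.[^.]{1,3})?")


def needs_mangling(name, mangle_case=False, default_case="lower"):
    """Decide whether a name would be mangled (assuming mangled names = yes)."""
    # Anything that does not fit the 8.3 format has to be mangled.
    if SHORT_NAME.fullmatch(name) is None:
        return True

    # A name that fits 8.3 is left alone unless mangle case is enabled.
    if not mangle_case:
        return False

    # With mangle case = yes, a name is mangled when it is not written
    # entirely in the case given by the default case option.
    expected = name.upper() if default_case == "upper" else name.lower()
    return name != expected


print(needs_mangling("Report.txt"))                    # fits 8.3, mangle case off
print(needs_mangling("Report.txt", mangle_case=True))  # mixed case, mangle case on
```

Under this model, turning mangle case on causes even short names such as Report.txt to be mangled, which is why the option is best left alone.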

mangling char

This share-level option specifies the mangling character used when Samba mangles filenames into the 8.3 format. The default character used is a tilde (~). You can reset it to whatever character you wish, for instance:

[data]
	mangling char = #

mangled stack

Samba maintains a local stack of recently mangled 8.3 filenames; this stack can be used to reverse map mangled filenames back to their original state. This is often needed by applications that create and save a file, close it, and need to modify it later. The default number of long filename/mangled filename pairs stored on this stack is 50. However, if you want to cut down on the amount of processor time used to mangle filenames, you can increase the size of the stack to whatever you wish, at the expense of memory and slightly slower file access.

[global]
	mangled stack = 100
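The idea of a bounded store of long filename/mangled filename pairs can be sketched in Python as follows. This is a simplified model rather than Samba's data structure: the MangledStack class and its method names are invented for the example, and the choice to discard the oldest pair first when the limit is reached is an assumption.

```python
from collections import OrderedDict


class MangledStack:
    """Bounded store of mangled name -> long name pairs (illustrative only)."""

    def __init__(self, size=50):
        self.size = size
        self._pairs = OrderedDict()  # oldest pair first

    def remember(self, mangled, long_name):
        """Record a pair, discarding the oldest one if the store is full."""
        self._pairs[mangled] = long_name
        self._pairs.move_to_end(mangled)
        while len(self._pairs) > self.size:
            self._pairs.popitem(last=False)

    def reverse(self, mangled):
        """Map a mangled name back to its long name, or None if forgotten."""
        return self._pairs.get(mangled)


stack = MangledStack(size=2)
stack.remember("VIRTU~F1.DAT", "virtuosity.dat")
stack.remember("HELLO~1F.JAV", "hello.java")
stack.remember("TEAMC~04.TXT", "team.config.txt")
print(stack.reverse("TEAMC~04.TXT"))  # still remembered
print(stack.reverse("VIRTU~F1.DAT"))  # pushed out by newer entries
```

A larger size keeps more pairs available for reverse mapping, at the cost of the extra memory needed to hold them.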

mangled map

If the default behavior of name mangling is not sufficient, you can give Samba further instructions on how to behave using the mangled map option. This option allows you to specify mapping patterns that can be used before or even in place of name mangling performed by Samba. For example:

[data]
	mangled map = (*.database *.db) (*.class *.cls)

Here, Samba is instructed to check each filename it encounters against the first pattern in each pair of parentheses and, where it matches, to convert the name according to the second pattern for display on an 8.3 client. This is useful in the event that name mangling converts the filename incorrectly or to a format that the client cannot understand readily. Pattern pairs are separated by whitespace.
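The effect of such a map can be sketched in Python. The example below is a simplification: it parses the option value into pattern pairs and handles only patterns of the form *suffix, as in the example above; the function names are hypothetical and Samba's own pattern matching may be more general.

```python
import re


def parse_mangled_map(value):
    """Turn '(*.database *.db) (*.class *.cls)' into a list of (from, to) pairs."""
    return re.findall(r"\(\s*([^\s()]+)\s+([^\s()]+)\s*\)", value)


def apply_mangled_map(name, pairs):
    """Apply the first matching pair; only the simple '*suffix' form is handled."""
    for pattern, replacement in pairs:
        if pattern.startswith("*") and replacement.startswith("*"):
            suffix, new_suffix = pattern[1:], replacement[1:]
            if suffix and name.endswith(suffix):
                return name[: len(name) - len(suffix)] + new_suffix
    return name  # no pattern matched; the name is left for normal mangling


pairs = parse_mangled_map("(*.database *.db) (*.class *.cls)")
print(apply_mangled_map("inventory.database", pairs))
print(apply_mangled_map("Main.class", pairs))
```

A name that matches none of the patterns is returned unchanged, which corresponds to the case in which ordinary name mangling takes over.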
