The Unix File Structure

The hierarchical file structure

Like the Microsoft Windows file structure, the Unix file structure is hierarchical, arranged like an inverted tree. The figure below shows a portion of the file structure for the CSCI Department's Unix machines. The top-level directory of the hierarchy is traditionally called root (written as a slash, /). The tree "grows" downward from the root, with paths connecting the root to each of the other files. At the end of each path is an ordinary file or a directory file. Ordinary files, frequently just called files, are at the ends of paths that cannot support other paths. In the figure below, they are represented as rectangular boxes. Directory files, usually referred to as directories, are the points from which other paths can branch off. In the figure below, they are represented as boxes with rounded corners. Some directories in the figure contain files, some contain subdirectories, and some are empty. When you refer to the tree, up is toward the root, and down is away from the root. Directories directly connected by a path are called parents (closer to the root) and children (farther from the root).

Figure: Unix File Structure

In a standard Unix system, each user starts with one directory, called the home directory. Your home directory has the same name as your logon id (which is your z-id if you are a student). For example, in the figure above the home directory of the hypothetical user z123456 is shown highlighted in yellow. From this single directory, you can make as many subdirectories as you like, and you can divide those subdirectories into further subdirectories. You can create files in your home directory or any of its subdirectories. The main limit on the number of files and subdirectories you can create is the amount of disk space that you have been allocated (which is relatively small).

Unlike Windows, Unix does not normally use drive specifiers like C: or E:. Instead, file systems on different disk drives can be mounted so that for most purposes they appear to be part of a single hierarchical file structure. In fact, with the right networking tools, these file systems may even be located on different computers! So even though hopper and turing are different machines with their own disk drives, in the figure above they appear to be subdirectories of a single common file structure.
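One way to see which mounted file system holds a given directory is the standard df utility. The following is only an illustrative sketch: the device name and mount point it reports depend entirely on how the machine has been set up, so no particular output is shown here.

```shell
# Report the file system that holds the root directory.
# -P asks for the portable (POSIX) output format.
root_fs=$(df -P /)
printf '%s\n' "$root_fs"
```

Giving df a different pathname, such as that of your home directory, reports the file system on which that directory resides.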

Filenames

Every file has a filename. You can create files with names up to 255 characters long. Although you can use almost any character in a filename, you will avoid problems if you choose characters from the following list:

uppercase letters (A-Z)
lowercase letters (a-z)
digits (0-9)
underscore (_)
period (.)
comma (,)
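The list above can be turned into a quick check. The sketch below defines is_safe_name, a hypothetical helper function (it is not a standard Unix command), and assumes an ordinary ASCII locale. The sample names in the loop are also made up for illustration.

```shell
# is_safe_name is a hypothetical helper, not a standard Unix command.
# It succeeds only if the name is non-empty, at most 255 characters long,
# and built entirely from the recommended characters listed above.
is_safe_name() {
    case "$1" in
        '' | *[!A-Za-z0-9_.,]*) return 1 ;;
    esac
    [ "${#1}" -le 255 ]
}

for name in report_2020.txt "my notes" p1.cpp; do
    if is_safe_name "$name"; then
        echo "safe:  $name"
    else
        echo "risky: $name"
    fi
done
```

The name containing a space is reported as risky. Unix allows such a name, but it has to be quoted every time it is typed at the shell prompt.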

No two files in the same directory can have the same name. Files in different directories can have the same name.

The Unix operating system is case-sensitive, so files named JANUARY, January, and january would represent three distinct files.
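A small experiment can confirm this. The sketch below works in a temporary directory created with mktemp, which is assumed to be available (it is on most current systems). On a file system that is not case-sensitive, the count would be lower.

```shell
# Work in a throwaway directory so no real files are touched.
sandbox=$(mktemp -d)

# Create three files whose names differ only in letter case.
touch "$sandbox/JANUARY" "$sandbox/January" "$sandbox/january"

# Count the directory entries: 3 on a case-sensitive file system.
count=$(ls "$sandbox" | wc -l | tr -d ' ')
echo "$count"

rm -r "$sandbox"
```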

Filename extensions

Filename extensions can be used to help describe the contents of a file. For example, the extensions .cc or .cpp are used for C++ source code files, the extension .o is used for an object code file, and so forth.

Unlike Microsoft Windows, in most cases Unix file extensions are optional. Text files don't need to end with the extension .txt, executable files don't need to have the extension .exe, etc. Unix also doesn't make any real distinction between the names of ordinary files and the names of directory files.

Hidden filenames

A filename that begins with a period is called a hidden filename because the ls command does not normally display it. The command ls -a displays all filenames, even hidden ones. Many Unix programs use hidden filenames for their configuration files.
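The difference between ls and ls -a can be seen with a short experiment. This is a minimal sketch; the filenames notes and .settings are made up for illustration, and mktemp is assumed to be available.

```shell
# Work in a throwaway directory so no real files are touched.
sandbox=$(mktemp -d)
touch "$sandbox/notes" "$sandbox/.settings"   # .settings is a made-up name

visible=$(ls "$sandbox")         # lists notes only
everything=$(ls -a "$sandbox")   # also lists ., .., and .settings

printf 'ls:\n%s\n\nls -a:\n%s\n' "$visible" "$everything"

rm -r "$sandbox"
```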

Absolute Pathnames

Every file has a pathname. The following figure shows the pathnames of directories and ordinary files in part of a Unix file structure.

Figure: Absolute Pathnames

You can build the absolute pathname of a file by tracing a path from the root directory, through all the intermediate directories, to the file. String all the filenames in the path together, separating them with slashes (/) and preceding them with the name of the root directory (/).

This path of filenames is called an absolute pathname because it locates a file absolutely, tracing a path from the root directory to the file. No two files in the Unix file structure may have the same absolute pathname. The part of a pathname following the final slash we will refer to as the simple filename, or just a filename.
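The standard basename and dirname utilities split an absolute pathname at its final slash, which makes the idea of a simple filename concrete. The pathname in this sketch is an example only. Both utilities operate on the text of the pathname, so the file does not need to exist.

```shell
path=/home/turing/z123456/CS241/p1.cpp

simple_name=$(basename "$path")   # p1.cpp
parent_dir=$(dirname "$path")     # /home/turing/z123456/CS241

# Joining the two parts with a slash gives back the original pathname.
echo "$parent_dir/$simple_name"
```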

The tilde abbreviation

Absolute pathnames can be quite long, so the version of Unix used in our department provides a means of abbreviating them. The tilde character ~ can be used as an abbreviation for the absolute pathname of your home directory. The expression ~logonid can be used as an abbreviation for another user's home directory, where logonid is that user's logon id.

For example, let's say that you are logged in as user z123456. Here are some possible uses of the tilde abbreviation:

Typing the pathname...	...is equivalent to typing the absolute pathname
`~`	`/home/turing/z123456` (i.e., your home directory)
`~/CS241/p1.cpp`	`/home/turing/z123456/CS241/p1.cpp`
`~t90kjm1`	`/home/turing/t90kjm1`
`~t90kjm1/CS241/Data/Spring2020/employees`	`/home/turing/t90kjm1/CS241/Data/Spring2020/employees`
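You can watch the shell perform this substitution by echoing a pathname that contains a tilde. This is a small sketch; the output depends on who is logged in, and CS241/p1.cpp is just the example path from the table above (echo does not require the file to exist).

```shell
home_dir=~                    # absolute pathname of your home directory
source_file=~/CS241/p1.cpp    # the same pathname followed by /CS241/p1.cpp

echo "$home_dir"
echo "$source_file"
```

The shell replaces the tilde before the command runs, so the command itself only ever sees the full absolute pathname.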

Directories

This section explains the concepts of the working and home directories and their importance in relative pathnames.

The working directory

While you are logged in on a Unix system, you will always be associated with one directory or another. The directory you are associated with, or are working in, is called the working directory, or the current directory. The pwd utility displays the absolute pathname of the working directory. An abbreviated form of this pathname is also displayed as part of the Unix shell prompt on hopper and turing.

Your home directory

When you first log in on a Unix system, the working directory is your home directory. You own this directory and any subdirectories or files created within or below it.

Directory commands

Unix has a number of commands related to manipulating directories. Some of them are described later in this tutorial and are summarized here.

Command	Use
pwd	Print the absolute pathname of the working directory
cd	Change to a different working directory
mkdir	Make a new directory
rmdir	Remove (delete) an empty directory
ls	List the contents of a directory
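The commands in the table can be tried together in a short session. The sketch below runs in a temporary directory (mktemp is assumed to be available) so that nothing in your home directory is affected. The directory name projects is made up for illustration.

```shell
# Work in a throwaway directory so no real files are touched.
sandbox=$(mktemp -d)
cd "$sandbox"

mkdir projects        # make a new directory
cd projects           # make it the working directory
where=$(pwd)          # absolute pathname, ending in /projects
echo "$where"

cd ..                 # move back up to the parent directory
listing=$(ls)         # the parent now contains projects
echo "$listing"

rmdir projects        # works because projects is empty
remaining=$(ls)       # nothing is left to list

cd /
rmdir "$sandbox"
```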

The . and .. directory entries

Whenever a new directory is created using the mkdir utility, two entries are automatically placed in it. They are named . (a single period) and .. (a double period), and they represent the directory itself and its parent directory, respectively. These entries are hidden because their filenames begin with periods.

Because every directory automatically contains these entries, you can rely on their presence. The . entry is synonymous with the pathname of the working directory and can be used in its place; .. is synonymous with the pathname of the parent of the working directory.
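Both entries can be observed directly. In this sketch, a new directory (given the made-up name child) is created inside a temporary directory, listed with ls -a, and then used to show the effect of cd . and cd .. on the working directory.

```shell
# Work in a throwaway directory so no real files are touched.
sandbox=$(mktemp -d)
mkdir "$sandbox/child"

entries=$(ls -a "$sandbox/child")   # a new, empty directory lists only . and ..
echo "$entries"

cd "$sandbox/child"
before=$(pwd)
cd .                  # . is the working directory, so nothing changes
same=$(pwd)
cd ..                 # .. is the parent, so this moves up one level
parent=$(pwd)

echo "$before"
echo "$same"
echo "$parent"

cd /
rm -r "$sandbox"
```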

Relative pathnames

A relative pathname traces a path from the working directory to a file. The pathname is relative to the working directory. Any pathname that does not begin with the root directory (/) or the tilde abbreviation is a relative pathname. Like absolute pathnames, relative pathnames can describe a path through many directories.

The following figure shows the pathnames of various files and directories relative to the working directory /home/turing/z123456/docs (shown highlighted in yellow).

Figure: Relative Pathnames

Note that for files and subdirectories located in the working directory, the relative pathname of a file is its simple filename. Alternatively, you can precede the simple filename with ./ (so the relative pathname letter could also be typed as ./letter). Most users don't bother doing so.

Virtually anywhere that a Unix utility program requires a filename or pathname, you can use either an absolute or relative pathname. In practice, experienced users tend to use whichever involves less typing.
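To illustrate, the sketch below reads one file through four different pathnames. It builds a small docs directory containing a file named letter inside a temporary directory. The file's contents are made up, and mktemp is assumed to be available.

```shell
# Work in a throwaway directory so no real files are touched.
sandbox=$(mktemp -d)
mkdir "$sandbox/docs"
echo "Dear reader" > "$sandbox/docs/letter"

cd "$sandbox/docs"    # make docs the working directory

by_name=$(cat letter)                        # simple filename
by_dot=$(cat ./letter)                       # explicit ./ prefix
by_parent=$(cat ../docs/letter)              # up to the parent, then back down
by_absolute=$(cat "$sandbox/docs/letter")    # absolute pathname

echo "$by_name"

cd /
rm -r "$sandbox"
```

All four variables end up holding the same text, because each pathname leads to the same file.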