Symbolic Links vs Hard Links

2019-10-06

On Linux and UNIX systems, there are two types of links: hard links and symbolic links. Both are created with the ln command and behave similarly in everyday use, but they work differently underneath.

First, you need to understand a little about how filesystems work. A filesystem stores metadata such as permissions, modification times, and the location of the file’s data on the underlying storage device in a structure called an inode. A directory is a special type of file that associates names with inodes. When a command such as ls -l displays permissions, it first reads the directory to get a list of names and inode numbers, and then looks up the relevant data in each inode.
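To see the inode behind a name, something like the following can be used. This is only a sketch: the scratch directory and the file name demo.txt are made up for illustration, and the field positions assume the usual ls -l column layout.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'hello\n' > demo.txt

# ls -li prints: inode, permissions, link count, owner, group, size, date, name.
ls -li demo.txt

# Pull individual fields out of that output.
inode=$(ls -i demo.txt | awk '{print $1}')   # inode number the name maps to
links=$(ls -l demo.txt | awk '{print $2}')   # how many names share that inode
size=$(wc -c < demo.txt | tr -d ' ')         # size in bytes
echo "inode=$inode links=$links size=$size"
```

A freshly created file has exactly one name, so its link count is 1.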

Hard Links

A hard link is another name for the same file. When you create a hard link, you are creating a directory entry that associates a name with an inode that already has an association with another name elsewhere. In other words, you are simply giving an existing file an additional name.
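The "additional name" idea can be checked with a short experiment. The sketch below uses made-up file names (original.txt and alias.txt) in a scratch directory.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'first\n' > original.txt
ln original.txt alias.txt          # second name for the same inode

# Both names report the same inode number and a link count of 2.
ls -li original.txt alias.txt

# A write through one name is visible through the other.
printf 'second\n' >> alias.txt
cat original.txt

# Removing one name only drops a directory entry; the data stays
# reachable until the last name is gone.
rm original.txt
line_count=$(wc -l < alias.txt | tr -d ' ')
links_after=$(ls -l alias.txt | awk '{print $2}')
echo "alias.txt still has $line_count lines and $links_after link"
```

Neither name is "the original" as far as the filesystem is concerned. Both are equal directory entries for the same inode.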

Since a hard link is just a directory entry pointing to an existing inode, you cannot create a hard link to a file on another filesystem. A directory entry records only an inode number, and each filesystem numbers its inodes independently, so the same number can refer to different files on different filesystems. Perhaps an example can clarify this.

Filename       Filesystem   inode
/boot/kernel   /boot        500
/etc/hosts     /            500

Notice how both files have the same inode number. If you were to create a new hard link in /boot to /etc/hosts, the new directory entry would refer to inode 500, which on the /boot filesystem already belongs to /boot/kernel. The link would point to the wrong file.
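A script can check whether two paths share a filesystem before attempting a hard link. The following is a rough sketch: mount_point and same_filesystem are hypothetical helper names, and parsing df -P output this way assumes mount points without spaces.

```shell
# mount_point PATH: print the mount point of the filesystem holding PATH.
# df -P selects the portable output format; the mount point is column 6.
mount_point() {
    df -P "$1" | awk 'NR == 2 { print $6 }'
}

# same_filesystem A B: succeed if A and B are on the same mounted filesystem.
same_filesystem() {
    [ "$(mount_point "$1")" = "$(mount_point "$2")" ]
}

workdir=$(mktemp -d)
printf 'data\n' > "$workdir/source.txt"

if same_filesystem "$workdir/source.txt" "$workdir"; then
    ln "$workdir/source.txt" "$workdir/hard.txt"
    echo "hard link created"
else
    echo "different filesystems: a hard link would fail, use ln -s instead"
fi
```

Here both paths are in the same directory, so the hard link succeeds. With paths such as /boot/kernel and /etc from the table above, the check would fail and a symbolic link would be the fallback.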

Symbolic Links

You will sometimes hear and see symbolic links referred to as symlinks or soft links. All three terms mean the same thing.

A symbolic link is a special file with its own inode that stores the path to another file as its data. Unlike a hard link, a symlink can reference a file on another filesystem.
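Because a symlink only stores a path, it keeps existing even when the file at that path goes away. A small sketch, using the made-up names target.txt and pointer:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'payload\n' > target.txt
ln -s target.txt pointer

# readlink prints the path stored in the link, without following it.
stored_path=$(readlink pointer)
echo "pointer stores: $stored_path"

# Reading through the link follows the stored path to the target's data.
cat pointer

# Remove the target: the link still exists, but now it dangles.
rm target.txt
if [ -L pointer ] && [ ! -e pointer ]; then
    state="dangling"    # -L tests the link itself, -e follows it
else
    state="ok"
fi
echo "pointer is $state"
```

A hard link cannot dangle in this way, since the data stays allocated as long as any name refers to the inode.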

Programs differ in how they handle symbolic links. Most treat the link as the file it points to. Some act on the link itself and not the file it points to. Some have options that let the user choose. For example, the tar program, which is commonly used to back up or archive files, has an option to dereference symlinks. By default, tar adds the link itself to an archive instead of the file it points to. Using the dereference option causes it to archive the file the link points to instead.
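cp is another program with such an option. The sketch below assumes GNU cp, where -P (--no-dereference) applies to a plain, non-recursive copy. Other implementations may only honor -P together with -R. The names followed, kept, and describe are made up for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'x\n' > a
ln -s a a_soft

cp a_soft followed      # default: copies the file the link points to
cp -P a_soft kept       # -P: copies the link itself (GNU cp assumed)

# describe NAME: report whether NAME is a symlink or something else.
describe() {
    if [ -L "$1" ]; then echo "symlink"; else echo "regular"; fi
}

followed_type=$(describe followed)
kept_type=$(describe kept)
echo "followed: $followed_type, kept: $kept_type"
```

The copy made with -P stores the same path as a_soft, so it resolves relative to wherever the copy is placed.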

Examples

Here is an example of creating a soft link vs a hard link:

$ ls -li
total 4
28704901 -rw-r--r-- 1 root root 2 Oct  6 08:43 a
$ ln -s a a_soft
$ ln  a a_hard
$ ls -li
total 8
28704901 -rw-r--r-- 2 root root 2 Oct  6 08:43 a
28704901 -rw-r--r-- 2 root root 2 Oct  6 08:43 a_hard
28704900 lrwxrwxrwx 1 root root 1 Oct  6 08:43 a_soft -> a

The far left column of the output from ls -li is the inode number. Notice that when I created a soft link, a new inode was created. When I created a hard link, the link and the original file have the same inode number, and the link count in the third column went from 1 to 2.

$ ln /boot/grub2/grub.cfg /root/grub.cfg
ln: failed to create hard link '/root/grub.cfg' => '/boot/grub2/grub.cfg': Invalid cross-device link

The ln program will output an error if you attempt to create a hard link to a file on a different filesystem.

$ cp a_soft b
$ ls -l
total 12
-rw-r--r-- 2 root root 2 Oct  6 08:43 a
-rw-r--r-- 2 root root 2 Oct  6 08:43 a_hard
lrwxrwxrwx 1 root root 1 Oct  6 08:43 a_soft -> a
-rw-r--r-- 1 root root 2 Oct  6 08:52 b
$ tar -cf c.tar a a_soft
$ tar -hcf d.tar a a_soft
$ ls -l
total 48
-rw-r--r-- 2 root root     2 Oct  6 08:43 a
-rw-r--r-- 2 root root     2 Oct  6 08:43 a_hard
lrwxrwxrwx 1 root root     1 Oct  6 08:43 a_soft -> a
-rw-r--r-- 1 root root     2 Oct  6 08:52 b
-rw-r--r-- 1 root root 10240 Oct  6 08:53 c.tar
-rw-r--r-- 1 root root 10240 Oct  6 08:53 d.tar
$ rm a a_hard a_soft b
$ tar xf c.tar
$ ls -l
total 28
-rw-r--r-- 1 root root     2 Oct  6 08:43 a
lrwxrwxrwx 1 root root     1 Oct  6 08:43 a_soft -> a
-rw-r--r-- 1 root root 10240 Oct  6 08:53 c.tar
-rw-r--r-- 1 root root 10240 Oct  6 08:53 d.tar
$ rm a a_soft 
$ tar xf d.tar
$ ls -l
total 32
-rw-r--r-- 2 root root     2 Oct  6 08:43 a
-rw-r--r-- 2 root root     2 Oct  6 08:43 a_soft
-rw-r--r-- 1 root root 10240 Oct  6 08:53 c.tar
-rw-r--r-- 1 root root 10240 Oct  6 08:53 d.tar

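The last example demonstrates the difference in how programs handle soft links. Notice how cp made a copy of the file the link points to instead of the link: b is a regular file. By default, the tar command works on the link itself instead of the file the link points to, so extracting c.tar restores a_soft as a link. When the -h option is added, tar works on the file the link points to instead, so extracting d.tar restores a_soft as a regular file. The link count of 2 on both a and a_soft suggests that tar recognized them as the same inode and restored them as hard links to each other.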
