Beginner’s Guide to find

By | 2019-05-04

The Linux find program is very powerful. It has a small learning curve, but a little time spent learning find now will save you a lot of time later.

The Linux find command isn’t actually part of Linux. It is a program that is part of the GNU findutils project. It can be installed on pretty much any UNIX-like operating system, although most non-Linux operating systems ship their own version that complies with the POSIX standard for find. If you are using find on something other than Linux and something in this guide doesn’t work for you, consult the documentation for your version of find, most likely its man page.

The Basics

With no arguments, find outputs to your terminal the entire directory structure of your current working directory:

tyler@desktop:~/find$ ls
a  b  dir1
tyler@desktop:~/find$ ls dir1
c
tyler@desktop:~/find$ find
.
./b
./a
./dir1
./dir1/c

If you want find to look in one or more specific locations, specify them as the first non-option arguments:

tyler@desktop:~/find$ pwd
/home/tyler/find
tyler@desktop:~/find$ ls /tmp/find2
a
tyler@desktop:~/find$ find /tmp/find2/
/tmp/find2/
/tmp/find2/a

Searching multiple directories:

tyler@desktop:~/find$ find . /tmp/find2
.
./a
/tmp/find2
/tmp/find2/a

For the remainder of this guide, I will always specify a single directory. That way you get used to seeing it and become familiar with the way find handles command line arguments.

Displaying a directory tree can be useful, but you probably want to narrow your search. You do that with a find expression. A simple example of a find expression is to filter results by name:

tyler@desktop:~/find$ find . -name c
./dir1/c

Searching by Filename

Filename searches can be case sensitive or case insensitive. They also support simple pattern matching. For a case sensitive search, use -name:

tyler@desktop:~/find$ ls dir1/
c  C
tyler@desktop:~/find$ find . -name c
./dir1/c

For case insensitive searches, use -iname:

tyler@desktop:~/find$ find . -iname c
./dir1/c
./dir1/C

The pattern matching language used with filename searches is just a shell glob. Shell globs are a subject worthy of their own guide, so I am only going to cover the *. When you use * in a pattern, it matches any number of any characters, including none. Since find uses the same pattern matching language as the shell, I recommend that you always surround patterns with double quotes (") or single quotes ('). This will prevent the shell from interpreting the pattern as a glob before find ever sees it. A few examples should make pattern matching clear:

tyler@desktop:~/find$ find . -name "*0"
./abc0
./a0
./xyz0
tyler@desktop:~/find$ find . -iname "a*"
./abc0
./abc
./a0
./ABCdEfG123
./ABC
./a
tyler@desktop:~/find$ find . -iname a*
find: paths must precede expression: a0
Try 'find --help' for more information.

In the last example, the shell substituted a* with the names of every file in your current directory that starts with a. find takes the first filename, a, as the pattern for -iname, then tries to interpret the second filename, a0, as part of the search expression. A filename isn’t a valid expression option, hence the error.
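
The error above is actually the lucky case. If the unquoted pattern happens to match exactly one file in your current directory, find runs without complaint and quietly searches for that one name. Here is a minimal sketch of that situation. The scratch directory and the file names a.log and sub/b.log are made up for illustration.

```shell
#!/bin/sh
# Hypothetical layout: one .log file at the top level, one in a subdirectory.
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/a.log" "$demo/sub/b.log"
cd "$demo" || exit 1

# Unquoted: the shell expands *.log to a.log before find runs,
# so find is really asked for files named exactly a.log.
unquoted=$(find . -name *.log | LC_ALL=C sort)

# Quoted: find receives the pattern itself and applies it in every directory.
quoted=$(find . -name "*.log" | LC_ALL=C sort)

echo "unquoted:"
echo "$unquoted"
echo "quoted:"
echo "$quoted"

cd / && rm -rf "$demo"
```

The unquoted search misses sub/b.log without printing any warning, which is why I quote patterns out of habit.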

Using find to Execute Programs

You can use find to run programs on files that it finds. To do this, you use -exec.

I use find on my log server to compress old files. Here is an example:

tyler@desktop:~/find$ find . -name "*2018.log" -exec gzip {} \;

The first argument after -exec is the program to run. The rest of the arguments, up until \;, are passed to that program. The \; is required: it tells find that the arguments to -exec are finished and that anything after it is to be treated as part of the find expression. The backslash keeps the shell from interpreting the ; itself. The {} is substituted with the name of each file found.

In the example, we search for files with names ending in 2018.log. For each file found, find runs the command gzip with the filename as an argument.
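
Starting one process per file can get slow when there are many files. find also accepts + in place of \; as the terminator, in which case it appends as many filenames as will fit to a single command line. The sketch below uses echo as a stand-in for a real command so you can count how many times it ran. The scratch directory and file names are made up for illustration.

```shell
#!/bin/sh
demo=$(mktemp -d)
touch "$demo/one-2018.log" "$demo/two-2018.log" "$demo/three-2018.log"

# With \; find starts echo once per file, so there is one output line per file.
per_file=$(find "$demo" -name "*2018.log" -exec echo {} \; | wc -l | tr -d ' ')

# With + find passes all three names to a single echo, which prints one line.
batched=$(find "$demo" -name "*2018.log" -exec echo {} + | wc -l | tr -d ' ')

echo "commands run with ';': $per_file"
echo "commands run with '+': $batched"

rm -rf "$demo"
```

This only works for programs that accept several filenames at once, which gzip and grep both do.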

There may be times when you need an extra margin of safety. You can have find prompt before doing anything by using -ok in place of -exec. Other than prompting you, they function the same way. Here is the previous example using -ok instead of -exec:

tyler@desktop:~/find$ find . -name "*2018.log" -ok gzip {} \;
< gzip ... ./fileserver-11-2018.log > ? y
< gzip ... ./ldap-11-2018.log > ? y
< gzip ... ./fileserver-12-2018.log > ? n
< gzip ... ./ldap-10-2018.log > ? y
< gzip ... ./fileserver-10-2018.log > ? y
< gzip ... ./ldap-12-2018.log > ? n
tyler@desktop:~/find$ ls
fileserver-10-2018.log.gz  fileserver-12-2018.log  ldap-11-2018.log.gz
fileserver-11-2018.log.gz  ldap-10-2018.log.gz     ldap-12-2018.log

Searching by File Type

Sometimes you may want to limit your results to a certain type of file. For example, you can limit your search to regular files or directories. The -type option allows you to do this. In the example below, I search for directories.

tyler@desktop:~/find$ find . -type d
.
./application_logs
./application_logs/slapd
./application_logs/haproxy
./application_logs/tomcat
./application_logs/httpd

I mostly search for regular files, which you specify with f. For a complete list of file types you can search for, consult the man page:

tyler@desktop:~/find$ man find
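
To see a few of those types side by side, here is a small sketch that creates one regular file, one directory, and one symbolic link, then asks for each type in turn. The names notes.txt, logs, and latest are hypothetical.

```shell
#!/bin/sh
demo=$(mktemp -d)
mkdir "$demo/logs"
touch "$demo/notes.txt"
ln -s notes.txt "$demo/latest"
cd "$demo" || exit 1

# f = regular files, d = directories (including . itself), l = symbolic links
files=$(find . -type f)
dirs=$(find . -type d | LC_ALL=C sort)
links=$(find . -type l)

echo "files: $files"
echo "dirs:"
echo "$dirs"
echo "links: $links"

cd / && rm -rf "$demo"
```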

Searching Within Files

I commonly use find to search for strings of text, such as an error message, within source code trees. To search within files, use -exec with something like grep. When I was writing my OpenLDAP OLC Reference, I used this technique to find the source code files that define how OpenLDAP processes its configuration. Here is one of the actual commands I used:

tyler@desktop:/tmp/openldap-2.4.47$ pwd
/tmp/openldap-2.4.47
tyler@desktop:/tmp/openldap-2.4.47$ find . -type f -exec grep -l olcArgsFile {} \;
./servers/slapd/slapd.ldif
./servers/slapd/bconfig.c
./doc/man/man5/slapd-config.5
./tests/data/regressions/its8663/slapd-provider.ldif
./tests/data/regressions/its8800/slapd-provider1.ldif
./tests/data/regressions/its8800/slapd-provider3.ldif
./tests/data/regressions/its8800/slapd-provider4.ldif
./tests/data/regressions/its8800/slapd-provider2.ldif
./tests/data/regressions/its8444/slapd-provider1.ldif
./tests/data/regressions/its8444/slapd-provider3.ldif
./tests/data/regressions/its8444/slapd-provider4.ldif
./tests/data/regressions/its8444/slapd-provider2.ldif
./tests/data/regressions/its8521/slapd-consumer.ldif
./tests/data/regressions/its8521/slapd-provider.ldif
./tests/data/regressions/its8616/slapd-provider.ldif
./tests/data/regressions/its8667/slapd.ldif
./tests/scripts/test064-constraint

In this example, find locates every regular file and runs grep -l olcArgsFile on each one, passing the filename as the last argument.

If you aren’t familiar with grep, it is used to filter data with a regular expression, which is a pattern matching language. The -l option instructs it to print the name of the file instead of the line the expression matched. In this case, I was looking for the text olcArgsFile.

Since I was looking for C source files, ./servers/slapd/bconfig.c is the file I was looking for. The source code of OpenLDAP version 2.4.47 contains 1804 files and directories. find was able to locate the file I was looking for in 2 seconds.
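
If you already know you only care about C source files, you can let find filter on the name before grep ever runs. This is a sketch of that idea, not the command I actually used. The tiny source tree and its file names are made up for illustration, and it uses the + terminator so grep is started once instead of once per file.

```shell
#!/bin/sh
src=$(mktemp -d)
mkdir -p "$src/servers" "$src/doc"
printf 'static char *args_file; /* olcArgsFile */\n' > "$src/servers/config.c"
printf 'int main(void) { return 0; }\n' > "$src/servers/main.c"
printf 'olcArgsFile: path to the args file\n' > "$src/doc/config.txt"
cd "$src" || exit 1

# Only regular files whose names end in .c are handed to grep;
# -l prints the names of the files that contain the string.
matches=$(find . -type f -name "*.c" -exec grep -l olcArgsFile {} +)

echo "$matches"

cd / && rm -rf "$src"
```

The documentation file also contains the string, but it never reaches grep because its name does not end in .c.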

Complex Expressions

It is possible to add logic to find expressions. If you simply specify multiple expressions, they are interpreted as a logical and. For example:

tyler@desktop:~/find$ find . -type f -name "*.log"
./fileserver-12-2018.log
./ldap-12-2018.log

In this example, I am searching for regular files that have names ending with .log.

The find program allows any kind of boolean expression. This means you have and, or, not, and grouping with parentheses available.

The ! is used as the not operator. The expression below searches for all regular files that do not end with .log:

tyler@desktop:~/find$ find . -type f ! -name "*.log"
./ldap-10-2018.log.gz
./fileserver-11-2018.log.gz
./fileserver-10-2018.log.gz
./openldap-2.4.47.tgz
./ldap-11-2018.log.gz
tyler@desktop:~/find$ ls
application_logs           fileserver-12-2018.log  ldap-12-2018.log
fileserver-10-2018.log.gz  ldap-10-2018.log.gz     openldap-2.4.47.tgz
fileserver-11-2018.log.gz  ldap-11-2018.log.gz

If you wish to specify multiple search criteria with or instead of and, use -o. Below, I search for files that end with .bz2 or .gz.

tyler@desktop:~/find$ find . -name "*.bz2" -o -name "*.gz"
./ldap-10-2018.log.gz
./fileserver-11-2018.log.gz
./fileserver-10-2018.log.gz
./fileserver-09-2018.log.bz2
./ldap-11-2018.log.gz

There is a precedence order when multiple expressions are specified. It is: not, and, then or. As you would expect, you can force a certain precedence by surrounding the applicable part of the expression with parentheses. Since command line shells interpret parentheses themselves, you must escape each one with a \. Here is an example:

tyler@desktop:~/find$ find . ! -type f -name "*.gz"
tyler@desktop:~/find$ find . ! \( -type f -name "*.gz" \)
.
./application_logs
./application_logs/slapd
./application_logs/haproxy
./application_logs/tomcat
./application_logs/httpd
./fileserver-12-2018.log
./openldap-2.4.47.tgz
./ldap-12-2018.log
./fileserver-09-2018.log.bz2

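In the first command, the ! applies only to -type f, so I am searching for anything that is not a regular file and has a name ending with .gz. Nothing matches, so there is no output. In the second command, the parentheses make the ! apply to the whole group, so I am searching for everything except regular files ending with .gz.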
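
The same precedence rules matter when you mix -o with other tests, because and binds tighter than or. The sketch below shows the difference grouping makes. The scratch directory and its contents are made up for illustration, including a directory deliberately named old.gz.

```shell
#!/bin/sh
demo=$(mktemp -d)
mkdir "$demo/old.gz"                          # a directory whose name ends in .gz
touch "$demo/a.log.gz" "$demo/b.log.bz2" "$demo/c.log"
cd "$demo" || exit 1

# Without parentheses this reads as (-type f AND *.bz2) OR *.gz,
# so the directory old.gz is matched as well.
ungrouped=$(find . -type f -name "*.bz2" -o -name "*.gz" | LC_ALL=C sort)

# With parentheses this reads as -type f AND (*.bz2 OR *.gz),
# so only regular files are matched.
grouped=$(find . -type f \( -name "*.bz2" -o -name "*.gz" \) | LC_ALL=C sort)

echo "ungrouped:"
echo "$ungrouped"
echo "grouped:"
echo "$grouped"

cd / && rm -rf "$demo"
```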

Summary

You should now be able to use find effectively. I have just scratched the surface of things you can do with it. For instance, find can search by owner, modified time, access time, permissions, size, and more. If you opt to print the output instead of executing a command on each result, you can fine-tune the way find prints the results. I suggest taking a bit of time to skim through the man page. This will allow you to familiarize yourself with all of find’s capabilities.
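
As a small taste of two of those tests, here is a sketch that searches by size and by modified time. The file names, the 1000 byte threshold, and the 30 day cutoff are arbitrary choices for illustration.

```shell
#!/bin/sh
demo=$(mktemp -d)
cd "$demo" || exit 1

printf 'x' > small.log                                   # 1 byte
dd if=/dev/zero of=big.log bs=1000 count=2 2>/dev/null   # 2000 bytes
touch -t 201801010000 old.log                            # modified 2018-01-01 00:00

# -size +1000c matches files larger than 1000 bytes.
large=$(find . -type f -size +1000c)

# -mtime +30 matches files last modified more than 30 days ago.
stale=$(find . -type f -mtime +30)

echo "large: $large"
echo "stale: $stale"

cd / && rm -rf "$demo"
```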

References