Unix 101 by Mo Budlong

Using find to locate files

Mastering the find command can open the door to some powerful and nifty utilities

SunWorld
June  1997

Abstract
The man pages on find aren't quite as clear as they could be. This month, we de-obfuscate the powerful find command, explaining how it works and showing you how to fine tune searches by name, file size, date altered, and date created. (2,700 words)


The find command is one of the most powerful tools available to a Unix system administrator, but its command syntax is awkward and often poorly explained. A typically cryptic description of find goes something like this:

find path-list expression
After chasing through the man pages you find that an expression can be either a criterion for selecting a file or an action to perform on a file. The find syntax can be simplified. The find command searches from a starting directory down through its subdirectories, locating files that match your search criteria, and then executes a command on each file found. Though the man pages are technically correct in stating that the find command has only three parts, it is useful to think of it as having four:
  1      2                 3                   4
  find   starting where    find which files    do what

In more detail, the parts are:

  1. The find command itself (the word "find"), which is needed to start the program.
  2. The directory from which to start searching. This can be more than one directory, but you will usually use only one starting directory.
  3. Which files to find. The search criteria can be specified by file name, size, type, and many other categories which I will discuss in a moment.
  4. The last part of the command contains what to do with the file when found. There is almost no limit to what you can do to a file. This portion of the command may include any Unix command. When used as a file locator, this part of the command usually specifies that the file path, name, and other information are to be printed on the screen or to a file.


Locating files
The following is an example of a find command that will locate all files named "minutes.txt". The search starts at your own home directory and works its way down through subdirectories. On each hit it prints the name of the file found.

find $HOME -name minutes.txt -print
In this example part 1 of the command is the find command itself. Part 2 is the starting directory, $HOME. Part 3 lists the files to find, and this part of the command is "-name minutes.txt". The "do what" portion of the command is "-print".

Here are the three most common starting directories:

$HOME   the user's home directory
.       the current directory
/       the root directory (searches the whole system)
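
The starting directory also shapes the output, because find prints each name prefixed with the starting point exactly as it was typed. The following sketch shows this in a scratch directory. The directory and file names are made up for the illustration, and mktemp -d is assumed to be available, as it is on most current systems.

```shell
# Build a small scratch tree (hypothetical names, for illustration only).
demo=$(mktemp -d)
mkdir -p "$demo/tom" "$demo/sally"
touch "$demo/tom/minutes.txt" "$demo/sally/minutes.txt"

# Starting from "." gives names relative to the current directory.
cd "$demo/tom" || exit 1
relative=$(find . -name minutes.txt -print)
echo "$relative"

# More than one starting directory may be listed; find searches each
# in the order given and prefixes the names with that directory.
both=$(find "$demo/tom" "$demo/sally" -name minutes.txt -print)
echo "$both"
```

The first search prints ./minutes.txt, while the second prints two full path names, one under each starting directory.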

Using find to locate files by name is probably the most common use for the command, so a few quick points on the -name option are in order. This option is followed by the name you want to locate. The file name may contain wildcards, as in *.txt to search for all files ending in .txt. When a wildcard is included in a file name, the shell attempts to expand it before giving the arguments to find. You want find to receive the argument exactly as "*.txt" and not as an already expanded list of file names such as notes.txt, mydoc.txt, and so on.
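
What the shell does to an unprotected wildcard can be seen without running find at all. The sketch below uses set -- to capture the arguments a command would receive and $# to count them. The scratch directory and file names are assumptions for the illustration, and the second half previews the backslash protection described next.

```shell
# Scratch directory holding two .txt files (hypothetical names).
demo=$(mktemp -d)
touch "$demo/notes.txt" "$demo/mydoc.txt"
cd "$demo" || exit 1

# Unprotected: the shell replaces *.txt with the matching file names
# before any command sees it.
set -- *.txt
unprotected="$# argument(s): $*"
echo "$unprotected"

# Protected: the pattern arrives as one literal argument.
set -- \*.txt
protected="$# argument(s): $*"
echo "$protected"
```

The first echo reports two arguments (the two file names); the second reports a single argument, the pattern itself.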

In order to protect the asterisk from being expanded by the shell, it is necessary to use a backslash to escape the asterisk as in:

find $HOME -name \*.txt -print
The backslash in front of the asterisk prevents *.txt from being expanded by the shell. Find receives the argument as "-name *.txt", which is what you wanted in the first place. The same applies to the "?" wildcard. The following will locate all files with a one character extension and will print their names.
find $HOME -name \*.\? -print
Note the escapes in front of both the asterisk and the question mark. This simple syntax allows you to create a file locating utility that can be used to track down lost files. Here I use a shell script named findfile, which can be used as a quick way of entering a file search (see Listing 1). Use vi to create the file and then make it executable by entering:
chmod a+x findfile 
In fact, most of this shell script is the error checking logic and usage information.

Listing 1 -- findfile
#!/bin/sh
# -----------------------------------------------------
# findfile file locator
#
# syntax:
#    findfile filespec
# -----------------------------------------------------
# The number of command line arguments must be exactly 1. Otherwise
# display a usage message. If the arguments are OK run the find
# starting at the top of the directory tree /

if  [ $# -ne 1  ]
  then echo "syntax:"
    echo "   findfile file-specification"
    echo
    echo " The file-specification must be either"
    echo " a file name or a file spec containing"
    echo " wildcards. If wildcards are included,"
    echo " they must be preceded with a \ (backslash)."
    echo
    echo " examples:"
    echo
    echo "      findfile mfile.txt"
    echo
    echo "      findfile \*.c"
else
    # Quote $1 so that a wildcard pattern reaches find unexpanded.
    find / -name "$1" -print
fi
The program tests that the user has entered one file specification with the test command:
if  [ $# -ne 1  ]
Note that the opening bracket ([) is followed by a space, and the closing bracket is preceded by a space. The $# is a shell variable that contains the number of arguments on the command line used to start the shell script. The -ne condition tests for "not equal." If the number of arguments on the command line does not equal 1 then a syntax/usage type message is displayed, otherwise a find request is launched using the root directory "/" as the starting point.
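
Because $# counts arguments after the shell has expanded them, the same kind of count shows what find would receive from inside a script. In the sketch below, count_args, unquoted, and quoted are hypothetical helpers (not part of findfile) that pass $1 along bare and inside double quotes. An unquoted $1 is expanded a second time against the current directory, so writing "$1" is the safer form whenever the pattern might match files in the directory where the script is run.

```shell
# Scratch directory with two files that match *.txt (hypothetical names).
demo=$(mktemp -d)
touch "$demo/a.txt" "$demo/b.txt"
cd "$demo" || exit 1

# Hypothetical helpers for the illustration.
count_args() { echo $#; }           # how many arguments arrived
unquoted()   { count_args $1; }     # $1 passed on bare
quoted()     { count_args "$1"; }   # $1 passed on inside double quotes

bare=$(unquoted \*.txt)    # pattern is re-expanded to a.txt b.txt
safe=$(quoted \*.txt)      # pattern stays the single word *.txt
echo "unquoted: $bare, quoted: $safe"
```

Here the unquoted form hands on two words and the quoted form hands on one.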

Access denied
Security considerations prevent find from having access to all directories. Unless you are root, find will probably display error messages indicating that it could not gain access to various directories. Once you have seen what else find can do, you will understand why users should not be allowed to run wild with it. Here is an example of what your screen might look like during a find search (executed from a Bourne shell):

congo$ findfile \*.txt
find: cannot open /etc/auth
find: cannot chdir to /etc/ps
find: cannot chdir to /rcd0/rc
/usr/tom/minutes.txt
/usr/tom/logs.txt
/usr/sally/expense.txt
find: cannot open /usr/sam
find: cannot open /usr/theboss
 
This display is a combination of two types of messages. The "find: cannot open" and "find: cannot chdir" messages are error messages caused by some inability to access a directory or a file. The file name lines such as "/usr/tom/minutes.txt" are the output of the -print option. Error messages can clutter up a display. A user other than a system administrator often cannot get into every directory, which produces many instances of the "cannot open" error message. The easiest solution is to suppress the "cannot open" and "cannot chdir" messages by redirecting errors away from the screen. Errors can be sent to /dev/null by changing the find line in findfile (see Listing 1) to read:
find / -name "$1" -print 2>/dev/null
The 2>/dev/null causes messages that would be printed on stderr (error messages) to be sent to the null device. The null device is a long way of saying "nowhere." Those messages simply disappear. Listing 2 incorporates this change in the shell script.

Listing 2 -- findfile revisited
#!/bin/sh
# ----------------------------------------------------------
# findfile file locator
#
# syntax:
#    findfile filespec
# ----------------------------------------------------------
# The number of command line arguments must be exactly 1. Otherwise
# display a usage message. If arguments are OK run the find starting
# at the top of the directory tree /
if  [ $# -ne 1  ]
  then echo "syntax:"
    echo "   findfile file-specification"
    echo
    echo " The file-specification must be either"
    echo " a file name or a file spec containing"
    echo " wildcards. If wildcards are included,"
    echo " they must be preceded with a \ (backslash)."
    echo
    echo " examples:"
    echo
    echo "      findfile mfile.txt"
    echo
    echo "      findfile \*.c"
else
    find / -name "$1" -print 2>/dev/null
fi
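
If throwing the messages away feels too drastic, the same redirection can send them to a file for later inspection. This sketch separates the two streams. The scratch directory, the errors.log name, and the deliberately missing starting directory (standing in for a directory that find cannot open) are assumptions for the illustration.

```shell
# Scratch directory with one matching file (hypothetical names).
demo=$(mktemp -d)
touch "$demo/minutes.txt"

# The second starting directory does not exist, so find complains on
# stderr while still reporting matches from the first on stdout.
# find exits with a nonzero status after an error; keep it in $status.
status=0
found=$(find "$demo" "$demo/no-such-dir" -name \*.txt -print 2>"$demo/errors.log") || status=$?

echo "found:  $found"
echo "errors: $(cat "$demo/errors.log")"
echo "status: $status"
```

The file name arrives on standard output as before, and the complaint about the missing directory ends up in errors.log instead of on the screen.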

Multiple options
Now we will expand on find further. What else can you do to a file after you have found it? The sky is the limit if you have the access privileges. You have seen -print as one option. The other powerful option is -exec. The syntax for the -exec option is:

     -exec a_command \;
Note that the backslash and the semicolon must be included as shown. Usually you want to execute a command on the file just found. The find program uses a left and right curly brace ({}) to represent the name of the file just found as in:
     -exec a_command {} \;
Looking at a practical example makes this clearer. Going back to Listing 2, let's assume that instead of just printing the file name of any file found, we want a full ls -l style listing for that file. The version of the find command that would do this is:
     find / -name "$1" -exec ls -l {} \;
In English this would read: start searching from the top of the directory tree for any file named as given in the passed argument. When a file is found, execute an ls -l filename command on that file. Listing 3 repeats Listing 1 using the new syntax and the /dev/null trick, and is followed by an example of what the screen output might look like when searching for \*.txt.

Listing 3 -- findfile redux
#!/bin/sh
# ----------------------------------------------------------
# findfile file locator
#
# syntax:
#    findfile filespec
# ----------------------------------------------------------
# The number of command line arguments must be exactly 1.  Otherwise
# display a usage message. If arguments are OK run the find starting
# at the top of the directory tree /

if  [ $# -ne 1  ]
  then echo "syntax:"
    echo "   findfile file-specification"
    echo
    echo " The file-specification must be either"
    echo " a file name or a file spec containing"
    echo " wildcards. If wildcards are included,"
    echo " they must be preceded with a \ (backslash)."
    echo
    echo " examples:"
    echo
    echo "      findfile mfile.txt"
    echo
    echo "      findfile \*.c"
else
    find / -name "$1" -exec ls -l {} \; 2>/dev/null
fi

Here is what your screen might look like when searching for \*.txt using findfile from Listing 3:

congo$ findfile \*.txt

-rw-r--r-- 1   tom     group  1544  Jun 12 1997 /usr/tom/minutes.txt
-rw-r--r-- 1   tom     group  1087  Jan  1 1997 /usr/tom/logs.txt
-rw-rw-rw- 1   sally   group  1226  Jan  6 1997 /usr/sally/expense.txt

The syntax for -exec is awkward, but easy to follow once you have the hang of it. It is the -exec flag, followed by a Unix command (containing {} wherever the name of the found file is needed), followed by \; to close the -exec command.
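
Any command can take the place of ls -l. As a sketch, the following runs grep -l on each file found, so that only the files containing a given word are listed. The scratch files and the search word "budget" are made up for the illustration.

```shell
# Two scratch files, only one of which mentions the word "budget".
demo=$(mktemp -d)
echo "budget review at noon" > "$demo/minutes.txt"
echo "nothing to report"     > "$demo/logs.txt"

# For every *.txt file found, run: grep -l budget <file>
# grep -l prints the file name only when the file contains a match.
hits=$(find "$demo" -name \*.txt -exec grep -l budget {} \;)
echo "$hits"
```

Only the path of minutes.txt is printed, because grep finds nothing in logs.txt.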

User, size, atime: finding more than just names
We have only looked at -name as the method of matching searched files, but find provides a set of other search flags that go beyond -name.

You are working peacefully at your desk one day when the intercom buzzes. N.E. Programmer, who was the lead programmer on the new Widget account, is no longer with the company. Your job: clean up any loose ends he may have left behind. After cleaning up his home directory and files, the next step is to search the system for files that he created. To do this, use the -user flag to search for files owned by his logon name, nep.

find / -user nep -exec ls -l {} \; >nepfiles.txt
In English: search from the root directory for any files owned by nep and execute an ls -l on each file found. Capture all output in nepfiles.txt. This follows the four-part command structure of find that was discussed at the beginning of this article. The find command itself is the first part, the root directory is the point at which to begin the search, the files to search for are "-user nep", and finally the "do what" part says that when any matching files are found, execute an ls -l on the file.

Another useful search option is finding files by size, which the -size flag allows you to do. It is very useful for locating large files on your system. The default unit for -size is the block (512 bytes), so by default you search for files by size in number of blocks. I find it easier to think in bytes, and there is an option to allow searches using bytes. First we'll look at the defaults.
find . -size 4 -print
This command will print the names of all files that are four blocks long, using the current directory as a starting point. You may add a + or - (minus) in front of the number, to specify greater than or less than. The following finds files larger than 20 blocks.
find . -size +20 -print
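
To see the block arithmetic at work, the sketch below creates files of known sizes in a scratch directory. It assumes 512-byte blocks with sizes rounded up, which is how POSIX defines -size. The file names are made up, dd is used only to produce files of an exact length, and -type f is added so that the scratch directory itself is never counted.

```shell
demo=$(mktemp -d)
# 600 bytes rounds up to 2 blocks; 20000 bytes rounds up to 40 blocks.
dd if=/dev/zero of="$demo/small" bs=600  count=1  2>/dev/null
dd if=/dev/zero of="$demo/large" bs=1000 count=20 2>/dev/null

# -type f limits the match to ordinary files (an addition for this demo).
exactly_two=$(find "$demo" -type f -size 2 -print)
over_twenty=$(find "$demo" -type f -size +20 -print)
echo "2 blocks: $exactly_two"
echo "over 20:  $over_twenty"
```

The first search reports only the 600-byte file and the second only the 20000-byte file.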
If you add a "c" after the number, the number is interpreted as characters (bytes) instead of blocks. The following command will find all files larger than one million bytes.
find / -size +1000000c -print
This alone can be used to create a useful utility that will search your system for large files, but if we look at another find option first we can create something more flexible. The last access time can also be tested using the -atime switch.
find / -atime 2 -print
This command finds files accessed two days ago (the day before yesterday). A + or - (minus) can also be used here for greater than and less than.
find / -atime +30 -print
This prints files that have not been accessed in the last 30 days. The find search criteria can be combined. The following command will locate and list all files that were last accessed more than 100 days ago, and whose size exceeds 500,000 bytes.
find / -atime +100 -size +500000c -print
Again the four-part syntax of find holds here, but the search criteria in part 3 have become the combination "-atime +100 (and) -size +500000c". By combining these two find options, you can track down large files that are not used: the files that uselessly chew up disk space. The findfat shell script listed below accepts age and bytes on the command line. Its handling of missing command arguments is more helpful than findfile's: if the arguments are missing, the script asks the user to enter the values. The disk is then searched for these large old files, and a detailed directory entry is displayed for any found.

Listing 4 -- findfat locates bloated files
#!/bin/sh
# ----------------------------------------------------------
# findfat file locator
#
# syntax:
#    findfat age bytes
# ----------------------------------------------------------
# if the number of arguments is not 2, then ask
# the user to enter the parameters.
# The parameters are number of days to use to consider a file
# old, and number of bytes to use to consider a file fat.
if  [ $# -ne 2  ]
      then
        echo "How many days make a file old?"
        read age
        echo "How many bytes make a file fat?"
        read bytes
else
        age=$1
        bytes=$2
fi

echo "Locating files older than $age days and larger than $bytes bytes"
find / -atime "+${age}" -size "+${bytes}c" -exec ls -l {} \; 2>/dev/null
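
Because findfat starts at /, it is awkward to try out. The sketch below applies the same two tests to a scratch directory instead. The function name findfat_in and its extra starting-directory argument are assumptions added for the illustration, -print replaces the ls -l listing to keep the output short, and touch -a -t is used to backdate a file's access time.

```shell
# findfat_in start_dir age bytes  (hypothetical variant of findfat)
findfat_in() {
    find "$1" -type f -atime "+${2}" -size "+${3}c" -print 2>/dev/null
}

# Three scratch files: two of 2000 bytes and one of 10 bytes.
demo=$(mktemp -d)
dd if=/dev/zero of="$demo/old_fat"  bs=1000 count=2 2>/dev/null
dd if=/dev/zero of="$demo/new_fat"  bs=1000 count=2 2>/dev/null
dd if=/dev/zero of="$demo/old_thin" bs=10   count=1 2>/dev/null

# Pretend two of the files were last read on January 1, 1997.
touch -a -t 199701010000 "$demo/old_fat" "$demo/old_thin"

# Older than 100 days and larger than 1000 bytes.
fat=$(findfat_in "$demo" 100 1000)
echo "$fat"
```

Only old_fat passes both tests: new_fat was accessed too recently and old_thin is too small.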

I hope that I have provided you with some useful utilities and enough information to illustrate some of the basics of find. Using the four-part command approach to understanding find should also make it easier for you to read the find man page entry and understand what it is doing. Happy hunting!


About the author
Mo Budlong is president of King Computer Services, Inc. and has been involved in Unix development on Sun and other platforms for over 15 years. King Computer Services, Inc. specializes in Unix and Client/Server consulting and training and currently publishes the COBOL Just In Time Course, a crash COBOL course to train staff for the Year 2000 problem. Reach Mo at Mo.Budlong@sunworld.com.


[(c) Copyright  Web Publishing Inc., and IDG Communication company]


URL: http://www.sunworld.com/swol-06-1997/swol-06-unix101.html