Using find
The man pages on find aren't quite as clear as they could be. This month, we de-obfuscate the powerful find command, explaining how it works and showing you how to fine-tune searches by name, file size, date altered, and date created. (2,700 words)
The find command is one of the most powerful tools available to a Unix system administrator, but its command syntax is awkward and often poorly explained. A typically cryptic description of find goes something like this:

    find path-list expression

After chasing through the man pages, you find that an expression can be either a criterion for selecting a file or an action to perform on a file. It is possible to simplify the find syntax. The find command searches from a starting directory down through its subdirectories, locating files that match your specified search criteria. It then executes a command on each file found.

Though the man pages are technically correct in stating that the find command has only three parts, it is useful to think of it as having four:

    1    | 2              | 3                | 4
    find | starting where | find which files | do what

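In more detail, the parts are:

    1  The find command itself (the word "find"), which is needed to start the program.
    2  The starting directory, where the search begins.
    3  The search criteria, which select the files to find.
    4  The action to perform on each file found.
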
Locating files
The following is an example of a find command that will locate all files named "minutes.txt." The search starts at your own home directory and works its way down through subdirectories. On each "hit" it prints the name of the file found.

    find $HOME -name minutes.txt -print

In this example, part 1 of the command is the find command itself. Part 2 is the starting directory, $HOME. Part 3 lists the files to find, and this part of the command is "-name minutes.txt." The "do what" portion of the command is "-print."

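The four-part breakdown can be tried out safely in a scratch directory rather than in $HOME. The following is only a sketch for illustration: the directory layout and the variable names (scratch, hits) are invented for the demonstration and are not part of the original example.

```shell
#!/bin/sh
# Build a small scratch tree so the search has something to find.
# All directory and file names below are hypothetical.
scratch=$(mktemp -d)
mkdir -p "$scratch/work/meetings"
echo "agenda"  > "$scratch/work/meetings/minutes.txt"
echo "ignored" > "$scratch/work/notes.txt"

# part 1: find
# part 2: "$scratch"          (starting where)
# part 3: -name minutes.txt   (find which files)
# part 4: -print              (do what)
hits=$(find "$scratch" -name minutes.txt -print)
echo "$hits"

# Remove the scratch tree again.
rm -rf "$scratch"
```

Only the file two levels down is reported, which shows that the search descends through subdirectories on its own.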
Here are the three most common starting directories.

    $HOME    the user's home directory
    .        the current directory
    /        the root directory (searches the whole system)

Using find to locate files by name is probably the most common use for the command, so here are some quick points on the -name option. This option is followed by the name you want to locate. The file name may contain wildcards, as in *.txt to search for all files ending in .txt. When a wildcard is included in a file name, the shell attempts to expand it before giving the arguments to find. You want find to receive the argument exactly as "*.txt" and not as an already expanded list of file names such as notes.txt, mydoc.txt, and so on.

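The expansion is easy to observe with a stand-in for find. In the sketch below, count_args is a hypothetical helper (not a standard command) that simply reports how many arguments the shell handed to it. The backslash used to protect the pattern in the second call is explained in the next paragraph.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints the number of
# arguments it received from the shell.
count_args() {
    echo "$#"
}

# Work in a scratch directory that contains two matching files.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch notes.txt mydoc.txt

# Unprotected: the shell replaces *.txt with the two file names first.
unprotected=$(count_args -name *.txt)
# Protected: the backslash passes *.txt through untouched.
protected=$(count_args -name \*.txt)

echo "unprotected: $unprotected arguments"
echo "protected: $protected arguments"

# Leave the scratch directory before removing it.
cd / && rm -rf "$scratch"
```

The unprotected call sees three arguments (-name plus two file names), while the protected call sees the two that find actually expects.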
To keep the shell from expanding the asterisk, escape it with a backslash, as in:

    find $HOME -name \*.txt -print

The backslash in front of the asterisk prevents *.txt from being expanded by the shell. Find receives the argument as "-name *.txt", which is what you wanted in the first place. The same applies to the "?" wildcard. The following locates all files with a one-character extension and prints their names.

    find $HOME -name \*.\? -print

Note the escapes in front of both the asterisk and the question mark. This simple syntax allows you to create a file-locating utility that can be used to track down lost files. Here I use a shell script named findfile, which can be used as a quick way of entering a file search (see Listing 1). Use vi to create the file and then make it executable by entering:

    chmod a+x findfile

In fact, most of this shell script is error-checking logic and usage information.

Listing 1 -- findfile

    #!/bin/sh
    # -----------------------------------------------------
    # findfile    file locator
    #
    # syntax:
    #     findfile filespec
    # -----------------------------------------------------
    # The number of command line arguments must be exactly 1. Otherwise
    # display a usage message. If the arguments are OK run the find
    # starting at the top of the directory tree /
    if [ $# -ne 1 ]
    then
        echo "syntax:"
        echo "  findfile file-specification"
        echo
        echo "  The file-specification must be either"
        echo "  a file name or a file spec containing"
        echo "  wildcards. If wildcards are included,"
        echo "  they must be preceded with a \ (backslash)."
        echo
        echo "  examples:"
        echo
        echo "    findfile mfile.txt"
        echo
        echo "    findfile \*.c"
    else
        # "$1" is quoted so the shell does not expand the wildcard
        # a second time inside the script.
        find / -name "$1" -print
    fi

The program tests that the user has entered one file specification with the test command:

    if [ $# -ne 1 ]

Note that the opening bracket ([) is followed by a space, and the closing bracket is preceded by a space. The $# is a shell variable that contains the number of arguments on the command line used to start the shell script. The -ne condition tests for "not equal." If the number of arguments on the command line does not equal 1, then a syntax/usage message is displayed; otherwise a find request is launched using the root directory "/" as the starting point.

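The argument test can be exercised on its own without searching the whole disk. In this sketch, check_args is a hypothetical function standing in for the top of findfile; inside a function, $# counts the function's own arguments in the same way that it counts a script's arguments at the top level.

```shell
#!/bin/sh
# check_args is a hypothetical stand-in for the test at the top of
# findfile. It reports which branch of the if statement would run.
check_args() {
    if [ $# -ne 1 ]
    then
        echo "usage"
    else
        echo "search for $1"
    fi
}

# Zero arguments and two arguments both fail the test;
# exactly one argument passes it.
none=$(check_args)
one=$(check_args '*.c')
two=$(check_args a.txt b.txt)

echo "no arguments:  $none"
echo "one argument:  $one"
echo "two arguments: $two"
```

Only the call with exactly one argument reaches the branch that would launch the search.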
Access denied
Security considerations prevent find from having access to all directories. Unless you are root, find will probably display error messages indicating that it could not gain access to various directories. Once you have seen what else find can do, you will understand why users should not be allowed to run wild with it. Here is an example of what your screen might look like during a find search (executed from a Bourne shell):

    congo$ findfile \*.txt
    find: cannot open /etc/auth
    find: cannot chdir to /etc/ps
    find: cannot chdir to /rcd0/rc
    /usr/tom/minutes.txt
    /usr/tom/logs.txt
    /usr/sally/expense.txt
    find: cannot open /usr/sam
    find: cannot open /usr/theboss

This display is a combination of two types of messages. The "find: cannot open" and "find: cannot chdir" messages are error messages caused by some inability to access a directory or a file. The filename lines such as "/usr/tom/minutes.txt" are the output of the -print option.

Error messages can clutter up a display. A user other than a system administrator often cannot get into every directory, prompting many instances of the "cannot open" error message. The easiest solution is to suppress the "cannot open" and "cannot chdir" error messages by redirecting errors away from the screen. Errors can be sent to /dev/null by changing the find line in findfile (see Listing 1) to read:

    find / -name "$1" -print 2>/dev/null

The 2>/dev/null causes messages that are supposed to be printed on stderr (error messages) to be sent to the null device. The null device is a long way of saying "nowhere." Those messages simply disappear. Listing 2 incorporates this change in the shell script.

Listing 2 -- findfile revisited

    #!/bin/sh
    # ----------------------------------------------------------
    # findfile    file locator
    #
    # syntax:
    #     findfile filespec
    # ----------------------------------------------------------
    # The number of command line arguments must be exactly 1. Otherwise
    # display a usage message. If arguments are OK run the find starting
    # at the top of the directory tree /
    if [ $# -ne 1 ]
    then
        echo "syntax:"
        echo "  findfile file-specification"
        echo
        echo "  The file-specification must be either"
        echo "  a file name or a file spec containing"
        echo "  wildcards. If wildcards are included,"
        echo "  they must be preceded with a \ (backslash)."
        echo
        echo "  examples:"
        echo
        echo "    findfile mfile.txt"
        echo
        echo "    findfile \*.c"
    else
        # Quote "$1" to protect the wildcard; discard error messages.
        find / -name "$1" -print 2>/dev/null
    fi

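The split between the two kinds of messages can be seen without needing a protected directory. The sketch below is an illustration under one assumption of mine: instead of a directory that cannot be opened, it gives find a starting directory that does not exist, which also produces a complaint on stderr. The names scratch, errors, and hits are invented for the example.

```shell
#!/bin/sh
scratch=$(mktemp -d)
touch "$scratch/minutes.txt"

# The second starting directory does not exist, so find complains on
# stderr while still printing hits from the first one on stdout.
# find exits non-zero after an error; "|| true" ignores that here.

# Keep only stderr: stderr goes to the capture, stdout to /dev/null.
errors=$(find "$scratch" "$scratch/no-such-dir" -name \*.txt -print 2>&1 >/dev/null) || true

# Keep only stdout: the same redirection that findfile uses.
hits=$(find "$scratch" "$scratch/no-such-dir" -name \*.txt -print 2>/dev/null) || true

echo "stderr carried: $errors"
echo "stdout carried: $hits"

rm -rf "$scratch"
```

With 2>/dev/null in place, only the file name survives; the complaint about the missing directory never reaches the screen.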
Multiple options
Now we will expand on find further. What else can you do to a file after you have found it? The sky is the limit if you have the access privileges. You have seen -print as one option. The other powerful option is -exec.

The syntax for the -exec option is:

    -exec a_command \;

Note that the backslash and the semicolon must be included as shown. Usually you want to execute a command on the file just found. The find program uses a left and right curly brace ({}) to represent the name of the file just found, as in:

    -exec a_command {} \;

Looking at a practical example makes this clearer. Going back to Listing 2, let's assume that instead of just printing the name of any file found, we want a full ls -l style listing for that file. The version of a find command that would do this is:

    find / -name "$1" -exec ls -l {} \;

In English this would read: start searching from the top of the directory tree for any file named as given in the passed argument. When a file is found, execute an ls -l filename command on that file. Listing 3 repeats Listing 1 using the new syntax and the /dev/null trick. The sample screen after the listing shows what the output might look like when searching for \*.txt.

Listing 3 -- findfile redux

    #!/bin/sh
    # ----------------------------------------------------------
    # findfile    file locator
    #
    # syntax:
    #     findfile filespec
    # ----------------------------------------------------------
    # The number of command line arguments must be exactly 1. Otherwise
    # display a usage message. If arguments are OK run the find starting
    # at the top of the directory tree /
    if [ $# -ne 1 ]
    then
        echo "syntax:"
        echo "  findfile file-specification"
        echo
        echo "  The file-specification must be either"
        echo "  a file name or a file spec containing"
        echo "  wildcards. If wildcards are included,"
        echo "  they must be preceded with a \ (backslash)."
        echo
        echo "  examples:"
        echo
        echo "    findfile mfile.txt"
        echo
        echo "    findfile \*.c"
    else
        # Quote "$1" to protect the wildcard; run ls -l on each file
        # found and discard error messages.
        find / -name "$1" -exec ls -l {} \; 2>/dev/null
    fi

Here is what your screen might look like when searching for \*.txt, using findfile as in Listing 3:

    congo$ findfile \*.txt
    -rw-r--r-- 1 tom   group 1544 Jun 12 1997 /usr/tom/minutes.txt
    -rw-r--r-- 1 tom   group 1087 Jan  1 1997 /usr/tom/logs.txt
    -rw-rw-rw- 1 sally group 1226 Jan  6 1997 /usr/sally/expense.txt

The syntax for -exec is awkward, but easy to follow once you have the hang of it. It is the -exec flag, followed by a Unix command (containing {} wherever the name of the found file is needed), followed by \; to close the -exec command.
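The -exec form can be tried on a few throwaway files before it is turned loose on the whole system. This is a minimal sketch; the scratch directory, its file names, and the variable listing are all made up for the demonstration.

```shell
#!/bin/sh
scratch=$(mktemp -d)
echo "agenda" > "$scratch/minutes.txt"
echo "log"    > "$scratch/logs.txt"
echo "code"   > "$scratch/main.c"

# -exec runs the command once for each file found. {} is replaced by
# the path of that file, and \; marks the end of the command.
listing=$(find "$scratch" -name \*.txt -exec ls -l {} \;)
echo "$listing"

rm -rf "$scratch"
```

The listing contains one ls -l line for each of the two .txt files and nothing for main.c, because ls is only ever run on files that passed the -name test.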
User, size, atime: finding more than just names
We have only looked at -name as the method of matching searched files, but find provides a set of other search flags that go beyond -name.

You are working peacefully at your desk one day when the intercom buzzes. N.E. Programmer, who was the lead programmer on the new Widget account, is no longer with the company. Your job: clean up any loose ends he may have left behind. After cleaning up his home directory and files, the next step is to search the system for other files that he created. To do this, use the -user flag to search for files owned under his logon initials, nep.

    find / -user nep -exec ls -l {} \; >nepfiles.txt

In English: search from the root directory for any files owned by nep and execute an ls -l on each file found. Capture all output in nepfiles.txt. This follows the four-part command structure of find that was discussed at the beginning of this article. The find command itself is the first part, the root directory is the point at which to begin the search, the files to search for are "-user nep" and finally, the "do what" part says that when any matching files are found, execute an ls -l on the file.

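Since nep is unlikely to exist on your system, the -user test can be tried against your own files instead. The sketch below makes one substitution that is my assumption rather than part of the example above: it passes the numeric user ID reported by id -u, which find treats as a user ID when no login name matches it.

```shell
#!/bin/sh
scratch=$(mktemp -d)
touch "$scratch/report.txt"

# Use the current user's numeric ID in place of a login name such as nep.
me=$(id -u)

# Everything created above belongs to the current user, so both the
# scratch directory and the file inside it are reported.
owned=$(find "$scratch" -user "$me" -print | sort)
echo "$owned"

rm -rf "$scratch"
```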
Another useful search option is finding files by size, which the -size flag allows you to do. It is very useful for locating large files on your system. By default, -size counts in blocks (512 bytes each). I find it easier to think in bytes, and there is an option to allow searches using bytes. First we'll look at the default.

    find . -size 4 -print

This command will print the names of all files that are four blocks long, using the current directory as a starting point. You may add a + or - (minus) in front of the number, to specify greater than or less than. The following finds files larger than 20 blocks.

    find . -size +20 -print

If you add a "c" after the number, the number is interpreted as characters (bytes) instead of blocks. The following command will find all files larger than one million bytes.

    find / -size +1000000c -print

This alone can be used to create a useful utility that will search your system for large files, but if we look at another find option first we can create something more flexible.

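The + and - prefixes are easy to check against files of a known size. In this sketch the files are created with dd, and a -name test is added so that the scratch directory itself is not reported; the file names and the 1000-byte threshold are arbitrary choices for the demonstration.

```shell
#!/bin/sh
scratch=$(mktemp -d)

# dd writes files of a known size: 2048 bytes and 100 bytes.
dd if=/dev/zero of="$scratch/big.dat"   bs=1024 count=2 2>/dev/null
dd if=/dev/zero of="$scratch/small.dat" bs=100  count=1 2>/dev/null

# +1000c matches files larger than 1000 bytes,
# -1000c matches files smaller than 1000 bytes.
bigger=$(find "$scratch" -name \*.dat -size +1000c -print)
smaller=$(find "$scratch" -name \*.dat -size -1000c -print)

echo "larger than 1000 bytes:  $bigger"
echo "smaller than 1000 bytes: $smaller"

rm -rf "$scratch"
```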
The last access time can also be tested, using the -atime switch.

    find / -atime 2 -print

This command finds files last accessed two days ago (the day before yesterday). A + or - (minus) can also be used here for greater than and less than.

    find / -atime +30 -print

This prints files that have not been accessed in the last 30 days. The find search criteria can be combined. The following command will locate and list all files that were last accessed more than 100 days ago, and whose size exceeds 500,000 bytes.

    find / -atime +100 -size +500000c -print

Again the four-part syntax of find holds here, but the search criteria in part 3 have become the combination "-atime +100 (and) -size +500000c."

By combining these two find options, you can track down large files that are not used: the files that uselessly chew up disk space.

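The combined test can be rehearsed on three small files, only one of which is both old and large. This is a sketch under assumptions of my own: touch -a -t is used to backdate the access time, and the file names and the 1000-byte threshold are invented for the example.

```shell
#!/bin/sh
scratch=$(mktemp -d)

# Two "large" files (2048 bytes) and one small one (100 bytes).
dd if=/dev/zero of="$scratch/old-big.dat"   bs=1024 count=2 2>/dev/null
dd if=/dev/zero of="$scratch/new-big.dat"   bs=1024 count=2 2>/dev/null
dd if=/dev/zero of="$scratch/old-small.dat" bs=100  count=1 2>/dev/null

# Backdate the access time of two of the files to 1 January 2000.
touch -a -t 200001010000 "$scratch/old-big.dat" "$scratch/old-small.dat"

# A file must pass both tests: last accessed more than 100 days ago
# AND larger than 1000 bytes.
stale=$(find "$scratch" -name \*.dat -atime +100 -size +1000c -print)
echo "$stale"

rm -rf "$scratch"
```

Only old-big.dat is reported: new-big.dat fails the -atime test and old-small.dat fails the -size test.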
The findfat shell script listed below accepts an age and a number of bytes on the command line. Its handling of missing command arguments is more helpful than a usage message: if the arguments are missing, the script asks the user to enter the values. The disk is then searched for large, old files, and a detailed directory entry is displayed for any that are found.

Listing 4 -- findfat locates bloated files

    #!/bin/sh
    # ----------------------------------------------------------
    # findfat    file locator
    #
    # syntax:
    #     findfat age bytes
    # ----------------------------------------------------------
    # If the number of arguments is not 2, then ask
    # the user to enter the parameters.
    # The parameters are number of days to use to consider a file
    # old, and number of bytes to use to consider a file fat.
    if [ $# -ne 2 ]
    then
        echo "How many days make a file old?"
        read age
        echo "How many bytes make a file fat?"
        read bytes
    else
        age=$1
        bytes=$2
    fi
    echo "Locating files older than $age days and larger than $bytes bytes"
    # List every file that is both old and fat; discard error messages.
    find / -atime "+${age}" -size "+${bytes}c" -exec ls -l {} \; 2>/dev/null

I hope that I have provided you with some useful utilities and enough information to illustrate some of the basics of find. Using the four-part command approach to understanding find should also make it easier for you to read the find man page entry and understand what it is doing. Happy hunting!

About the author
Mo Budlong is president of King Computer Services, Inc. and has been
involved in Unix development on Sun and other platforms for over 15 years.
King Computer Services, Inc. specializes in Unix and Client/Server consulting
and training and currently publishes the COBOL Just In Time Course, a crash
COBOL course to train staff for the Year 2000 problem.
Reach Mo at Mo.Budlong@sunworld.com.
URL: http://www.sunworld.com/swol-06-1997/swol-06-unix101.html