The command line utilities that I use most for searching/finding files in Linux operating systems are:

  • find: search for files in a directory
  • grep: print lines matching a pattern
  • locate: find files by name

find

The basic format of a find command is:

find [location] [criteria] [actions]

Some common examples are listed below:

Find files by some name pattern:

find /etc -name "*.conf"

Find files by size (here, larger than 1 MiB):

find /var/log -size +1M

There are other really useful filter options like: -empty, -newer, -perm, -type …
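To make those filters concrete, here is a minimal sketch that runs in a throwaway directory created with mktemp. The file names (empty.txt, old.txt, new.txt) and the dates are made up for illustration; only the find options come from the list above.

```shell
# Build a scratch directory so the commands are safe to run anywhere.
demo=$(mktemp -d)
mkdir "$demo/subdir"
: > "$demo/empty.txt"                    # zero-byte file
echo "data" > "$demo/old.txt"
echo "data" > "$demo/new.txt"
touch -t 202001010000 "$demo/old.txt"    # backdate old.txt to 2020
touch -t 202101010000 "$demo/new.txt"    # new.txt is one year younger
chmod 644 "$demo/empty.txt" "$demo/old.txt"
chmod 600 "$demo/new.txt"

# -type f: regular files only (subdir is skipped)
files=$(find "$demo" -type f | sort)
# -empty: zero-byte files; -type f keeps empty directories out
empty=$(find "$demo" -type f -empty)
# -newer FILE: modified more recently than FILE
newer=$(find "$demo" -type f -newer "$demo/old.txt" | sort)
# -perm 600: permission bits are exactly rw-------
private=$(find "$demo" -type f -perm 600)

printf 'files:\n%s\n' "$files"
printf 'empty:\n%s\n' "$empty"
printf 'newer than old.txt:\n%s\n' "$newer"
printf 'mode 600:\n%s\n' "$private"
rm -rf "$demo"
```

Tests can be combined freely; find joins them with a logical AND unless you say otherwise.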

It’s worth mentioning that you can add -not to invert a test:

find /var/log -not -size +1M

which finds everything that is not larger than 1 MiB (directories included, unless you also add -type f).
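To try the inversion without touching /var/log, here is a small sketch that works in a temporary directory. The file names (big.bin, small.txt) are hypothetical. It also shows `!`, the POSIX spelling of the same operator; -not is the form accepted by GNU and BSD find.

```shell
demo=$(mktemp -d)
# One 2 MiB file and one tiny file.
dd if=/dev/zero of="$demo/big.bin" bs=1024 count=2048 2>/dev/null
echo "small" > "$demo/small.txt"

# Files larger than 1 MiB
large=$(find "$demo" -type f -size +1M)
# Inverted test: files that are NOT larger than 1 MiB
not_large=$(find "$demo" -type f -not -size +1M)
# Same thing with the portable operator
portable=$(find "$demo" -type f ! -size +1M)
# Without -type f the directory itself also passes the inverted test
with_dirs=$(find "$demo" -not -size +1M | sort)

echo "larger than 1 MiB:     $large"
echo "not larger than 1 MiB: $not_large"
printf 'without -type f:\n%s\n' "$with_dirs"
rm -rf "$demo"
```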

You can perform further actions on the results with the -exec option. For example, to remove log files that were last modified more than 7 days ago and are larger than 1M:

find /var/log -name "*.log" -size +1M -mtime +7 -exec rm {} ';'

# or you can use xargs to achieve the same result
# (-print0 and -0 keep file names with spaces or newlines intact)
find /var/log -name "*.log" -size +1M -mtime +7 -print0 | xargs -0 rm
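Since deleting is irreversible, I find it safer to preview the matches first and only then add the action. The sketch below does that in a scratch directory; the file names and the backdated timestamp are made up for illustration. It also uses the `{} +` form of -exec, which hands many file names to a single rm instead of starting one rm per file.

```shell
demo=$(mktemp -d)
# Two files over 1 MiB (one with a space in its name) and one small file.
dd if=/dev/zero of="$demo/big.log" bs=1024 count=2048 2>/dev/null
dd if=/dev/zero of="$demo/big two.log" bs=1024 count=2048 2>/dev/null
echo "small" > "$demo/small.log"
# Backdate everything so it is older than 7 days.
touch -t 202001010000 "$demo"/*.log

# 1. Preview: the same criteria without an action just print the matches.
preview=$(find "$demo" -type f -name "*.log" -size +1M -mtime +7 | sort)
printf 'would remove:\n%s\n' "$preview"

# 2. Delete: "{} +" batches the file names into one rm call.
find "$demo" -type f -name "*.log" -size +1M -mtime +7 -exec rm -- {} +

remaining=$(find "$demo" -type f)
echo "left behind: $remaining"
rm -rf "$demo"
```

Because find passes each name as its own argument, the file with a space in its name is handled correctly without any quoting tricks.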

grep

To search for files based on the text content, you can use grep. The basic example would be:

# print lines containing "error" from every file under /var/log
grep -r "error" /var/log

By default, grep prints the lines that match the pattern; add the -l option to print only the names of the matching files. Other common options are:

  • -i to ignore case
  • -r to search the directory recursively
  • -e to specify the pattern explicitly (useful for several patterns, or for a pattern that starts with -)
  • -v to invert match

As an example, let’s list all the files under the /var/log directory that contain a line starting with error:, ignoring case:

grep -i -r -l -e "^error:" /var/log
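Here is a minimal sketch of how these options interact, run against a few made-up log files in a temporary directory (the file names and log lines are hypothetical):

```shell
demo=$(mktemp -d)
mkdir "$demo/app"
printf 'Error: disk full\ninfo: started\n' > "$demo/app/a.log"
printf 'info: no error here\n' > "$demo/b.log"
printf 'info: all good\n' > "$demo/c.log"

# Anchored pattern: only a.log has a line that STARTS with "error:"
starts=$(grep -i -r -l -e "^error:" "$demo")

# Without the anchor, b.log matches as well
contains=$(grep -i -r -l "error" "$demo" | sort)

# -v inverts the match: lines of a.log that do not contain "error"
others=$(grep -i -v "error" "$demo/app/a.log")

printf 'starts with error:\n%s\n' "$starts"
printf 'contains error:\n%s\n' "$contains"
printf 'other lines in a.log:\n%s\n' "$others"
rm -rf "$demo"
```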

Lastly, I’d like to share one handy example that I use very often. This command finds the top 10 IPs that send POST requests, according to Nginx’s access log:

grep -i post /var/log/nginx/access.log | cut -d' ' -f1 | sort | uniq -c | sort -rn | head -n 10

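In this example, the log fields are separated by spaces, so cut -d' ' -f1 extracts the first field, which is the IP address. The pipeline then sorts the addresses, counts the unique entries with uniq -c, sorts the counts in descending order, and uses head to keep the top N entries. Note that grep -i post matches "post" anywhere in the line, so a request whose URL contains that word is counted too. I use this example as a template to solve a lot of similar tasks at work.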
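To experiment with the pipeline without a real server, the sketch below feeds it a handful of invented log lines. It assumes Nginx's default "combined" log format, and the addresses come from the reserved documentation ranges. As a stricter variant, it matches the method inside the quoted request field, so a GET for a URL containing "post" is not counted.

```shell
log=$(mktemp)
# Hypothetical sample data, not taken from a real server.
cat > "$log" <<'EOF'
192.0.2.10 - - [01/Jan/2024:00:00:01 +0000] "POST /login HTTP/1.1" 200 512
192.0.2.10 - - [01/Jan/2024:00:00:02 +0000] "POST /login HTTP/1.1" 401 128
198.51.100.7 - - [01/Jan/2024:00:00:03 +0000] "GET /blog/post-1 HTTP/1.1" 200 2048
198.51.100.7 - - [01/Jan/2024:00:00:04 +0000] "POST /api HTTP/1.1" 200 64
203.0.113.5 - - [01/Jan/2024:00:00:05 +0000] "GET / HTTP/1.1" 200 1024
EOF

# '"POST ' only matches lines whose request field begins with POST.
top=$(grep '"POST ' "$log" | cut -d' ' -f1 | sort | uniq -c | sort -rn | head -n 10)
echo "$top"
rm -f "$log"
```

Each output line is a count followed by an address, with the busiest client first. With this sample, 192.0.2.10 is listed with 2 requests and 198.51.100.7 with 1.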

locate

The locate command is another useful tool. Compared to find, the differences are:

  • locate reads from a database that is periodically updated by updatedb.
  • locate searches your entire filesystem.

This means that the locate command can be very fast, but its results may not always be up to date.

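To find all the pem files, quote a glob so that only paths ending in .pem match (a bare locate .pem matches any path that contains .pem):

locate "*.pem"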

locate can only find files that are already in the database, which is usually updated by a scheduled job (for example, a daily cron job). However, you can update the database manually:

sudo updatedb