Minimum Grep You Need to Know

by Buddy Lindsey on April 2, 2013

You need to find a phrase in over 200 files’ worth of code, and searching by hand is not feasible. If you are like me, you know about grep, but it has always made you nervous. It is so powerful that reading the man page feels like reading the technical manual for an engine. Fortunately, getting the benefits of grep with little pain is easy once you figure it out.

Over the last few months I have had to use grep more and more, and I would say I use the same type of search 80% of the time. It gets me what I need quickly and efficiently without much fuss.

What is Grep

The best full explanation comes from the grep man page.

Grep searches the named input FILEs (or standard input if no files are named, or the file name - is given) for lines containing a match to the given pattern. By default, grep prints the matching lines.

My explanation: it finds stuff in files and shows you where it is. It is amazingly useful because it is fast and because it shows the matching line along with the file name, so you can tell whether a result is what you need more easily than from a file name alone.

Using Grep

Grep is very powerful, but I get by with three variations for most of my day-to-day needs. First, let’s look at how to structure a grep command:

grep <options> <search-term> <location>

This is important to remember as it can be frustrating when you forget and nothing works.

  • options – these are the different flags that can help you get more robust or targeted results back.
  • search-term – the pattern (a regular expression) to match against each line of the input
  • location – the file or directory to search; leave it blank to search standard input (stdin)
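
To make the three parts concrete, here is a minimal sketch. The file notes.txt and its contents are made up for the illustration, and -i (ignore case) is a standard grep option that this article does not otherwise cover.

```shell
# Hypothetical scratch file, created only for this illustration.
tmpdir=$(mktemp -d)
printf 'hello world\ngoodbye world\nHello again\n' > "$tmpdir/notes.txt"

# grep <options> <search-term> <location>
#   -i                  the option (ignore case)
#   hello               the search term
#   $tmpdir/notes.txt   the location
matches=$(grep -i hello "$tmpdir/notes.txt")
printf '%s\n' "$matches"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```

Only the two lines containing hello or Hello should come back; the goodbye line is skipped.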

Grep with Other Commands

If you don’t put in a location, grep searches standard input (stdin). That is useful when you pipe (|) the output of another command to grep. A mundane example is:

ls -lha | grep buddy

This does a normal ls -lha and passes the result to grep. From there it only returns lines that have the word “buddy” in them.

To show the regular expression usage you can do:

ls -lha | grep ^d

This returns only results where the line starts with d. In the case of ls it means only directories are returned.
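
Since real ls output differs from machine to machine, the sketch below feeds grep a made-up listing through a pipe instead. The file names and owners are invented; the two grep calls mirror the ones above, with the pattern quoted so the shell leaves ^ alone.

```shell
# Made-up listing, standing in for real `ls -lha` output.
listing='drwxr-xr-x  2 buddy staff 4.0K Apr  2 10:00 src
-rw-r--r--  1 buddy staff  120 Apr  2 10:01 README
drwxr-xr-x  2 root  root  4.0K Apr  2 10:02 docs
-rw-r--r--  1 root  root    80 Apr  2 10:03 draft.txt'

# No location is given, so grep reads whatever the pipe sends it.
by_word=$(printf '%s\n' "$listing" | grep buddy)

# ^ anchors the match to the start of the line, so only directory
# entries (lines beginning with d) are kept.
dirs_only=$(printf '%s\n' "$listing" | grep '^d')

printf '%s\n' "$by_word"
printf '%s\n' "$dirs_only"
```

Note that draft.txt contains a d but is not returned by the second search, because its line starts with a dash.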

Grep’ing Files

Where you will probably spend most of your time is searching for text inside of files. Mostly you will need to know the file and line number of where the word you are looking for is located.

grep -rn hello .

This searches for hello in every file in the current directory and its subdirectories. For each match it shows the file name, the line number, and the matching line. The options are fairly easy to remember as well:

  • r – recursively search files
  • n – display line numbers
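
Here is a small sketch of what that output looks like. The directory layout and file contents are hypothetical, built in a temporary directory, and the sort is only there to make the order predictable.

```shell
# Hypothetical project tree, built in a temp directory for the demo.
project=$(mktemp -d)
mkdir -p "$project/src/utils"
printf 'first line\nsay hello here\n' > "$project/src/main.txt"
printf 'hello from a subdirectory\n' > "$project/src/utils/helper.txt"

# -r descends into subdirectories, -n adds the line number.
# Each result has the form  file:line:matching text
results=$(cd "$project" && grep -rn hello . | sort)
printf '%s\n' "$results"

# Clean up the scratch directory.
rm -rf "$project"
```

The match in main.txt is reported on line 2 and the one in helper.txt on line 1, each prefixed with its path relative to the directory searched.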

Excluding Directories

Sometimes you get too many results, or you get results from folders you don’t want to search. One of the projects I work on has a .svn folder that needs to stay, so I usually have to exclude that directory. Fortunately, it is easy.

grep -rn --exclude-dir=.svn hello .
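
A before-and-after sketch, assuming a grep that supports --exclude-dir (GNU grep does; other implementations may not). The project tree and file contents are invented for the demo.

```shell
# Hypothetical project with a .svn folder that also contains the word.
project=$(mktemp -d)
mkdir -p "$project/.svn" "$project/src"
printf 'hello from svn metadata\n' > "$project/.svn/entries"
printf 'hello from real code\n' > "$project/src/app.txt"

# Without the option, both files match.
all=$(cd "$project" && grep -rn hello . | sort)

# With --exclude-dir=.svn, grep skips that directory entirely.
filtered=$(cd "$project" && grep -rn --exclude-dir=.svn hello .)

printf 'all:\n%s\n' "$all"
printf 'filtered:\n%s\n' "$filtered"

# Clean up the scratch directory.
rm -rf "$project"
```

The option can be repeated, once per directory you want to skip.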

Conclusion

That is about all you need to know to get started with grep. It is an awesome tool with a lot more features, and you can do some crazy cool searches with it. It also actually helps you find elusive pieces of code.

groovecoder April 2, 2013 at 10:05 am

Another quick way to exclude some results is to pipe the first grep to a grep -v command:

grep -rn hello . | grep -v svn

I usually do this and start appending `grep -v` commands to narrow down the results:

grep -rn "translate" . | grep -v .svn | grep -v .txt …

I think grep must cache results somehow because usually the subsequent grep -v “filters” come back very fast.
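
One thing worth knowing about this approach, shown as a rough sketch with made-up search results: grep -v tests the whole output line (path and text), so a loose pattern can drop more than intended. The file names below are hypothetical.

```shell
# Made-up search results in grep's file:line:text format.
results='./src/app.py:3:translate(text)
./.svn/entries:9:translate(text)
./notes.txt:1:how to translate
./src/svn_tools.py:7:translate(log)'

# -v keeps only the lines that do NOT match. A bare "svn" also
# drops svn_tools.py, which is probably not what was wanted.
loose=$(printf '%s\n' "$results" | grep -v svn | grep -v '\.txt')

# Including the path separators targets only the .svn directory,
# and '\.txt:' only matches a file name ending in .txt.
strict=$(printf '%s\n' "$results" | grep -v '/\.svn/' | grep -v '\.txt:')

printf 'loose:\n%s\n' "$loose"
printf 'strict:\n%s\n' "$strict"
```

The loose filter leaves only app.py, while the strict one keeps both app.py and svn_tools.py.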

Buddy Lindsey April 2, 2013 at 11:24 am

That is a cool way to do it. I usually keep adding exclude-dir. Sometimes I have gotten a lot of them in there. lol.

sam April 4, 2013 at 8:12 am

If it’s directories of source code you are searching in, then it’s also worth looking at ack-grep. It’s a grep-like search tool that knows to ignore version control directories, backup files and various other things.

Buddy Lindsey April 4, 2013 at 8:27 am

That is really cool. I will check it out. I do find it annoying to manually have to exclude .svn, and sometimes .git, from my searches.

Sinai April 4, 2013 at 8:19 am

Thanks for the explanation, why doesn’t
ls -lha | grep bud*
work while
ls -lha | grep buddy
does work?

Buddy Lindsey April 4, 2013 at 8:43 am

It should work. However, I am not sure what other instances you are using * in. From what I have read, the * tries to match the preceding character and matches based on that. I am still not good at regular expressions, so I am not sure. Here is a bit more reading on it. Hopefully it helps.

http://www.regular-expressions.info/repeat.html

Sinai April 4, 2013 at 9:02 am

I was looking for object files – anything with a “.o” suffix.
ls -lha | grep *.o
found nothing while
ls -lha | grep whatever.o
found that file.
However
rm *.o
does erase all “.o” files.
Thanks for the response.

Br.Bill April 4, 2013 at 12:14 pm

The regular expression “*.o” won’t do what you expect. The * means “any number of the previous character”, but here there is no previous character for it to modify, so grep’s basic regular expressions treat it as a literal asterisk rather than a wildcard.

ALSO, if you are on Unix, your shell is expanding the glob “*.o” to ALL the names of files in your current directory that end with “.o”. So before grep can even run on it, it’s turning into something that looks like this:

ls -lha | grep file1.o file2.o obj.o

To get this to work with grep correctly, you need to use this:

ls -lha | grep '.*\.o'

because:
1) single quotes prevent the shell from pre-expanding the glob against local filenames it matches
2) in regular expressions, period (.) means “any character”
3) in regular expressions, asterisk (*) means “any number of the previous type of character”
4) in regular expressions, backslash (\) means “use the literal meaning of the next character instead of its regular expression value”.

Thus, '.*\.o' means: match any number of any character followed by a literal period and a lowercase letter O.
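
A small sketch of the difference, using a scratch directory with hypothetical file names. The set -- lines simply capture the arguments grep would receive, so the shell’s expansion can be seen without running grep on them.

```shell
# Scratch directory with two object files and one other file.
scratch=$(mktemp -d)
touch "$scratch/file1.o" "$scratch/obj.o" "$scratch/notes.txt"

# Unquoted: the shell replaces *.o with the matching file names
# before grep ever starts.
unquoted=$(cd "$scratch" && set -- grep *.o && printf '%s\n' "$*")

# Quoted: the pattern reaches grep exactly as typed.
quoted=$(cd "$scratch" && set -- grep '.*\.o' && printf '%s\n' "$*")

# The quoted regular expression finds both object files.
found=$(cd "$scratch" && ls | grep '.*\.o')

printf '%s\n' "$unquoted" "$quoted" "$found"

# Clean up the scratch directory.
rm -rf "$scratch"
```

In the unquoted case grep would be asked to search for the text file1.o inside the file obj.o, and would ignore the piped input altogether.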

Sinai April 5, 2013 at 7:56 am

It works, thanks!

tcmarsh April 4, 2013 at 9:16 am

bud*
used with grep will look for ‘bu’ followed by zero or more ‘d’ characters – it’s a regular expression, not the same as what * is used for in commands like ls. However, doing
ls -lha | grep bud
should give what you are looking for, as the regular expression matched does not have to be the whole line or even a whole word.
Alternatively,
ls -lha | grep 'bud.*'
should do more of what you’re used to, although it isn’t really necessary
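
To see what the * really repeats, here is a minimal sketch run against a made-up list of names. The patterns are quoted so the shell passes them to grep untouched.

```shell
# Made-up names to search through.
names='buddy
bud
bus
bob'

# bud* is "bu" followed by zero or more "d", so "bus" matches too.
star=$(printf '%s\n' "$names" | grep 'bud*')

# Plain bud matches anywhere in the line, no wildcard needed.
plain=$(printf '%s\n' "$names" | grep 'bud')

printf 'bud* matches:\n%s\n' "$star"
printf 'bud matches:\n%s\n' "$plain"
```

The first search returns buddy, bud and bus; the second returns only buddy and bud.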

Buddy Lindsey April 4, 2013 at 9:23 am

Such a good explanation. Thank you. I guess I misunderstood how it works when I was reading on it. I really need to sit down and really learn regular expressions thoroughly.

sam April 4, 2013 at 9:21 am

grep uses regular expression syntax.

using ‘?’ to mean any character, and ‘*’ to mean any number of any character, tends to be known as globbing in unix. the shell uses glob expansion, so that you can perform an action on many files at once.

“ls -lha | grep bud*” will probably give you very unexpected results. your shell will try to glob expand the bud* before running anything, so it will actually execute the command something like
ls -lha | grep buda budb budc
(depending on what files you have in that directory).

Bob April 4, 2013 at 10:01 am

What you really need to do is escape the * from the shell:
ls -lha | grep bud\*
will work the way you expect.

Bob April 4, 2013 at 10:05 am

Never mind… bud\* means bu followed by zero or more d… tcmarsh had it right above.
