Locating Data using the grep Command

Commands covered in this section:

  • grep: global regular expression print
  • grep lets you search the contents of a file, or of every file in a directory, without first opening the file in an editor.
  • It searches one line at a time for the specified regular expression.
  • Examples:

    Searching for text in the output of another command

    Searching for text in a file

    The syntax for the grep command:

    grep [options] regular expression [file1 file2 ... fileN]
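As a minimal sketch of the two usages listed above, the following searches a file and then the output of another command. The sample file and its contents are made up for illustration.

```shell
# Create a small sample file (contents are illustrative only).
sample=$(mktemp)
printf '%s\n' 'alpha dev' 'beta' 'gamma device' > "$sample"

# Search a file: count and print the lines containing "dev".
file_hits=$(grep -c dev "$sample")
grep dev "$sample"

# Search the output of another command: same idea, read from a pipe.
pipe_hit=$(printf '%s\n' 'alpha dev' 'beta' | grep beta)
echo "$pipe_hit"

rm -f "$sample"
```

When no file is named, grep reads standard input, which is what makes the pipe form work.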

    What is a Regular Expression?

    Examples of a simple regular expression

  • ^p is used to find files in the current directory which begin with the letter 'p'
  • and count the number of lines output
  • The metacharacter '^' means the beginning of a line.
  • $ grep dev /etc/*
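The '^p' search described above might look like the following sketch. It uses a scratch directory with made-up file names rather than the current directory, so the result is predictable.

```shell
# Build a scratch directory with known file names (names are illustrative).
dir=$(mktemp -d)
touch "$dir/passwd" "$dir/profile" "$dir/hosts" "$dir/group"

# List the entries whose names begin with 'p'.
ls "$dir" | grep '^p'

# Count the number of lines output by piping into wc -l.
count=$(ls "$dir" | grep '^p' | wc -l)
echo "$count"

rm -rf "$dir"
```

Each file name from ls arrives on its own line, so '^' anchors the match to the first character of the name.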

    Options to the grep command

    $ grep -l dev /etc/*
     
  • NOTE: You will also get errors if you don't have permission to read the files or if the file is a directory.
  • Screen out errors by redirecting stderr to /dev/null
  • $ grep -l dev /etc/* 2> /dev/null
    $ grep -l dev /etc/* > /tmp/myout 2>&1
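The effect of -l and of the two redirections can be tried safely on a scratch directory instead of /etc. This is a sketch with made-up file names. The "|| true" is only there because grep returns a non-zero status when it hits an error.

```shell
# Scratch directory with two files and one subdirectory (all names illustrative).
dir=$(mktemp -d)
echo 'device=dev0' > "$dir/a.conf"
echo 'nothing here' > "$dir/b.conf"
mkdir "$dir/sub"

# -l prints only the names of files that contain a match.
# Any complaint about the subdirectory goes to stderr, which is discarded.
matches=$(grep -l dev "$dir"/* 2> /dev/null || true)
echo "$matches"

# Alternatively, collect stdout and stderr together in one file.
grep -l dev "$dir"/* > "${dir}.out" 2>&1 || true
cat "${dir}.out"

rm -rf "$dir" "${dir}.out"
```

The order matters in the second form: "> file" must come before "2>&1" so that stderr follows stdout into the file.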

    Metacharacters

    In one of the examples above, we searched for filenames beginning with the letter p using the regex (regular expression) 'grep ^p'.  The '^' is a metacharacter.
    Metacharacter, meaning in grep, and meaning in the shell:

    .     grep:  match any single character
                 grep -c '.' somefile
                 counts the number of non-blank lines in the file
          shell: when followed by a filename, execute that file in the current shell

    *     grep:  match zero or more of the preceding character
                 grep -n '.*' somefile
                 numbers all lines (including blanks) in the file
          shell: match zero or more characters

    ^     grep:  match beginning of line
                 ls /etc | grep '^p'
                 also used for negation inside brackets
                 grep 'q[^u]' file1
                 matches q not followed by u
          shell: Bourne shell pipe symbol

    $     grep:  match end of line
                 ps aux | grep 'd]$'
          shell: shell variable, also the user prompt

    \     grep:  escape the character following
                 grep '\*' /etc/crontab
                 matches all the lines containing a *
          shell: escape the character following

    [ ]   grep:  match one character from this set or range
                 grep '[Yy]ou' somefile
                 ls /etc/rc.d | grep 'rc[0-6]'
          shell: match one character from this set or range

    { }   grep:  match a specific number of the preceding character, or between a min and max number
                 ls | egrep '^K[0-9]{3}'
                 would match files starting with K followed by 3 digits

    +     grep:  match one or more of the preceding character
                 ls | egrep 'K[0-9]+'

    ?     grep:  match zero or one of the preceding character
                 egrep 'colou?r' somefile
          shell: match exactly one character
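The three extended metacharacters { }, + and ? can be compared side by side on a few made-up lines. This sketch uses grep -E, which accepts the same extended patterns as egrep.

```shell
# Illustrative input lines.
names=$(printf '%s\n' K001 K12 S123 color colour colouur)

# {3}: K followed by three digits at the start of the line.
three=$(echo "$names" | grep -E '^K[0-9]{3}')
# +: K followed by one or more digits.
plus=$(echo "$names" | grep -E -c 'K[0-9]+')
# ?: the u is optional, so both "color" and "colour" match.
opt=$(echo "$names" | grep -E -c 'colou?r')

echo "$three"
echo "$plus"
echo "$opt"
```

Here -c is used to print a count of matching lines rather than the lines themselves.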

    Because the shell interprets some of these characters itself, it would have been safer in the previous example to quote the pattern:

    $ ls /etc | grep '^p' | wc -l
    Find filenames with patterns at the end of their names.
    $ ls /etc | grep 'p$'
    Find lines with patterns at the end of lines
    $ grep 'bash$' /etc/passwd
    What would the following command print to the screen?
    $ grep -v 'bash$' /etc/passwd
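To see how the '$' anchor and the -v option behave without touching the real /etc/passwd, here is a sketch on a made-up passwd-style file. The accounts in it are hypothetical.

```shell
# A made-up passwd-style file.
pw=$(mktemp)
cat > "$pw" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
pattyo:x:1000:1000::/home/pattyo:/bin/bash
EOF

# Count the lines that end in "bash".
bash_users=$(grep -c 'bash$' "$pw")
# -v inverts the match: print the lines that do NOT end in "bash".
others=$(grep -v 'bash$' "$pw")

echo "$bash_users"
echo "$others"
rm -f "$pw"
```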

    The shell and metacharacters

    $ who | grep $USER
    pattyo   tty1     Nov  3 01:02
    pattyo   pts/0    Nov  3 01:03 (:0)
    pattyo   pts/1    Nov  3 01:03 (:0)
    pattyo   pts/2    Nov  3 01:03 (:0)
    $ ps aux | grep '^$USER'
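The difference between the two commands above comes down to quoting. In this sketch a variable called name (an assumption, standing in for $USER so the result does not depend on who is logged in) is searched for with double and with single quotes.

```shell
# "name" stands in for $USER; the input lines are illustrative.
name=pattyo
lines=$(printf '%s\n' 'pattyo   tty1' 'root     tty2')

# Double quotes: the shell expands $name before grep sees the pattern.
expanded=$(echo "$lines" | grep -c "^$name")
# Single quotes: grep receives the characters ^$name unchanged,
# so no line matches.
literal=$(echo "$lines" | grep -c '^$name' || true)

echo "$expanded"
echo "$literal"
```

Single quotes protect every metacharacter from the shell. Double quotes protect most of them but still allow variable expansion.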

    Instructing grep to treat a metacharacter as ordinary.

    Selecting only blank lines from a file:
    $ grep '^$' /etc/hosts
    Printing out a file without the comment lines:
    $ grep -v '^#' /etc/hosts
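Both searches can be tried on a small made-up hosts-style file, as in this sketch.

```shell
# A made-up hosts-style file: one comment, one blank line, two entries.
hosts=$(mktemp)
cat > "$hosts" <<'EOF'
# local entries
127.0.0.1 localhost

192.168.0.10 fileserver
EOF

# '^$' matches lines with nothing between start and end: blank lines.
blank=$(grep -c '^$' "$hosts")
# -v '^#' keeps every line that does not start with '#'.
no_comments=$(grep -v -c '^#' "$hosts")

echo "$blank"
echo "$no_comments"
rm -f "$hosts"
```

Note that the blank line survives the second search, because it does not start with '#'.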
    Searching for the word 'the' or the word 'The'
    Locating a range of characters
    $ ls /etc/rc.d | grep 'rc[0-6].d'
    rc0.d
    rc1.d
    rc2.d
    rc3.d
    rc4.d
    rc5.d
    rc6.d
    $ egrep 'sd[a-z][0-9]+' /etc/fstab
    /dev/sdb5    swap      swap    defaults   0 0
    /dev/sda5    swap      swap    defaults   0 0
    /dev/sdb1   /home      ext3    defaul     1 2
    NOTE: If we used a star (*) instead of a plus (+), the pattern would match even if there were no numbers after the [a-z] character. The plus (+) matches only if at least one number exists. The star (*) matches zero or more numbers.
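The difference between plus and star described in the note can be checked with a sketch like this one. The device names are illustrative, and grep -E is used in place of egrep.

```shell
# Illustrative device names; "sdc" has no digit after the letter.
devs=$(printf '%s\n' /dev/sda5 /dev/sdb1 /dev/sdc)

# +: at least one digit is required, so /dev/sdc is skipped.
with_plus=$(echo "$devs" | grep -E -c 'sd[a-z][0-9]+')
# *: zero digits is acceptable, so all three lines match.
with_star=$(echo "$devs" | grep -E -c 'sd[a-z][0-9]*')

echo "$with_plus"
echo "$with_star"
```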

    Using brackets to search for a literal metacharacter.
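A metacharacter can be made ordinary either with a backslash or by placing it inside brackets. The following sketch shows both on a few made-up crontab-style lines.

```shell
# Illustrative input lines.
lines=$(printf '%s\n' '0 * * * * root run-parts' '30 2 1 1 1 root yearly' 'version 1.5')

# Backslash: \* is a literal asterisk.
star_bs=$(echo "$lines" | grep -c '\*')
# Brackets: inside [ ], the * and . characters lose their special meaning.
star_br=$(echo "$lines" | grep -c '[*]')
dot_br=$(echo "$lines" | grep -c '[.]')

echo "$star_bs"
echo "$star_br"
echo "$dot_br"
```

The bracket form is convenient when the backslash itself would need extra quoting to get past the shell.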

    Searching for a specific word.

    $ grep '\<[Tt]he\>' file
    $ grep '\<the' file
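The word anchors \< and \> can be compared with an unanchored search on a few made-up lines. This sketch assumes a grep that supports \< and \>, as GNU grep does.

```shell
# Illustrative text.
text=$(printf '%s\n' 'The cat' 'other things' 'see the dog' 'then go')

# \<[Tt]he\> matches "the" or "The" only as a whole word.
whole=$(echo "$text" | grep -c '\<[Tt]he\>')
# \<the matches any word that starts with "the", so "then" counts too.
prefix=$(echo "$text" | grep -c '\<the')
# Without anchors, the "the" inside "other" also matches.
loose=$(echo "$text" | grep -c '[Tt]he')

echo "$whole"
echo "$prefix"
echo "$loose"
```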

    Matching any Character.

    $ grep '.' file
    $ grep -v '.' file
    $ grep '\<.he\>' file
    $ grep 's.*s' file
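The four commands above can be tried on a small made-up input, counting matches with -c. As before, \< and \> assume a grep that supports them.

```shell
# Illustrative text; the second line is empty.
text=$(printf '%s\n' 'she sells shells' '' 'the end' 'no match here')

# '.' needs at least one character, so it selects the non-empty lines.
nonblank=$(echo "$text" | grep -c '.')
# -v '.' selects the lines with no characters at all.
blank=$(echo "$text" | grep -v -c '.')
# \<.he\> matches three-letter words ending in "he" (she, the).
he_words=$(echo "$text" | grep -c '\<.he\>')
# s.*s matches an s, then anything, then another s on the same line.
s_to_s=$(echo "$text" | grep -c 's.*s')

echo "$nonblank"
echo "$blank"
echo "$he_words"
echo "$s_to_s"
```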

    egrep

    egrep can combine the two searches above (blank lines and comment lines) into a single pattern. Examples:
    $ egrep -v '^#|^$' /etc/syslog.conf

    *.err;kern.debug;daemon.info;user.none;local3.none /var/log/syslog
    *.alert;kern.err;daemon.err root
    *.emerg *
    mail.debug /var/log/syslogs/sendmail
    auth.info /var/log/syslogs/authlog
    authpriv.debug /var/log/secure
    *.info @nloghost
    local0.info /var/log/syslogs/poplog
    local1.info /var/log/syslogs/local1
    local2.info /var/log/syslogs/sudo.log
    local3.info /var/log/syslogs/local3
    local4.info /var/log/syslogs/local4
    local5.info /var/log/syslogs/local5
    local6.debug /var/log/syslogs/local6
    local7.debug /var/log/syslogs/local7
    $ cd /var/mail
    $ egrep '^(From|Subject):' jars
    $ ps aux | egrep 'd$|d]$'
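The alternation patterns above can be tried without a real syslog.conf or mailbox. This sketch feeds made-up lines to grep -E, which accepts the same patterns as egrep.

```shell
# Made-up config-style and mailbox-style input.
conf=$(printf '%s\n' '# comment' '' 'mail.debug /var/log/mail' 'auth.info /var/log/auth')
mbox=$(printf '%s\n' 'From: a@example.com' 'To: b@example.com' 'Subject: hello' 'body text')
procs=$(printf '%s\n' '/usr/sbin/sshd' '[kthreadd]' 'bash')

# Count the lines that are neither comments nor blank, in a single pass.
active=$(echo "$conf" | grep -E -v -c '^#|^$')
# Parentheses group the alternatives in front of the shared ':'.
headers=$(echo "$mbox" | grep -E -c '^(From|Subject):')
# Lines ending in "d" or in "d]".
daemons=$(echo "$procs" | grep -E -c 'd$|d]$')

echo "$active"
echo "$headers"
echo "$daemons"
```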