Linux Command Line - Regular Expressions
Egrep
egrep is a program which will search a given set of data and print every line which contains a given pattern. It is an extension of a program called grep. It’s name is odd but based upon a command which did a similar function, in a text editor called ed. It has many command line options which modify it’s behaviour so it’s worth checking out it’s man page. ie the -v option tells grep to instead print every line which does not match the pattern.
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ egrep 'mellon' data.txt
Mark watermellons 12
Oliver rockmellons 2
-v to tells egrep to instread print every line which does not match the pattern
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ egrep -v 'mellon' data.txt
Fred apples 20
Susy oranges 5
Robert pears 4
Terry oranges 9
Lisa peaches 7
Susy oranges 12
Mark grapes 39
Anne mangoes 7
Greg pineapples 3
Betty limes 14
check their line number as well
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ egrep -n 'mellon' data.txt
3:Mark watermellons 12
11:Oliver rockmellons 2
check how many lines did match
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ egrep -c 'mellon' data.txt
2
Regular expression overview
identify any line with two or more vowels in a row
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ egrep '[aeiou]{1,}' data.txt
Fred apples 20
Susy oranges 5
Mark watermellons 12
Robert pears 4
Terry oranges 9
Lisa peaches 7
Susy oranges 12
Mark grapes 39
Anne mangoes 7
Greg pineapples 3
Oliver rockmellons 2
Betty limes 14
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ egrep '[aeiou]{2,}' data.txt
Robert pears 4
Lisa peaches 7
Anne mangoes 7
Greg pineapples 3
any line with a 2 on it which is not the end of the line
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ egrep '1.+' data.txt
Mark watermellons 12
Susy oranges 12
Betty limes 14
- $ - matches the end of the line.
- ^ - matches the beginning of the line.
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ egrep '2$' data.txt
Mark watermellons 12
Susy oranges 12
Oliver rockmellons 2
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ egrep '^F' data.txt
Fred apples 20
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ egrep '^[A-D]' data.txt
Anne mangoes 7
Betty limes 14
each line which contains either ‘Fred’ or ‘Robert’ or ‘Susy’
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ egrep 'Fred|Robert|Susy' data.txt
Fred apples 20
Susy oranges 5
Robert pears 4
Susy oranges 12
grep
grep [options] pattern [files]
Note that if pattern does not include spaces or any other special characters then you don’t need to use quotes {“.info}
- -i: Ignored, case for matching
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep -i "fred" data.txt
Fred apples 20
- -c : this prints only a count of the lines that match a pattern
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep -c "oranges" data.txt
3
- -l : display list of a filenames only, we can just display the files that contains the given string/pattern
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep -l "apple" data.txt data.txtsearch all files in current directory:
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep -l "Fred" * data.txt grep: test1: Is a directory - -w : display only the matched pattern, by default, grep displays the entire line which has the matched string.
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep -l "apple" data.txt
data.txt
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep -w "apple" data.txt
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep -w "apples" data.txt
Fred apples 20
- -o : displays only the matched string by using the -o option
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep -o "Fred" data.txt:Fred - -n : show line number while displaying the output
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep -n "oranges" data.txt
2:Susy oranges 5
5:Terry oranges 9
7:Susy oranges 12
- -v : displays lines that are not matched with the specified search string pattern
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep -v "^[A-K]" data.txt
Susy oranges 5
Mark watermellons 12
Robert pears 4
Terry oranges 9
Lisa peaches 7
Susy oranges 12
Mark grapes 39
Oliver rockmellons 2
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep "^[M-Z]" data.txt
Susy oranges 5
Mark watermellons 12
Robert pears 4
Terry oranges 9
Susy oranges 12
Mark grapes 39
Oliver rockmellons 2
showing lines that end with a string
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep "2$" data.txt
Mark watermellons 12
Susy oranges 12
Oliver rockmellons 2
- -e : specifies expression with -e option, can use multiple times:
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep -e "Fred" -e "Mark" data.txt Fred apples 20 Mark watermellons 12 Mark grapes 39
Display directories:
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ ls -l | grep "^d"
drwxrwxr-x 2 hadley hadley 4096 Okt 20 21:33 test1
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ ls -l | grep "^-"
-rw-rw-r-- 1 hadley hadley 197 Okt 21 19:51 data.txt
-rwxrwxrwx 1 hadley hadley 92 Okt 20 21:28 file.txt
-rw-rw-r-- 1 hadley hadley 0 Okt 20 21:42 test.cpp
-rw-rw-r-- 1 hadley hadley 0 Okt 20 21:01 ttt
-rw-rw-r-- 1 hadley hadley 0 Okt 20 21:42 x
-rw-rw-r-- 1 hadley hadley 0 Okt 20 21:43 x1
-rw-rw-r-- 1 hadley hadley 0 Okt 20 21:42 y
^: use^with[], the pattern must not contain any character in the set specified
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep "^[^A-K]" data.txt
Susy oranges 5
Mark watermellons 12
Robert pears 4
Terry oranges 9
Lisa peaches 7
Susy oranges 12
Mark grapes 39
Oliver rockmellons 2
. to match any one character
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep "pears.4" data.txt
Robert pears 4
\ to ignore the special meaning of the character following it
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep 'test\.\\test\[ss\]' data.txt
test.\test[ss]
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep -F "test.\test[ss]" data.txt
test.\test[ss]
* : zero or more occurrences of the previous character
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep "a*b*c*d*e" data.txt
Fred apples 20
Susy oranges 5
Mark watermellons 12
Robert pears 4
Terry oranges 9
Lisa peaches 7
Susy oranges 12
Mark grapes 39
Anne mangoes 7
Greg pineapples 3
Oliver rockmellons 2
Betty limes 14
test.\test[ss]
aaabbbbccdddddeee
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep "aaab*c*d*e" data.txt
aaabbbbccdddddeee
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep Betty *.txt
data.txt:Betty limes 14
Note the difference between * in pattern and * in files.
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep "*tt" data.txt
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep "tt" data.txt
Betty limes 14
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep "B*tt" data.txt
Betty limes 14
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep tt *.txt
data.txt:Betty limes 14
check how many processed are running on your system as youtube-music
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ ps -ef | grep -c youtube-music
11
- Recursive Search
search all files inside current directory:
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep -r Betty *
data.txt:Betty limes 14
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ ls -l test1/
total 4
-rw-rw-r-- 1 hadley hadley 230 Okt 21 21:09 data.txt
-rw-rw-r-- 1 hadley hadley 0 Okt 20 19:59 test1.txt
-rw-rw-r-- 1 hadley hadley 0 Okt 20 20:03 test2.txt
-rw-rw-r-- 1 hadley hadley 0 Okt 20 21:32 test3.cpp
-rw-rw-r-- 1 hadley hadley 0 Okt 20 21:33 test4.py
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep -r Betty *
data.txt:Betty limes 14
test1/data.txt:Betty limes 14
search files with certain extension:
grep -inr --include=\*.extension 'searchterm' ./
-
-r: recursively
-
-i: ignore-case
-
-n: each output line is preceded by its relative line number in the file
-
–include: all
*.txt: text files -
./: Start at current directory.
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep -inr --include=*.txt 'Betty' ./
./test1/data.txt:12:Betty limes 14
./data.txt:12:Betty limes 14
Note difference below:
Note the difference between -r and -R and difinition of symbolic links:
https://linuxize.com/post/how-to-use-grep-command-to-search-files-in-linux/
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep -r Betty *
data.txt:Betty limes 14
test1/data.txt:Betty limes 14
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep -r Betty *.txt
data.txt:Betty limes 14
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ grep -r Betty .
./test1/data.txt:Betty limes 14
./data.txt:Betty limes 14
search file name with certain extension recursively:
hadley@hadley-MacBookPro:~/Developments/parentDirectory$ find . -iname '*.txt'
./file.txt
./test1/test2.txt
./test1/test1.txt
./test1/data.txt
./data.txt
Note that iname does a case insensitive search.