Using regex and Linux commands (grep or egrep?) to find specific strings
Note: I'm not sure my regexes are correct, since my school textbook doesn't teach regexes in this form, only in the mathematical form used for DFAs/NFAs.
I'd appreciate any suggestions or hints.
Question:
(a) Find occurrences of 3-letter words in a text that begin with 'a' and end with 'e'.
(b) Find occurrences of words in a text that begin with 'm' and end with 'r'.
My approach:
a) ^[a][a-zA-Z][e]$
(How do I distinguish between 3-letter words and words of any length?)
b) ^[m][a-zA-Z][r]$
I also want to use these regexes on Linux. Would the following command work?
grep '^[a][a-zA-Z][e]$' 'usr/dir/.../text.txt'
Or should I use egrep, like this:
find . -name "*.txt" -print0 | xargs -0 egrep '^[a][a-zA-Z][e]$'
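The anchors ^ and $ match the start and end of a line, so your patterns only match when the word is alone on its line. You can use the word boundary \b to match the start and end of a word instead: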
a) Find occurrences of 3-letter words in a text that begin with 'a' and end with 'e':
grep -o '\ba[a-zA-Z]e\b'
The pattern matches a word boundary, followed by a, then a single letter, then e, then a word boundary.
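As a quick check of pattern (a), here is a minimal sketch run against a made-up sentence. The sample text and variable names are assumptions for illustration, and \b is a GNU grep extension, so other grep implementations may behave differently.

```shell
# Hypothetical sample text (not from the question).
sample='The ape ate an apple; one axe, are, a3e, ace.'

# \b...\b restricts the match to whole words; -o prints only the matches.
three_letter=$(printf '%s\n' "$sample" | grep -o '\ba[a-zA-Z]e\b')
printf '%s\n' "$three_letter"
```

Here "apple" is skipped because it is longer than three letters, and "a3e" is skipped because the middle character is not a letter.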
b) Find occurrences of words in a text that begin with 'm' and end with 'r':
grep -o '\bm[a-zA-Z]*r\b'
The pattern matches a word boundary, then m, then zero or more letters (through the * quantifier), then r, and a word boundary again.
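The same kind of check for pattern (b), again on an invented sentence (the sample text and variable names are assumptions). Because * allows zero letters between m and r, the two-letter word "mr" also counts as a match.

```shell
# Hypothetical sample text (not from the question).
sample='mr miller met a mother near the mirror'

# m, then any number of letters (including none), then r, as a whole word.
m_words=$(printf '%s\n' "$sample" | grep -o '\bm[a-zA-Z]*r\b')
printf '%s\n' "$m_words"
```

If a minimum length is wanted, one option would be to replace * with \{1,\} so that at least one letter must sit between m and r.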
Further, I'm using the option -o, which outputs every match on its own line rather than outputting the whole line of input that contains the match.
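To see what -o changes, this sketch runs the same pattern with and without it on a small, made-up three-line input (the text and variable names are assumptions).

```shell
# Hypothetical three-line input.
text='an ape and an axe
no match here
ate late'

# With -o: one output line per match.
with_o=$(printf '%s\n' "$text" | grep -o '\ba[a-zA-Z]e\b')

# Without -o: every input line that contains at least one match.
without_o=$(printf '%s\n' "$text" | grep '\ba[a-zA-Z]e\b')

printf 'with -o:\n%s\n\nwithout -o:\n%s\n' "$with_o" "$without_o"
```

Since the question asks for occurrences, piping the -o output through wc -l counts matches, whereas grep -c would count matching lines only.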
Btw, the option -w (match whole words only) can simplify the above patterns to:
a)
grep -wo 'a[a-zA-Z]e'
and b)
grep -wo 'm[a-zA-Z]*r'
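A small sketch comparing the \b form with the -w form on an invented sample (the text and variable names are assumptions). For these two patterns both forms should select the same words.

```shell
# Hypothetical sample text; "taxes" and "hammer" contain candidate
# substrings ("axe", "mer") that must not be reported.
sample='ape apple taxes mr miller hammer'

a_b=$(printf '%s\n' "$sample" | grep -o  '\ba[a-zA-Z]e\b')
a_w=$(printf '%s\n' "$sample" | grep -wo 'a[a-zA-Z]e')

m_b=$(printf '%s\n' "$sample" | grep -o  '\bm[a-zA-Z]*r\b')
m_w=$(printf '%s\n' "$sample" | grep -wo 'm[a-zA-Z]*r')

printf '%s\n' "$a_w" "$m_w"
```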
Thanks @anubhava!
You asked about egrep. egrep can't simplify or optimize these patterns; grep is absolutely fine.
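To illustrate that extended regex syntax makes no difference here, this sketch runs pattern (b) with plain grep and with grep -E, the option form of egrep. The sample text and variable names are assumptions.

```shell
# Hypothetical sample text.
sample='an ape and a mirror'

# The pattern uses no operators whose syntax differs between
# basic and extended regular expressions, so both give the same result.
basic=$(printf '%s\n' "$sample" | grep -o  '\bm[a-zA-Z]*r\b')
extended=$(printf '%s\n' "$sample" | grep -Eo '\bm[a-zA-Z]*r\b')

printf 'basic: %s\nextended: %s\n' "$basic" "$extended"
```

Recent GNU grep releases print a warning that egrep is obsolescent, so grep -E is the safer spelling if extended syntax is ever needed.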