Using regex and Linux commands (grep or egrep?) to find specific strings


Note: I'm not sure my regexes are correct, since my textbook at school doesn't teach regexes in this form; it only covers the more mathematical form, such as DFAs and NFAs.

I'd appreciate any suggestions or hints.

Question:

(a) Find all occurrences of 3-letter words in a text that begin with 'a' and end with 'e';

(b) Find all occurrences of words in a text that begin with 'm' and end with 'r';

My approach:

a) ^[a][a-zA-Z][e]$ (how do I distinguish between 3-letter words and words of any length?)

b) ^[m][a-zA-Z][r]$

I also want to use these regexes on Linux. Would the following command work?

grep '^[a][a-zA-Z][e]$' 'usr/dir/.../text.txt'

Or should I use egrep, like this:

find . -name "*.txt" -print0 | xargs -0 egrep '^[a][a-zA-Z][e]$'

You can use the word boundary \b to match the start and end of a word:

a) Find all occurrences of 3-letter words in a text that begin with 'a' and end with 'e':

grep -o '\ba[a-zA-Z]e\b'

The pattern matches a word boundary, followed by an a, followed by a single letter, followed by an e, followed by a word boundary.
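To see pattern a) in action, here is a small sketch. The sample sentence and the variable names are made up for illustration, and it assumes a grep that supports \b and -o (GNU grep does).

```shell
# Hypothetical sample text: only "ate", "ace" and "axe" are 3-letter words
# that start with 'a' and end with 'e'.
sample='we ate the ace; apple and axe were awake'

# \b anchors both ends, so longer words such as "apple" or "awake" are skipped.
three_letter=$(printf '%s\n' "$sample" | grep -o '\ba[a-zA-Z]e\b')
printf '%s\n' "$three_letter"
```

Each match is printed on its own line; "apple" and "awake" are not reported because the e is not followed by a word boundary two characters after the a.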

b) Find all occurrences of words in a text that begin with 'm' and end with 'r':

grep -o '\bm[a-zA-Z]*r\b'

The pattern matches a word boundary, followed by an m, followed by zero or more letters (through the * quantifier), followed by an r, followed by a word boundary again.
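A quick check of pattern b), again on a made-up sentence (the sample text and variable names are assumptions, not from the question). It includes the two-letter word "mr" to show the "zero letters" case of the * quantifier.

```shell
# Hypothetical sample text with words of different lengths.
sample='my mother met mr miller near the river mart'

# "met" and "mart" start with m but do not end with r;
# "river" ends with r but does not start with m.
m_to_r=$(printf '%s\n' "$sample" | grep -o '\bm[a-zA-Z]*r\b')
printf '%s\n' "$m_to_r"
```

Note that the match is case-sensitive as written; adding -i would also pick up words like "Miller".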


Further, I'm using the option -o, which outputs every match on its own line rather than outputting the whole line of input that contains the match.
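The difference is easiest to see side by side. This is only a sketch on a one-line, made-up input; the variable names are illustrative.

```shell
# Hypothetical one-line input containing two matching words.
line='she ate the ace'

# Without -o, grep prints the whole input line once.
whole_lines=$(printf '%s\n' "$line" | grep '\ba[a-zA-Z]e\b')

# With -o, grep prints only the matched parts, one per output line.
only_matches=$(printf '%s\n' "$line" | grep -o '\ba[a-zA-Z]e\b')

printf 'without -o:\n%s\n' "$whole_lines"
printf 'with -o:\n%s\n' "$only_matches"
```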


By the way, the option -w (match whole words only) can be used to simplify the above patterns to:

a)

grep -wo 'a[a-zA-Z]e'

and b)

grep -wo 'm[a-zA-Z]*r'

Thanks @anubhava!
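If you want to convince yourself that the -w form finds the same words as the \b form, a comparison along these lines should do. The sample text and variable names are invented for the example.

```shell
# Hypothetical sample text.
sample='a major mover made the motor murmur'

# Same search written with explicit word boundaries and with -w.
with_b=$(printf '%s\n' "$sample" | grep -o '\bm[a-zA-Z]*r\b')
with_w=$(printf '%s\n' "$sample" | grep -wo 'm[a-zA-Z]*r')

if [ "$with_b" = "$with_w" ]; then
    echo "both forms found the same words:"
    printf '%s\n' "$with_w"
fi
```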


You also asked about egrep. egrep (equivalent to grep -E) would not simplify or optimize these patterns, so plain grep is absolutely fine.
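Plain grep also slots into a find/xargs pipeline like the one in your question. The following is a sketch under assumptions: it builds a throwaway directory with two made-up files instead of using your real path, and it adds -h (suppress file names) and sort so the output is stable.

```shell
# Throwaway directory with two hypothetical sample files.
tmpdir=$(mktemp -d)
printf 'we ate an axe\n' > "$tmpdir/one.txt"
printf 'the ace and the apple\n' > "$tmpdir/two.txt"

# find selects the .txt files by name; -print0 and -0 keep odd file names intact.
# -h drops the file-name prefix, and sort makes the order independent of
# the order in which find visits the files.
found=$(find "$tmpdir" -name '*.txt' -print0 | xargs -0 grep -woh 'a[a-zA-Z]e' | sort)
printf '%s\n' "$found"

# Clean up the temporary files.
rm -rf "$tmpdir"
```

Drop -h if you want each match prefixed with the file it came from.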

