Regular Expressions
The family of regular languages (also called or type ) is the simplest language family.
- Regular expressions can be ambiguous if there are multiple structurally different derivations that result in the same sentence. Sufficient conditions for ambiguity exist.
- Regular expressions can be extended with operators like union, concatenation, and star.
- A language on an alphabet is regular if it can be defined by a regular expression. So only if it is defined by concatenation, union and star over the elementary languages of ().
- Regular expressions have limits, as they cannot represent certain languages, such as those with unbalanced nesting or varying numbers of elements. To represent these languages, generative grammars must be used instead.
Regex
Regex basics:
x
thex
character.
any character except newline[xyz]
meansx
ory
orz
[a-z]
any character betweena
andz
[^a-z]
any character except those betweena
andz
Said R
a regular expression:
RS
concatenation ofR
andS
R|S
eitherR
orS
R*
zero or more occurrences ofR
R+
one or more occurrences ofR
R?
zero or one occurrence ofR
R{m,n}
a number orR
occurrences ranging fromn
tom
R{n,}
at least n occurrencesR{n}
exactly n occurrences ofR