Regular expressions

Subject: Compilers
Links: Strings and Grammars

Assume that two expressions e1 and e2 generate the languages L1 and L2, respectively. Concatenation, is then defined as e1e2:={xyxL1,yL2}. Alternation, which is denoted as | is the union of the two languages, e1|e2:={xL1xL2}. Closure, which is represented by braces { }, denotes the repetition of the expression n times for n<ω, {e1}:={xL1}.

Def: Regular expressions are those expression that can be constructed from the following rules:

Def: Two regular expression are equal (=) or equivalentif they denote the same language.

Finite-State Automatons

Def: A deterministic finite-state automaton is a 5-tuple (K,VT,M,S,Z) where:

A sentence t is said to be accepted by a DFA if M(S,t)=P for some DFA F=(K,VT,M,S,Z) such that tVT and PZ. Thus we define $$L(F) := {t \mid M(S, t) \in Z \land t\in V_T^*}.$$

Nondeterministic Finite-State Automatons

Def: A nondeterministic finite-state automaton is 5-tuple (K,VT,M,S,Z) where:

Th: Let F=(K,VT,M.S,Z) be a nondeterministic finite-state automaton accepting a set of strings L. Define a DFA F=(K,VT,M,S,Z) as follows:

Def: A nondeterministic finite-state automaton with ε-transitions is a 5-tuple (K,VT,M.S,Z), where K, VT, S and Z are the same as for a NFA and M is a mapping K×(VT{ε}) into 2K.

Def: ε-Closure is a set of states of a NFA with ε-transitions, say F such that ε-closure for some state of F, call it q, includes all attainable states from the state by making state transition with ε-transitions. This is denoted by εClosure(q).

Th: Define F=(K,VT,M.S,Z) as a NFA with ε-transitions. Then there exists an NFA F without ε-transitions such that L(F)=L(F).

Th: There exists a nondeterministic finite-state auntomaton F=(K,VT,M.S,Z) which accepts the language generated by the regular grammar G=(VN,VT,S,Φ).

Th: There exists a regular grammar G=(VN,VT,S,Φ) which produces the language accepted by a given deterministic finite-state automaton F=(K,VT,M.S,Z).