N or Space for next slide, P for previous slide, F for full screen, M for menu

Lec 9: Restricted Computational Models

Boaz Barak (instructor), Brian Sapozhnikov (Head TF), Albert Chalom, Alexis Ross, Charles O’Mara, Daniel Chen, David Li , Jambay Kinley, Jane Ahn, John Shen, Josh Seides, Laura Pierson, Pranay Tankala, Raymond Lin, Theresa Nguyen, Wanqian Yang, William Fu, Hikari Sorensen (Extension TF), Eric Lu (Patel Fellow)


Laptops/phones only on last 5 rows please.

Midterm

Suggestions if midterm grade lower than pset grade:

  1. Focus on reading and sections to get conceptual understanding.
  1. Practice parsing the question and spending time with it on your own.
  1. Defer going to office hours.
  1. Make sure you have true collaboration with your partners.

Phases of this course

Phase 0: Strings, functions, sets, representations.
Take away: Comfort manipulating, sets, functions, functions as objects and inputs.

Phase 1: Computing finite aka fixed input functions (NAND, Boolean circuits)
T/A: Distinguish functions and programs, programs as strings and inputs to other programs.

Phase 2: Computing infinite aka unbounded input functions (NAND++,NAND<<,Turing machines,..)
T/A 1: Still, distinguish functions and programs, and programs as strings. Some functions can’t even be computed!
T/A 2: Unimportance of computational model. Think in algorithms not NAND++ (or Python/C/OCaml)
T/A 3: Reductions as way to relate problems’ difficulty.

Next up: Phase 3: Efficient computation, P and NP and Phase 4: Probabilistic computation.

Sipser’s book

Phases 2 (infinite functions) and 3 (efficient computation, P and NP) well covered in Sisper’s book.

Translate Languages to Functions

This courseSipser
Computational task:Computing a function \(F:\{0,1\}^*\rightarrow \{0,1\}\)Deciding a language \(L \subseteq \{0,1\}^*\)
Input:\(x\in \{0,1\}^*\)\(x\in \{0,1\}^*\)
Goal:Compute \(F(x)\)Decide if \(x\in L\)
(output \(1\) for “Yes”, \(0\) for “No”)

Ladder of abstraction

Turing equivalence

NAND++, NAND<<, Turing machines, RAM machines, \(\lambda\) calculus, Game of life, Scratch, C, Fortran, Javascript, PHP, Python, Java, Julia, Swift, Ruby, Rust, Go, Perl, Fortran, Lisp, Haskell, OCaml, ..

Domain Specific Languages (DSL): HTML, SQL, Matlab, bash, make, UnrealScript, ANTLR , JSON, EMACS Lisp, Map-reduce, Halide, Graphviz dot, XML, UML, VHDL, LaTeX, Solidity, Michelson, ..

“Sub languages”: C preprocessor, type system, templates, regular expressions, list comprehensions, format strings, ..

Why so many languages?

Turing completeness as a bug

We want assurance that:

  • Search for a string will terminate. (regular expressions.)

  • Compiler runs in finite time. (context free grammars.)

  • Program respects type safety. (type system)

  • Access is granted to right people. (access control rules.)

  • Database query terminates. (SQL)

  • Web page is rendered in finite time (HTML)

Restricted computation mechanism

Want formalisms that are:

  • Expressive enough to capture our use cases.
  • Restricted enough so that we can verify semantic properties:
    • Halting: Does program halt in finite time?
    • Equivalence: Do two programs compute the same function?
    • Canonization / normalization: Find “canonical” form for program computing a function.
    • Emptiness/Zero: Does a program compute a nonzero output? (aka “safety”)
    • Finiteness: Does a program output 1 on at most a finite number of inputs? (aka “liveness”)
  • Fast evaluation (time, memory, parallelization)

Today

Two examples of a restricted formalisms: Regular Expressions and Context Free Grammars:

  • Widely used in practice
  • Extremely efficient implementation
  • Elegant theory
  • Can certify many (regular) or some (CFGs) semantic properties (halting, emptiness, finiteness, fulness, equivalence, canonization*,..)

Regular expressions

Def: Regular expression \(exp\) is one of:

  • \(0\), \(1\) or \("\,"\)
  • \((exp')(exp'')\)
  • \(exp' | exp''\)
  • \((exp')^*\).
  • (Also \(\emptyset\) )

\(\Phi_{exp}(x)=1\) iff \(exp\) matches \(x\).

Example: \((00|11)^*\) corresponds to \(\Phi(x)=1\) iff \(x_{2i+1}=x_{2i}\) for all \(i\).

Question 1

Which of the following regular expressions captures the set of strings that contain \(010\) as a substring:

a. \((010)^*\)

b. \((001100)^*\)

c. \((010)(0|1)^*\)

d. \((0|1)^*(010)(0|1)^*\)

Question 1

Which of the following regular expressions captures the set of strings that contain \(010\) as a substring:

a. \((010)^*\)

b. \((001100)^*\)

c. \((010)(0|1)^*\)

d. \((0|1)^*(010)(0|1)^*\)

Question 2

Give a regular expression for the function \(F:\{0,1\}^* \rightarrow \{0,1\}\) s.t. \(F(x)=1\) iff \(x\) does not contain \(010\) as a substring. Hint: Not containing \(010\) means every \(1\) cannot be “isolated” apart from at beginning and end.

Question 2

Give a regular expression for the function \(F:\{0,1\}^* \rightarrow \{0,1\}\) s.t. \(F(x)=1\) iff \(x\) does not contain \(010\) as a substring. Hint: Not containing \(010\) means every \(1\) cannot be “isolated” apart from at beginning and end.

Answer:

  • No isolated \(1\)’s: \(noisol= \left(\;(0^*)(11(1^*))^*(0^*)\; \right)^*\)
  • No \(010\): \(("\,"|1)\;\; noisol \;\; ("\,"|1)\) Also: \(1^*((0^*)11)^*1^*\)

Decidability of regular expressions

Thm 1: There is an algorithm \(P\) such that \(P(exp,x)=1\) iff \(\Phi_{exp}(x)=1\).

Proof: In text - directly following definition.

Thm 2: There is an algorithm \(P\) such that \(P(exp,x)=1\) iff \(\Phi_{exp}(x)=1\). Moreover, \(P\) runs in \(O(n)\) time where \(n=|exp|\). More-moreover \(P\) uses \(O(1)\) space and make single pass - deterministic finite automaton.

Heart of proof

For regexp \(exp\), \(exp[0]\) matches \(x\) if and only if \(exp\) matches \(x0\).

Example 1: \(exp=10011|100\), \(exp[0]=10\), \(exp[1]=1001\).

Example 2: \(exp=(100)*11|100\), \(exp[1]=(100)*1\), \(exp[0]=10\). \((exp[1])[0] = \emptyset\)

Question: If \(exp = (011010)^*\), what is \(exp[0]\)?

a. \((01101)^*\)
b. \((01101) | (011010)^*\)
c. \((011010)^*(01101)\)
d. \((01101)(011010)^*\)

Heart of proof

For regexp \(exp\), \(exp[0]\) matches \(x\) if and only if \(exp\) matches \(x0\).

Example 1: \(exp=10011|100\), \(exp[0]=10\), \(exp[1]=1001\).

Example 2: \(exp=(100)*11|100\), \(exp[1]=(100)*1\), \(exp[0]=10\). \((exp[1])[0] = \emptyset\)

Question: If \(exp = (011010)^*\), what is \(exp[0]\)?

a. \((01101)^*\)
b. \((01101) | (011010)^*\)
c. \((011010)^*(01101)\)
d. \((01101)(011010)^*\)

Heart of proof

For regexp \(exp\), \(exp[0]\) matches \(x\) if and only if \(exp\) matches \(x0\).

Example 1: \(exp=10011|100\), \(exp[0]=10\), \(exp[1]=1001\).

Example 2: \(exp=(100)*11|100\), \(exp[1]=(100)*1\), \(exp[0]=10\). \((exp[1])[0] = \emptyset\)

Lemma: For every regular expression \(exp\) and \(\sigma \in \{0,1\}\), can find expression \(exp[\sigma]\) that matches \(x\) iff \(exp\) matches \(x\sigma\). Moreover, \(C(exp[\sigma]) \leq C(exp)\).*

Proof by recursive algorithm in text

Linear time alg for regular expresison

Algorithm \(MATCH\):

  • Input: \(exp\), \(x\in \{0,1\}^n\).

  • Operation:

    1. If \(|x|=0\), return \(1\) iff \(exp\) contains ""

    2. Return \(MATCH(exp[x_{n-1}],x_0\cdots x_{n-1})\).

Correctness proof by induction:

\(MATCH(exp,x_0 \cdots x_{n-2}x_{n-1})\) \(= MATCH(exp[x_{n-1}],x_0\cdots x_{n-2})\) (ind. hyp.) \(= \Phi_{\exp[x_{n-1}]}(x_0\cdots x_{n-2})\) (lemma) \(=\Phi_{exp}(x_0 \cdots x_{n-2} x_{n-1})\)

\(TIME(n) = TIME(n-1) + O(1)\).

Regular expressions and automata

Thm 3: \(F\) is regular if and only if it can be computed by a one-pass constant-space algorithm (i.e., DFA).

One direction follows from memoization: turing recursive algorithm to dynamic program.

Many tools to perform this conversion.

Corollary: \(\forall\) \(exp\) \(\exists\) \(\overline{exp}\) s.t. \(\Phi_{\overline{exp}} = \neg \Phi_{exp}\). can you see why?

Corollary: \(\forall\) \(exp',exp''\), \(\exists\) \(exp\) s.t. \(\Phi_{exp} = \Phi_{exp'} \wedge \Phi_{exp''}\). can you see why?

\(a \wedge b = \overline{\overline{a} \vee \overline{b}}\)

Corollary: Can add \(\neg\) and \(\wedge\) syntactic sugar to reg exps.

Semantic properties

Thm: There is algorithm \(EQ(exp',exp'')\) that outputs \(1\) iff \(\Phi_{exp'}=\Phi_{exp''}\).

Proof: Given \(exp',exp''\), can build \(exp\) s.t. \(\Phi_{exp}(x)=1\) iff \(\Phi_{exp'}(x) \neq \Phi_{exp''}(x)\). (Can express \(\neq\) using \(\wedge,\vee,\neg\).)

Reduces to deciding emptiness: Given \(exp\), is \(\Phi_{exp} \equiv 0\).

Corollary: Decide semantic properties of finite state machines.

Non regular functions

Pumping lemma: (informal) If \(w\) matching \(exp\) is long enough can repeat substring of \(w\) arbitrarily many times and still match.

Proof idea: You can’t match a long string without using \(*\) operator.

Pumping Lemma: If \(\Phi_{exp}(w)=1\) and \(|w| \gt 2|exp|\) then \(w=xyz\) with \(|xy| \leq n_0\), \(|y| \geq 1\) and \(\Phi(xy^kz)=1\) for all \(k\in \mathbb{N}\).

Example: \(PAREN: \{ \langle, \rangle \}^* \rightarrow \{0,1\}\)

Context Free Grammars

\[ \begin{aligned} \langle letter \rangle & ::= a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z \\ \langle variable \rangle & ::= \langle letter \rangle \;\;|\;\; \langle letter \rangle \langle variable \rangle \\ \langle expression \rangle & ::= \langle variable \rangle \\ & | \hspace{7mm} \langle abstraction \rangle \\ & | \hspace{7mm} \langle application \rangle \\ \langle abstraction \rangle & ::= \lambda \langle variable \rangle . \langle expression \rangle \\ \langle application \rangle & ::= \left(\langle expression \rangle \langle expression \rangle\right) \end{aligned}\]

Question: Let \(F:\{a-z,(,),.,\)\(\} \rightarrow \{0,1\}\) be the function such that \(F(x)=1\) if \(x\) can be generated from \(\langle expression \rangle\) using grammar above. Describe \(F\) in words.

Context Free Grammars

\[ \begin{aligned} \langle letter \rangle & ::= a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z \\ \langle variable \rangle & ::= \langle letter \rangle \;\;|\;\; \langle letter \rangle \langle variable \rangle \\ \langle expression \rangle & ::= \langle variable \rangle \\ & | \hspace{7mm} \langle abstraction \rangle \\ & | \hspace{7mm} \langle application \rangle \\ \langle abstraction \rangle & ::= \lambda \langle variable \rangle . \langle expression \rangle \\ \langle application \rangle & ::= \left(\langle expression \rangle \langle expression \rangle\right) \end{aligned}\]

Question: Let \(F:\{a-z,(,),.,\)\(\} \rightarrow \{0,1\}\) be the function such that \(F(x)=1\) if \(x\) can be generated from \(\langle expression \rangle\) using grammar above. Describe \(F\) in words.

Another example

In practice: Not just decide if \(x\) is generated by \(G\), but also produce the sequence of derivations known as parse tree.

CFG’s: What you need to know

  • If \(F\) is regular then \(F\) is CFG.

  • If \(F\) is CFG then \(F\) is computable. (recursive algorithm following definition)

  • Neither of those is “if and only if”

  • Can decide emptiness for CFG.

  • Cannot decide equivalence: \(EQ(G,G')=1\) iff \(\Phi_G = \Phi_{G'}\) is uncomputable

Equivalence for CFG’s is uncomputable

Thm: Let \(EQCFG:\{0,1\}^* \rightarrow \{0,1\}\) such that

\[EQCFG(G,G') = \begin{cases}1 & \Phi_G = \Phi_{G'} \\ 0 & \text{otherwise} \end{cases}\]
then \(EQCFG\) is uncomputable.

Lemma: Let \(F(x)=1\) iff \(x = u\|v^R\) such that \(u \neq v\). Then \(F\) is CFG.

Proof:

\[\begin{aligned}\langle different \rangle & ::= 0 \langle something \rangle 1\!\!\! &&|\;\; 1 \langle something \rangle 0 && \\ \langle something \rangle &::= 0 \langle something \rangle 0 \!\!\! &&|\;\; 1 \langle something \rangle 1 \!\!\! &&|\;\; \langle different \rangle \;\;\; | \;\; \| \\ \langle start \rangle & ::= \;\langle different \rangle \!\!\! &&|\;\; 0 \langle start \rangle 0\!\!\! &&|\;\; 1 \langle start \rangle 1 & \end{aligned} \]

Equivalence for CFG’s is uncomputable

Thm: Let \(EQCFG:\{0,1\}^* \rightarrow \{0,1\}\) such that \(EQCFG(G,G') = 1\) iff \(\Phi_G = \Phi_{G'}\). Then \(EQCFG\) is uncomputable.

Proof: Let \(P\) be NAND++ program, and \(x\) be input. A computation history of \(P\) on \(x\) is the sequence \(\sigma_0,\ldots,\sigma_{t-1}\) such that \(\sigma_i\) encodes configuration of \(P(x)\) at step \(i\).

We say \(\sigma_0,\ldots, \sigma_{t-1}\) is SENSICAL if \(\alpha_0\) is starting, \(\sigma_{t-1}\) is halting and every \(\sigma_i\) has the right format (string of symbols in \(\Sigma_{small}=\{0,1\}^a\) with a single symbol in \(\Sigma_{big} = \{0,1\}^{a+b}\))

CLAIM: There is regexp to decide if \(\sigma_0,\ldots, \sigma_{t-1}\) is SENSICAL.

We say that \(\sigma_0,\ldots,\sigma_{t-1}\) is VALID if for every \(i\), \(\sigma_{i+1}^\top = NEXT_P(\sigma_i^\top)\) where \(\cdot^\top\) means that we reverse strings for odd \(i\).

CLAIM: There is context free language to decide if \(\sigma_0,\ldots,\sigma_{t-1}\) is INVALID.

\(P\) does not halt on \(x\) iff every SENSICAL configuration is INVALID.

Take away point on semantic properties

ModelHaltingEmptinessEquivalence
Regular ExpressionsDecidableDecidableDecidable
Context Free GrammarsDecidableDecidableUndecidable
Turing complete modelsUndecidableUndecidableUndecidable
Opps, you cannot play draw N guess with this browser!