Boaz Barak (instructor), Brian Sapozhnikov (Head TF), Albert Chalom, Alexis Ross, Charles O’Mara, Daniel Chen, David Li, Jambay Kinley, Jane Ahn, John Shen, Josh Seides, Laura Pierson, Pranay Tankala, Raymond Lin, Theresa Nguyen, Wanqian Yang, William Fu, Hikari Sorensen (Extension TF), Eric Lu (Patel Fellow)
Laptops/phones only on last 5 rows please.
Coding assignment Sat November 10th
Midterm details
Pset 9 due Tuesday, November 20, 2018
Office hours: go early in pset week
Will say when it’s fine to leave out details
Reductions template. (notes)
Def: \(F:\{0,1\}^* \rightarrow \{0,1\}\) is in BPP if there is a randomized polynomial-time algorithm \(A\) s.t. for every \(x\) (with \(m = poly(|x|)\) coins)
\[\mathbb{P}_{r\in \{0,1\}^m}[ A(x;r)=F(x)] \geq 2/3\]
Last time: \(2SAT \in BPP\).
Today:
\(MAXCUT_{1/2} \in BPP\).
\(3SAT \in BPTIME(O(\sqrt{3}^n))\)
Random walks
Randomness in computation
Pseudorandom generators
Suppose that \(X_0\) and \(X_1\) are i.i.d. with \(\mathbb{P}[X_i=1]=\mathbb{P}[X_i=0]=1/2\)
Let \(Y = \begin{cases}1 & X_0 \neq X_1 \\ 0 & X_0 = X_1 \end{cases}\)
What is \(\mathbb{E}[Y]\)?
a. \(1/8\)
b. \(1/4\)
c. \(1/2\)
d. \(1/3\)
Thm: \(\exists\) randomized polynomial time alg that on input \(G=(V,E)\) outputs \(S\subseteq V\) s.t. \(cut(S) \geq |E|/2\) with probability at least \(1-2^{-1000}\).
Proposition: If you ran a billion experiments every second since the birth of the universe, you would never see a \(2^{-100}\)-probability event.
Lemma 1: If \(S \subseteq V\) is chosen uniformly at random then \(\mathbb{E}[cut(S)] \geq \tfrac{|E|}{2}\).
Lemma 2: If \(S \subseteq V\) is chosen uniformly at random then \(cut(S) \geq \tfrac{|E|}{2}\) with probability at least \(1/(2|E|)\).
Lemma 2 \(\Rightarrow\) Thm: Repeat \(2000|E|\) times. Then \(\mathbb{P}[\text{fail}] \leq (1-\tfrac{1}{2|E|})^{2000|E|} \leq 2^{-1000}\)
Use \((1-\tfrac{1}{k})^k\leq \tfrac{1}{e} \sim \tfrac{1}{2.718}\)
Pf of Lemma 1: Linearity of expectation: for every edge \(\{i,j\}\), \(\mathbb{E}[CUT_{i,j}]=\mathbb{P}[x_i \neq x_j]=1/2\) (where \(x_i\) indicates \(i \in S\) and \(CUT_{i,j}\) indicates that edge \(\{i,j\}\) is cut). Summing over all \(|E|\) edges gives \(\mathbb{E}[cut(S)] = |E|/2\).
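In code, the theorem’s algorithm is just random subsets plus repetition. A minimal Python sketch (the edge-list graph representation and function names are my own, not from the lecture):

```python
import random

def cut_size(S, edges):
    """Number of edges with exactly one endpoint in S."""
    return sum(1 for (i, j) in edges if (i in S) != (j in S))

def big_cut(n, edges):
    """Repeat random cuts until one cuts >= |E|/2 edges.

    By Lemma 2 each attempt succeeds w.p. >= 1/(2|E|), so 2000|E| attempts
    all fail w.p. <= (1 - 1/(2|E|))^(2000|E|) <= 2^-1000.
    """
    for _ in range(2000 * len(edges)):
        S = {i for i in range(n) if random.randrange(2)}  # uniform subset of V
        if cut_size(S, edges) >= len(edges) / 2:
            return S
    return None  # happens with probability <= 2^-1000
```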
Lemma: If \(X \in \{ 0,1,2,\ldots, m\}\) and \(\mathbb{E}[X] \geq m/2\) then \(\mathbb{P}[X \geq m/2] \geq 1/(2m)\).
Exercise
Proof: Suppose toward contradiction that \(\mathbb{P}[X \geq \tfrac{m}{2}] \lt \tfrac{1}{2m}\). Since \(X\) is an integer, \(X \lt \tfrac{m}{2}\) implies \(X \leq \tfrac{m-1}{2}\).
Contribution left (\(X \lt \tfrac{m}{2}\)): \(\leq (1-\tfrac{1}{2m})\tfrac{m-1}{2}\). Contribution right (\(X \geq \tfrac{m}{2}\)): \(\lt \tfrac{1}{2m}\cdot m = \tfrac{1}{2}\).
Expectation \(\lt \tfrac{m-1}{2} + \tfrac{1}{2} = \tfrac{m}{2}\), contradicting \(\mathbb{E}[X] \geq \tfrac{m}{2}\)!
Input: 3CNF \(\varphi\) on \(n\) variables.
Operation:
Repeat \(T\) \(\color{green}{(=100 \cdot 3^{n/2})}\) times:
(START) Pick \(x\sim \{0,1\}^n\) and repeat \(S\) \(\color{green}{(=n/2)}\) times:
(WALK) If \(\varphi(x)=1\), halt and output \(x\); otherwise pick a clause that \(x\) does not satisfy, choose one of its three variables at random, and flip its value in \(x\).
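A hedged Python sketch of the whole procedure above (the clause encoding — literals \(+i\) for \(x_i\) and \(-i\) for \(\neg x_i\), variables numbered \(1..n\) — is my convention, not the lecture’s):

```python
import random

def solve_3sat(clauses, n):
    """Restart-and-walk 3SAT search.

    clauses: list of 3-tuples of literals, +i for x_i and -i for NOT x_i.
    Returns a satisfying assignment (dict) or None.
    """
    def satisfied(clause, x):
        return any((lit > 0) == x[abs(lit)] for lit in clause)

    T = 100 * int(3 ** (n / 2))
    for _ in range(T):                                    # (START)
        x = {i: bool(random.randrange(2)) for i in range(1, n + 1)}
        for _ in range(n // 2):                           # (WALK)
            violated = [c for c in clauses if not satisfied(c, x)]
            if not violated:
                return x                                  # x satisfies the formula
            lit = random.choice(violated[0])              # random variable of a violated clause
            x[abs(lit)] = not x[abs(lit)]                 # flip it
    return None  # nothing found; formula is probably unsatisfiable
```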
Lemma 1: In (START), fix a satisfying assignment \(x^*\) and let \(A=\{ dist(x,x^*) \leq n/2 \}\) (Hamming distance); then \(\mathbb{P}[A] \geq 1/2\).
Lemma 2: In (WALK), if not done then \(\mathbb{P}[\text{dist reduces}] \geq 1/3\), conditioned on any choice of \(x,x^*\) (the chosen clause is violated by \(x\) but satisfied by \(x^*\), so they differ on at least one of its three variables).
Analysis: Let \(W_i\) be the event that in step \(i\) of WALK we have either finished or reduced the distance by one. Then we win if \(A \cap W_1 \cap W_2 \cap \cdots \cap W_{n/2}\) occurs.
\[\mathbb{P}[A \cap W_1 \cap \cdots \cap W_{n/2}] = \mathbb{P}[A]\cdot\mathbb{P}[W_1|A]\cdot\mathbb{P}[W_2|A \cap W_1]\cdots\mathbb{P}[W_{n/2}|A \cap W_1 \cap \cdots \cap W_{n/2-1}] \geq \tfrac{1}{2}\cdot\left(\tfrac{1}{3}\right)^{n/2} \sim 3^{-n/2}\]
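Filling in the arithmetic behind the choice of \(T\) (a step the slide leaves implicit): each restart wins with probability \(\geq \tfrac{1}{2}\cdot 3^{-n/2}\), so
\[\mathbb{P}[\text{all } T=100\cdot 3^{n/2} \text{ restarts fail}] \leq \left(1-\tfrac{1}{2}\cdot 3^{-n/2}\right)^{100\cdot 3^{n/2}} \leq e^{-50},\]
again using \((1-\tfrac{1}{k})^k \leq \tfrac{1}{e}\).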
Lemma 1: For every \(x^*\), if \(x \sim \{0,1\}^n\) then \(\mathbb{P}[ dist(x,x^*) \leq n/2] \geq \tfrac{1}{2}\).
Proof: \(dist(x,x^*)\) is a sum of \(n\) i.i.d. fair coin flips, i.e. \(Bin(n,\tfrac{1}{2})\), which is symmetric about \(n/2\), so \(\mathbb{P}[dist(x,x^*) \leq n/2] \geq \tfrac{1}{2}\).
Tighter analysis: find \(p\) maximizing the product \(\mathbb{P}[ dist(x^*,x) = pn]\cdot(1/3)^{pn}\).
\(\mathbb{P}[ dist(x,x^*) = p n ] = \binom{n}{pn}2^{-n}\), and \(\binom{n}{pn} \approx 2^{H(p)n}\)
( \(H(p)=p\log(1/p)+(1-p)\log(1/(1-p))\) )
Maximize \(H(p)-1-p\log 3\).
\(p=1/4\) is the maximizer; the per-restart success probability is \((2/3)^n\), so \(O((3/2)^n)\) restarts suffice.
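A quick numeric sanity check of this maximization (a grid search I added, not from the lecture):

```python
from math import log2

def exponent(p):
    """H(p) - 1 - p*log2(3): exponent of the per-restart success probability."""
    H = p * log2(1 / p) + (1 - p) * log2(1 / (1 - p))
    return H - 1 - p * log2(3)

best = max((i / 1000 for i in range(1, 1000)), key=exponent)
print(best, 2 ** exponent(best))  # ~0.25 and ~0.6667 = 2/3
```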
“Out of fear of losing … Kleiner Perkins had just paid 25 million … for a stake in a new company. Google.com consisted of a pair of Stanford graduate students who had a piece of software that might or might not make it easier to search the Internet … With companies called Google going for $75 million, Clark did not see much point in selling air: everyone else was already doing that.”, Michael Lewis, “The New New Thing”, 1999
Thm: There is an \(O(\log n)\)-memory, \(poly(n)\)-time randomized algorithm that given an undirected graph \(G\) and vertices \(s,t\) determines whether \(t\) is connected to \(s\).
Pf: Just take a random walk from \(s\). If we don’t reach \(t\) after \(100n^4\) steps, declare “unconnected”.
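A minimal Python sketch of this walk (the adjacency-list input format is my assumption). Note that it stores only the current vertex and a step counter, which is where the \(O(\log n)\) memory bound comes from:

```python
import random

def reaches(adj, s, t):
    """Random-walk s-t connectivity for an undirected graph.

    adj: dict mapping each vertex to the list of its neighbors.
    """
    n = len(adj)
    v = s                          # O(log n) bits of state: v and the loop counter
    for _ in range(100 * n ** 4):
        if v == t:
            return True
        if not adj[v]:
            return False           # isolated vertex: cannot be connected to t
        v = random.choice(adj[v])  # one uniform step of the walk
    return False  # declare "unconnected" (wrong only with small probability)
```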
Claim: Let \(X_{i,j}\) be number of steps taken from \(i\) until we first reach \(j\). Then for \(i\neq j\), \(\mathbb{E}[X_{i,j}] \leq 2n^3 \cdot dist(i,j)\).
Pf: By induction on \(d = dist(i,j) \geq 1\) (the case \(d=0\) is trivial). Throughout we use the return time \(C=\max_i\mathbb{E}[X_{i,i}] \leq n^2\) (lemma below).
Base case: \(d=1\): If \(j \sim k\) and \(j\) has degree \(D\) then
\(\mathbb{E}[X_{j,k}] \leq \color{green}{\tfrac{1}{D}}\cdot \color{red}{1} + \color{green}{(1-\tfrac{1}{D})\tfrac{1}{D}}\color{red}{(C+1)} +\color{green}{(1-\tfrac{1}{D})^2\tfrac{1}{D}}\color{red}{(2C+1)} + \cdots\)
\(\mathbb{E}[X_{j,k}] \leq (C+1) \cdot \mathbb{E}[Geom(\tfrac{1}{D})] = (C+1)D \leq (n^2+1)n \leq 2n^3\)
Induction: If \(i \rightsquigarrow j \rightarrow k\) with \(dist(i,j)=d-1\) and \(j\) adjacent to \(k\), then \(\mathbb{E}[X_{i,k}] \leq \mathbb{E}[X_{i,j}] + \mathbb{E}[X_{j,k}] \leq 2(d-1)n^3 + 2n^3 = 2dn^3\).
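Connecting the claim to the theorem (a step left implicit above): \(dist(s,t) \leq n\), so the claim gives \(\mathbb{E}[X_{s,t}] \leq 2n^4\), and by Markov’s inequality
\[\mathbb{P}[X_{s,t} \gt 100n^4] \leq \frac{2n^4}{100n^4} = \frac{1}{50}.\]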
Lemma: For every vertex \(i\) in \(G\), \(\mathbb{E}[X_{i,i}]\) (expected time till walk from \(i\) returns) is \(\leq n^2\).
Proof: Let \(C\) be this expectation, and think of a very long walk of \(N\) steps on the graph; \(i\) will appear in \(\approx N/C\) of the steps.
So if \(D\) is the vertex at a random step of the walk, then \(\mathbb{P}[D=i] \approx \tfrac{1}{C}\).
But \(D\) must be (approximately) stationary, which means \(\mathbb{P}[D=i] = \tfrac{deg(i)}{\sum_j deg(j)} \geq 1/n^2\); hence \(C \leq n^2\).
Goal: Prove that object \(G\) exists that has property \(P\).
Approach: Set up probabilistic experiment \(\mathbb{G}\). Prove that \(\mathbb{P}_{G \sim \mathbb{G}}[G \text{ has } P]\gt0\).
Usually: \(P\) means that \(G\) satisfies condition \(1\), condition \(2\), \(\ldots\), condition \(N\).
Define \(B_i\) = “\(G\) fails to satisfy condition \(i\)”
Prove \(\mathbb{P}[B_i] \lt \tfrac{1}{100N}\)
Conclude \(\mathbb{P}[ \cup_i B_i ] \leq \sum_i \mathbb{P}[B_i] \lt N \cdot \tfrac{1}{100N} \lt 1\), so with positive probability \(G\) has \(P\).
Thm: There exists \(F:\{0,1\}^n \rightarrow \{0,1\}\) such that \(F \not\in SIZE(2^n/(10n))\).
Proof: Choose \(F\) at random. Think of its truth table as \(F \in \{0,1\}^{2^n}\)
For NAND prog \(P\), define \(B_P = \{ \text{$P$ computes $F$} \}\).
\(B_P\) holds if \(F(x)=P(x)\) for every \(x\in \{0,1\}^n\).
\(\mathbb{P}[B_P] = \left(\tfrac{1}{2}\right)^{2^n}\)
Number of NAND progs of size \(\leq S\): \(\color{red}{\lt}\; 2^{10S \log S}\)
If \(2^{2^n} \gg 2^{10 S \log S}\) then \(\mathbb{P}[\cup_P B_P] \lt \tfrac{2^{10 S \log S}}{2^{2^{n}}} \ll 1\) (union over all programs \(P\) of size \(\leq S\)).
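Plugging in \(S = 2^n/(10n)\) (a worked step, not on the slide): \(\log S \lt n\), so
\[10 S \log S \lt 10 \cdot \frac{2^n}{10n} \cdot n = 2^n, \quad\text{in fact}\quad 10S\log S = 2^n\left(1 - \frac{\log(10n)}{n}\right),\]
hence \(2^{10 S \log S} \ll 2^{2^n}\) and a random \(F\) escapes every program of that size.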
Alice: Has string \(x\in \{0,1\}^n\)
Bob: Has string \(y\in \{0,1\}^n\)
Compare if \(x=y\)?
Deterministically: Send \(x,y\) (\(n\) bits)
Probabilistically: Choose shared \(r \sim \{0,1\}^n\); Alice sends \(\sum_{i=0}^{n-1}x_i r_i \bmod 2\) and Bob compares it to \(\sum_{i=0}^{n-1}y_i r_i \bmod 2\). (one bit)
If \(x=y\) then output “equal” with probability \(1\).
If \(x \neq y\) then \(\mathbb{P}[ \sum_{i=0}^{n-1} x_i r_i \mod 2 \neq \sum_{i=0}^{n-1} y_i r_i \mod 2 ] = 1/2\)
Lemma: If \(x \neq y\) then \(\mathbb{P}[ \sum_i x_i r_i \mod 2 \neq \sum_i y_i r_i \mod 2 ]=1/2\).
Proof: Write \(S = \sum_{i=0}^{n-1} x_i r_i \bmod 2\) and \(S' = \sum_{i=0}^{n-1} y_i r_i \bmod 2\); \(S \neq S'\) is the same as \(S + S' = 1 \pmod 2\).
\(\sum_{i=0}^{n-1} x_i r_i + \sum_{i=0}^{n-1} y_i r_i = 1 \mod 2\) \(\Leftrightarrow \sum_{i=0}^{n-1} (x_i+y_i) r_i = 1 \mod 2\). Since \(x \neq y\), some coefficient \(x_i + y_i\) is odd, so this sum is a uniformly random bit over \(r\) and equals \(1\) with probability exactly \(1/2\).
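A Python sketch of the protocol, assuming Alice and Bob share the random string \(r\) (function names are mine):

```python
import random

def fingerprint(x, r):
    """One-bit fingerprint: <x, r> mod 2."""
    return sum(xi * ri for xi, ri in zip(x, r)) % 2

def probably_equal(x, y, trials=100):
    """Each trial misses a difference w.p. 1/2, so error <= 2^-100 when x != y."""
    for _ in range(trials):
        r = [random.randrange(2) for _ in x]       # fresh shared random string
        if fingerprint(x, r) != fingerprint(y, r):
            return False                           # definitely x != y
    return True  # always correct when x == y
```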
Alice: Has \(x\in \mathbb{R}^n\)
Bob: Has \(y\in \mathbb{R}^n\)
Goal: Estimate \(\|x-y\|^2 = \sum_i (x_i-y_i)^2\).
Tool: Choose \(r\sim \{ \pm 1 \}^n\), share \(\sum_i r_i x_i\), \(\sum_i r_i y_i\).
Lemma: For \(v\in \mathbb{R}^n\), \(\mathbb{E}[ (\sum_i r_i v_i )^2 ] = \sum_i v_i^2 = \|v\|^2\).
Proof similar to drunkard’s walk analysis: expand the square; the cross terms vanish since \(\mathbb{E}[r_i r_j]=0\) for \(i\neq j\), while \(\mathbb{E}[r_i^2]=1\).
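A small Python demo of the lemma (averaging several independent sign vectors to reduce the variance is my addition; the lecture only states the single-projection expectation):

```python
import random

def project(v, r):
    """Single random projection: sum_i r_i * v_i."""
    return sum(ri * vi for ri, vi in zip(r, v))

def estimate_sq_dist(x, y, k=1000):
    """Average of k i.i.d. projections; each term has expectation ||x-y||^2."""
    total = 0.0
    for _ in range(k):
        r = [random.choice((-1, 1)) for _ in x]          # shared +-1 signs
        total += (project(x, r) - project(y, r)) ** 2    # = (sum_i r_i (x_i - y_i))^2
    return total / k

x, y = [1.0, 2.0, 3.0], [0.0, 2.0, 5.0]
print(estimate_sq_dist(x, y))  # ~5.0 = ||x - y||^2
```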
Estimate properties from samples.
Hash tables
Distributed data structures
Content distribution networks
Distributed computation
Symmetry breaking
No secrets without randomness
Is BPP=P?
Thm: If \(P=NP\) then \(BPP=P\).
Proof idea: Let \(P\) be a probabilistic algorithm using \(m\) coins with success probability \(1-2^{-k}\).
Define \(Q(x) = \exists_{s_0,\ldots,s_{10m/k}} \forall_{r\in \{0,1\}^m} \bigvee_i P(x;s_i \oplus r)=1\).
Let \(R_x = \{ r \in \{0,1\}^m : P(x;r)=1 \}\).
If \(|R_x|\lt \tfrac{k}{10m}\cdot 2^m\) then the shifted copies \(s_0 \oplus R_x,\ldots,s_{10m/k}\oplus R_x\) cannot cover \(\{0,1\}^m\), so \(Q(x)=0\) (for a no-instance \(|R_x| \leq 2^{-k}\cdot 2^m\), well below this threshold).
If \(|R_x| \gt (1-2^{-k})2^m\) then for a uniformly random shift \(s\) and every \(y\in \{0,1\}^m\), \(\mathbb{P}[ y\not\in s \oplus R_x ] \lt 2^{-k}\).
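Filling in the covering calculation over the \(10m/k\) independent random shifts (a step the proof idea leaves implicit):
\[\mathbb{P}\left[\exists y \in \{0,1\}^m :\; y \notin \bigcup_i (s_i \oplus R_x)\right] \lt 2^m \cdot \left(2^{-k}\right)^{10m/k} = 2^{-9m} \ll 1,\]
so good shifts exist and \(Q(x)=1\). Finally, if \(P=NP\) then the inner \(\forall\) condition can be checked in polynomial time, and hence so can \(Q\) itself, giving \(BPP=P\).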