Boaz Barak (instructor), Brian Sapozhnikov (Head TF), Albert Chalom, Alexis Ross, Charles O’Mara, Daniel Chen, David Li, Jambay Kinley, Jane Ahn, John Shen, Josh Seides, Laura Pierson, Pranay Tankala, Raymond Lin, Theresa Nguyen, Wanqian Yang, William Fu, Hikari Sorensen (Extension TF), Eric Lu (Patel Fellow)
Laptops/phones only on last 5 rows please.
Coding assignment Sat November 10th
Midterm details
Pset 9 due Tuesday, November 20, 2018
Office hours: go early in pset week
Will say when it’s fine to leave out details
Reductions template. (notes)
Def: \(F:\{0,1\}^* \rightarrow \{0,1\}\) is in BPP if there is a randomized polynomial-time algorithm \(A\) s.t. for every \(x\) (with \(m = poly(|x|)\) coins)
\[\mathbb{P}_{r\in \{0,1\}^m}[ A(x;r)=F(x)] \geq 2/3\]
Last time: \(2SAT \in BPP\).
Today:
\(MAXCUT_{1/2} \in BPP\).
\(3SAT \in BPTIME(O(\sqrt{3}^n))\)
Random walks
Randomness in computation
Pseudorandom generators
Suppose that \(X_0\) and \(X_1\) are i.i.d. with \(\mathbb{P}[X_i=1]=\mathbb{P}[X_i=0]=1/2\)
Let \(Y = \begin{cases}1 & X_0 \neq X_1 \\ 0 & X_0 = X_1 \end{cases}\)
What is \(\mathbb{E}[Y]\)?
a. \(1/8\)
b. \(1/4\)
c. \(1/2\)
d. \(1/3\)
Thm: \(\exists\) randomized polynomial time alg that on input \(G=(V,E)\) outputs \(S\subseteq V\) s.t. \(cut(S) \geq |E|/2\) with probability at least \(1-2^{-1000}\).
Proposition: If you ran a billion experiments every second since the birth of the universe, you would never see a \(2^{-100}\)-probability event.
Lemma 1: If \(S \subseteq V\) is chosen uniformly at random then \(\mathbb{E}[cut(S)] \geq \tfrac{|E|}{2}\).
Lemma 2: If \(S \subseteq V\) is chosen uniformly at random then \(cut(S) \geq \tfrac{|E|}{2}\) with probability at least \(1/(2|E|)\).
Lemma 2 \(\Rightarrow\) Thm: Repeat \(2000|E|\) times. Then \(\mathbb{P}[\text{fail}] \leq (1-\tfrac{1}{2|E|})^{2000|E|} \leq 2^{-1000}\)
Use \((1-\tfrac{1}{k})^k\leq \tfrac{1}{e} \sim \tfrac{1}{2.718}\)
Pf of Lemma 1: Linearity of expectation: for every edge \(\{i,j\}\), \(\mathbb{E}[CUT_{i,j}]=\mathbb{P}[x_i \neq x_j]=1/2\) (where \(x_i\) indicates \(i \in S\) and \(CUT_{i,j}\) indicates that edge \(\{i,j\}\) is cut). Summing over all \(|E|\) edges gives \(\mathbb{E}[cut(S)] = |E|/2\).
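In code, the theorem’s algorithm is just random subsets plus repetition. A minimal Python sketch (the edge-list graph representation and function names are my own, not from the lecture):

```python
import random

def cut_size(S, edges):
    """Number of edges with exactly one endpoint in S."""
    return sum(1 for (i, j) in edges if (i in S) != (j in S))

def big_cut(n, edges):
    """Repeat random cuts until one cuts >= |E|/2 edges.

    By Lemma 2 each attempt succeeds w.p. >= 1/(2|E|), so 2000|E| attempts
    all fail w.p. <= (1 - 1/(2|E|))^(2000|E|) <= 2^-1000.
    """
    for _ in range(2000 * len(edges)):
        S = {i for i in range(n) if random.randrange(2)}  # uniform subset of V
        if cut_size(S, edges) >= len(edges) / 2:
            return S
    return None  # happens with probability <= 2^-1000
```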
Lemma: If \(X \in \{ 0,1,2,\ldots, m\}\) and \(\mathbb{E}[X] \geq m/2\) then \(\mathbb{P}[X \geq m/2] \geq 1/(2m)\).
Exercise
Proof: Suppose toward contradiction that \(\mathbb{P}[X \geq \tfrac{m}{2}] \lt \tfrac{1}{2m}\). Since \(X\) is an integer, \(X \lt \tfrac{m}{2}\) implies \(X \leq \tfrac{m-1}{2}\).
Contribution left (\(X \lt \tfrac{m}{2}\)): \(\leq (1-\tfrac{1}{2m})\tfrac{m-1}{2}\). Contribution right (\(X \geq \tfrac{m}{2}\)): \(\lt \tfrac{1}{2m}\cdot m = \tfrac{1}{2}\).
Expectation \(\lt \tfrac{m-1}{2} + \tfrac{1}{2} = \tfrac{m}{2}\), contradicting \(\mathbb{E}[X] \geq \tfrac{m}{2}\)!
Input: 3CNF \(\varphi\) on \(n\) variables.
Operation:
Repeat \(T\) \(\color{green}{(=100 \cdot 3^{n/2})}\) times:
(START) Pick \(x\sim \{0,1\}^n\) and repeat \(S\) \(\color{green}{(=n/2)}\) times:
(WALK) If \(\varphi(x)=1\), halt and output \(x\); otherwise pick a clause that \(x\) does not satisfy, choose one of its three variables at random, and flip its value in \(x\).
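A hedged Python sketch of the whole procedure above (the clause encoding — literals \(+i\) for \(x_i\) and \(-i\) for \(\neg x_i\), variables numbered \(1..n\) — is my convention, not the lecture’s):

```python
import random

def solve_3sat(clauses, n):
    """Restart-and-walk 3SAT search.

    clauses: list of 3-tuples of literals, +i for x_i and -i for NOT x_i.
    Returns a satisfying assignment (dict) or None.
    """
    def satisfied(clause, x):
        return any((lit > 0) == x[abs(lit)] for lit in clause)

    T = 100 * int(3 ** (n / 2))
    for _ in range(T):                                    # (START)
        x = {i: bool(random.randrange(2)) for i in range(1, n + 1)}
        for _ in range(n // 2):                           # (WALK)
            violated = [c for c in clauses if not satisfied(c, x)]
            if not violated:
                return x                                  # x satisfies the formula
            lit = random.choice(violated[0])              # random variable of a violated clause
            x[abs(lit)] = not x[abs(lit)]                 # flip it
    return None  # nothing found; formula is probably unsatisfiable
```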
Lemma 1: In (START), fix a satisfying assignment \(x^*\) and let \(A=\{ dist(x,x^*) \leq n/2 \}\) (Hamming distance); then \(\mathbb{P}[A] \geq 1/2\).
Lemma 2: In (WALK), if not done then \(\mathbb{P}[\text{dist reduces}] \geq 1/3\), conditioned on any choice of \(x,x^*\) (the chosen clause is violated by \(x\) but satisfied by \(x^*\), so they differ on at least one of its three variables).
Analysis: Let \(W_i\) be the event that in step \(i\) of WALK we have either finished or reduced the distance by one. Then we win if \(A \cap W_1 \cap W_2 \cap \cdots \cap W_{n/2}\) occurs.
\[\mathbb{P}[A \cap W_1 \cap \cdots \cap W_{n/2}] = \mathbb{P}[A]\cdot\mathbb{P}[W_1|A]\cdot\mathbb{P}[W_2|A \cap W_1]\cdots\mathbb{P}[W_{n/2}|A \cap W_1 \cap \cdots \cap W_{n/2-1}] \geq \tfrac{1}{2}\cdot\left(\tfrac{1}{3}\right)^{n/2} \sim 3^{-n/2}\]
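Filling in the arithmetic behind the choice of \(T\) (a step the slide leaves implicit): each restart wins with probability \(\geq \tfrac{1}{2}\cdot 3^{-n/2}\), so
\[\mathbb{P}[\text{all } T=100\cdot 3^{n/2} \text{ restarts fail}] \leq \left(1-\tfrac{1}{2}\cdot 3^{-n/2}\right)^{100\cdot 3^{n/2}} \leq e^{-50},\]
again using \((1-\tfrac{1}{k})^k \leq \tfrac{1}{e}\).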
Lemma 1: For every \(x^*\), if \(x \sim \{0,1\}^n\) then \(\mathbb{P}[ dist(x,x^*) \leq n/2] \geq \tfrac{1}{2}\).
Proof: \(dist(x,x^*)\) is a sum of \(n\) i.i.d. fair coin flips, i.e. \(Bin(n,\tfrac{1}{2})\), which is symmetric about \(n/2\), so \(\mathbb{P}[dist(x,x^*) \leq n/2] \geq \tfrac{1}{2}\).
Tighter analysis: find \(p\) maximizing the product \(\mathbb{P}[ dist(x^*,x) = pn]\cdot(1/3)^{pn}\).
\(\mathbb{P}[ dist(x,x^*) = p n ] = \binom{n}{pn}2^{-n}\), and \(\binom{n}{pn} \approx 2^{H(p)n}\)
( \(H(p)=p\log(1/p)+(1-p)\log(1/(1-p))\) )
Maximize \(H(p)-1-p\log 3\).
\(p=1/4\) is the maximizer; the per-restart success probability is \((2/3)^n\), so \(O((3/2)^n)\) restarts suffice.
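A quick numeric sanity check of this maximization (a grid search I added, not from the lecture):

```python
from math import log2

def exponent(p):
    """H(p) - 1 - p*log2(3): exponent of the per-restart success probability."""
    H = p * log2(1 / p) + (1 - p) * log2(1 / (1 - p))
    return H - 1 - p * log2(3)

best = max((i / 1000 for i in range(1, 1000)), key=exponent)
print(best, 2 ** exponent(best))  # ~0.25 and ~0.6667 = 2/3
```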
“Out of fear of losing … Kleiner Perkins had just paid 25 million … for a stake in a new company. Google.com consisted of a pair of Stanford graduate students who had a piece of software that might or might not make it easier to search the Internet … With companies called Google going for $75 million, Clark did not see much point in selling air: everyone else was already doing that.”, Michael Lewis, “The New New Thing”, 1999
Thm: There is an \(O(\log n)\)-memory, \(poly(n)\)-time randomized algorithm that given an undirected graph \(G\) and vertices \(s,t\) determines whether \(t\) is connected to \(s\).
Pf: Just take a random walk from \(s\). If we don’t reach \(t\) after \(100n^4\) steps, declare “unconnected”.
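A minimal Python sketch of this walk (the adjacency-list input format is my assumption). Note that it stores only the current vertex and a step counter, which is where the \(O(\log n)\) memory bound comes from:

```python
import random

def reaches(adj, s, t):
    """Random-walk s-t connectivity for an undirected graph.

    adj: dict mapping each vertex to the list of its neighbors.
    """
    n = len(adj)
    v = s                          # O(log n) bits of state: v and the loop counter
    for _ in range(100 * n ** 4):
        if v == t:
            return True
        if not adj[v]:
            return False           # isolated vertex: cannot be connected to t
        v = random.choice(adj[v])  # one uniform step of the walk
    return False  # declare "unconnected" (wrong only with small probability)
```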
Claim: Let \(X_{i,j}\) be number of steps taken from \(i\) until we first reach \(j\). Then for \(i\neq j\), \(\mathbb{E}[X_{i,j}] \leq 2n^3 \cdot dist(i,j)\).
Pf: By induction on \(d = dist(i,j) \geq 1\) (the case \(d=0\) is trivial). Throughout we use the return time \(C=\max_i\mathbb{E}[X_{i,i}] \leq n^2\) (lemma below).
Base case: \(d=1\): If \(j \sim k\) and \(j\) has degree \(D\) then
\(\mathbb{E}[X_{j,k}] \leq \color{green}{\tfrac{1}{D}}\cdot \color{red}{1} + \color{green}{(1-\tfrac{1}{D})\tfrac{1}{D}}\color{red}{(C+1)} +\color{green}{(1-\tfrac{1}{D})^2\tfrac{1}{D}}\color{red}{(2C+1)} + \cdots\)
\(\mathbb{E}[X_{j,k}] \leq (C+1) \cdot \mathbb{E}[Geom(\tfrac{1}{D})] = (C+1)D \leq (n^2+1)n \leq 2n^3\)
Induction: If \(i \rightsquigarrow j \rightarrow k\) with \(dist(i,j)=d-1\) and \(j\) adjacent to \(k\), then \(\mathbb{E}[X_{i,k}] \leq \mathbb{E}[X_{i,j}] + \mathbb{E}[X_{j,k}] \leq 2(d-1)n^3 + 2n^3 = 2dn^3\).
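Connecting the claim to the theorem (a step left implicit above): \(dist(s,t) \leq n\), so the claim gives \(\mathbb{E}[X_{s,t}] \leq 2n^4\), and by Markov’s inequality
\[\mathbb{P}[X_{s,t} \gt 100n^4] \leq \frac{2n^4}{100n^4} = \frac{1}{50}.\]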
Lemma: For every vertex \(i\) in \(G\), \(\mathbb{E}[X_{i,i}]\) (expected time till walk from \(i\) returns) is \(\leq n^2\).
Proof: Let \(C\) be this expectation, and think of a very long walk of \(N\) steps on the graph; \(i\) will appear in \(\approx N/C\) of the steps.
So if \(D\) is the vertex at a random step of the walk, then \(\mathbb{P}[D=i] \approx \tfrac{1}{C}\).
But \(D\) must be (approximately) stationary, which means \(\mathbb{P}[D=i] = \tfrac{deg(i)}{\sum_j deg(j)} \geq 1/n^2\); hence \(C \leq n^2\).
Goal: Prove that object \(G\) exists that has property \(P\).
Approach: Set up probabilistic experiment \(\mathbb{G}\). Prove that \(\mathbb{P}_{G \sim \mathbb{G}}[G \text{ has } P]\gt0\).
Usually: \(P\) means that \(G\) satisfies condition \(1\), condition \(2\), \(\ldots\), condition \(N\).
Define \(B_i\) = “\(G\) fails to satisfy condition \(i\)”
Prove \(\mathbb{P}[B_i] \lt \tfrac{1}{100N}\)
Conclude \(\mathbb{P}[ \cup_i B_i ] \leq \sum_i \mathbb{P}[B_i] \lt N \cdot \tfrac{1}{100N} \lt 1\), so with positive probability \(G\) has \(P\).
Thm: There exists \(F:\{0,1\}^n \rightarrow \{0,1\}\) such that \(F \not\in SIZE(2^n/(10n))\).
Proof: Choose \(F\) at random. Think of its truth table as \(F \in \{0,1\}^{2^n}\)
For NAND prog \(P\), define \(B_P = \{ \text{$P$ computes $F$} \}\).
\(B_P\) holds if \(F(x)=P(x)\) for every \(x\in \{0,1\}^n\).
\(\mathbb{P}[B_P] = \left(\tfrac{1}{2}\right)^{2^n}\)
Number of NAND progs of size \(\leq S\): \(\color{red}{\lt}\; 2^{10S \log S}\)
If \(2^{2^n} \gg 2^{10 S \log S}\) then \(\mathbb{P}[\cup_P B_P] \lt \tfrac{2^{10 S \log S}}{2^{2^{n}}} \ll 1\) (union over all programs \(P\) of size \(\leq S\)).
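Plugging in \(S = 2^n/(10n)\) (a worked step, not on the slide): \(\log S \lt n\), so
\[10 S \log S \lt 10 \cdot \frac{2^n}{10n} \cdot n = 2^n, \quad\text{in fact}\quad 10S\log S = 2^n\left(1 - \frac{\log(10n)}{n}\right),\]
hence \(2^{10 S \log S} \ll 2^{2^n}\) and a random \(F\) escapes every program of that size.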
Alice: Has string \(x\in \{0,1\}^n\)
Bob: Has string \(y\in \{0,1\}^n\)
Compare if \(x=y\)?
Deterministically: Send \(x,y\) (\(n\) bits)
Probabilistically: Choose shared \(r \sim \{0,1\}^n\); Alice sends \(\sum_{i=0}^{n-1}x_i r_i \bmod 2\) and Bob compares it to \(\sum_{i=0}^{n-1}y_i r_i \bmod 2\). (one bit)
If \(x=y\) then output “equal” with probability \(1\).
If \(x \neq y\) then \(\mathbb{P}[ \sum_{i=0}^{n-1} x_i r_i \mod 2 \neq \sum_{i=0}^{n-1} y_i r_i \mod 2 ] = 1/2\)
Lemma: If \(x \neq y\) then \(\mathbb{P}[ \sum_i x_i r_i \mod 2 \neq \sum_i y_i r_i \mod 2 ]=1/2\).
Proof: Write \(S = \sum_{i=0}^{n-1} x_i r_i \bmod 2\) and \(S' = \sum_{i=0}^{n-1} y_i r_i \bmod 2\); \(S \neq S'\) is the same as \(S + S' = 1 \pmod 2\).
\(\sum_{i=0}^{n-1} x_i r_i + \sum_{i=0}^{n-1} y_i r_i = 1 \mod 2\) \(\Leftrightarrow \sum_{i=0}^{n-1} (x_i+y_i) r_i = 1 \mod 2\). Since \(x \neq y\), some coefficient \(x_i + y_i\) is odd, so this sum is a uniformly random bit over \(r\) and equals \(1\) with probability exactly \(1/2\).
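A Python sketch of the protocol, assuming Alice and Bob share the random string \(r\) (function names are mine):

```python
import random

def fingerprint(x, r):
    """One-bit fingerprint: <x, r> mod 2."""
    return sum(xi * ri for xi, ri in zip(x, r)) % 2

def probably_equal(x, y, trials=100):
    """Each trial misses a difference w.p. 1/2, so error <= 2^-100 when x != y."""
    for _ in range(trials):
        r = [random.randrange(2) for _ in x]       # fresh shared random string
        if fingerprint(x, r) != fingerprint(y, r):
            return False                           # definitely x != y
    return True  # always correct when x == y
```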
Alice: Has \(x\in \mathbb{R}^n\)
Bob: Has \(y\in \mathbb{R}^n\)
Goal: Estimate \(\|x-y\|^2 = \sum_i (x_i-y_i)^2\).
Tool: Choose \(r\sim \{ \pm 1 \}^n\), share \(\sum_i r_i x_i\), \(\sum_i r_i y_i\).
Lemma: For \(v\in \mathbb{R}^n\), \(\mathbb{E}[ (\sum_i r_i v_i )^2 ] = \sum_i v_i^2 = \|v\|^2\).
Proof similar to drunkard’s walk analysis: expand the square; the cross terms vanish since \(\mathbb{E}[r_i r_j]=0\) for \(i\neq j\), while \(\mathbb{E}[r_i^2]=1\).
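A small Python demo of the lemma (averaging several independent sign vectors to reduce the variance is my addition; the lecture only states the single-projection expectation):

```python
import random

def project(v, r):
    """Single random projection: sum_i r_i * v_i."""
    return sum(ri * vi for ri, vi in zip(r, v))

def estimate_sq_dist(x, y, k=1000):
    """Average of k i.i.d. projections; each term has expectation ||x-y||^2."""
    total = 0.0
    for _ in range(k):
        r = [random.choice((-1, 1)) for _ in x]          # shared +-1 signs
        total += (project(x, r) - project(y, r)) ** 2    # = (sum_i r_i (x_i - y_i))^2
    return total / k

x, y = [1.0, 2.0, 3.0], [0.0, 2.0, 5.0]
print(estimate_sq_dist(x, y))  # ~5.0 = ||x - y||^2
```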
Estimate properties from samples.
Hash tables
Distributed data structures
Content distribution networks
Distributed computation
Symmetry breaking
No secrets without randomness
Is BPP=P?
Thm: If \(P=NP\) then \(BPP=P\).
Proof idea: Let \(P\) be a probabilistic algorithm using \(m\) coins with success probability \(1-2^{-k}\).
Define \(Q(x) = \exists_{s_0,\ldots,s_{10m/k}} \forall_{r\in \{0,1\}^m} \bigvee_i P(x;s_i \oplus r)=1\).
Let \(R_x = \{ r \in \{0,1\}^m : P(x;r)=1 \}\).
If \(|R_x|\lt \tfrac{k}{10m}\cdot 2^m\) then the shifted copies \(s_0 \oplus R_x,\ldots,s_{10m/k}\oplus R_x\) cannot cover \(\{0,1\}^m\), so \(Q(x)=0\) (for a no-instance \(|R_x| \leq 2^{-k}\cdot 2^m\), well below this threshold).
If \(|R_x| \gt (1-2^{-k})2^m\) then for a uniformly random shift \(s\) and every \(y\in \{0,1\}^m\), \(\mathbb{P}[ y\not\in s \oplus R_x ] \lt 2^{-k}\).
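Filling in the covering calculation over the \(10m/k\) independent random shifts (a step the proof idea leaves implicit):
\[\mathbb{P}\left[\exists y \in \{0,1\}^m :\; y \notin \bigcup_i (s_i \oplus R_x)\right] \lt 2^m \cdot \left(2^{-k}\right)^{10m/k} = 2^{-9m} \ll 1,\]
so good shifts exist and \(Q(x)=1\). Finally, if \(P=NP\) then the inner \(\forall\) condition can be checked in polynomial time, and hence so can \(Q\) itself, giving \(BPP=P\).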