-
Tight Space Lower Bound for Pseudo-Deterministic Approximate Counting
Authors:
Ofer Grossman,
Meghal Gupta,
Mark Sellke
Abstract:
We investigate one of the most basic problems in streaming algorithms: approximating the number of elements in the stream. In 1978, Morris famously gave a randomized algorithm achieving a constant-factor approximation error for streams of length at most N in space $O(\log \log N)$. We investigate the pseudo-deterministic complexity of the problem and prove a tight $Ω(\log N)$ lower bound, thus resolving a problem of Goldwasser-Grossman-Mohanty-Woodruff.
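For orientation, the randomized baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration of the textbook base-2 Morris counter, not code from the paper; the class name `MorrisCounter` and the helper `run` are hypothetical. The counter stores only an exponent `x`, which needs about $\log \log N$ bits, and reports $2^x - 1$ as its estimate.

```python
import random


class MorrisCounter:
    """Textbook base-2 Morris counter (illustrative sketch, names are assumptions)."""

    def __init__(self, rng=None):
        self.x = 0  # only this small exponent is stored
        self.rng = rng or random.Random()

    def increment(self):
        # Bump the exponent with probability 2^(-x).
        if self.rng.random() < 2.0 ** (-self.x):
            self.x += 1

    def estimate(self):
        # 2^x - 1 is an unbiased estimate of the number of increments.
        return 2 ** self.x - 1


def run(n, seed):
    """Feed n stream elements to a fresh counter seeded with `seed`."""
    counter = MorrisCounter(random.Random(seed))
    for _ in range(n):
        counter.increment()
    return counter.estimate()
```

Running `run(n, seed)` with different seeds on the same `n` typically returns different powers of two minus one, so this counter is not pseudo-deterministic; the abstract's lower bound concerns algorithms that must return the same value across runs with high probability.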
Submitted 5 July, 2023; v1 submitted 3 April, 2023;
originally announced April 2023.
-
Beyond Alice and Bob: Improved Inapproximability for Maximum Independent Set in CONGEST
Authors:
Yuval Efron,
Ofer Grossman,
Seri Khoury
Abstract:
By far the most fruitful technique for showing lower bounds for the CONGEST model is reductions to two-party communication complexity. This technique has yielded nearly tight results for various fundamental problems such as distance computations, minimum spanning tree, minimum vertex cover, and more.
In this work, we take this technique a step further, and we introduce a framework of reductions to $t$-party communication complexity, for every $t\geq 2$. Our framework enables us to show improved hardness results for maximum independent set. Recently, Bachrach et al. [PODC 2019] used the two-party framework to show hardness of approximation for maximum independent set. They show that finding a $(5/6+ε)$-approximation requires $Ω(n/\log^6 n)$ rounds, and finding a $(7/8+ε)$-approximation requires $Ω(n^2/\log^7 n)$ rounds, in the CONGEST model where $n$ is the number of nodes in the network.
We improve the results of Bachrach et al. by using reductions to multi-party communication complexity. Our results:
(1) Any algorithm that finds a $(1/2+ε)$-approximation for maximum independent set in the CONGEST model requires $Ω(n/\log^3 n)$ rounds.
(2) Any algorithm that finds a $(3/4+ε)$-approximation for maximum independent set in the CONGEST model requires $Ω(n^2/\log^3 n)$ rounds.
Submitted 27 May, 2020; v1 submitted 16 March, 2020;
originally announced March 2020.
-
Pseudo-deterministic Streaming
Authors:
Shafi Goldwasser,
Ofer Grossman,
Sidhanth Mohanty,
David P. Woodruff
Abstract:
A pseudo-deterministic algorithm is a (randomized) algorithm which, when run multiple times on the same input, with high probability outputs the same result on all executions. Classic streaming algorithms, such as those for finding heavy hitters, approximate counting, $\ell_2$ approximation, and finding a nonzero entry in a vector (for turnstile algorithms), are not pseudo-deterministic. For example, in the instance of finding a nonzero entry in a vector, for any known low-space algorithm $A$, there exists a stream $x$ so that running $A$ twice on $x$ (using different randomness) would with high probability result in two different entries as the output.
In this work, we study whether it is inherent that these algorithms output different values on different executions. That is, we ask whether these problems have low-memory pseudo-deterministic algorithms. For instance, we show that there is no low-memory pseudo-deterministic algorithm for finding a nonzero entry in a vector (given in a turnstile fashion), and also that there is no low-dimensional pseudo-deterministic sketching algorithm for $\ell_2$ norm estimation. We also exhibit problems which do have low memory pseudo-deterministic algorithms but no low memory deterministic algorithm, such as outputting a nonzero row of a matrix, or outputting a basis for the row-span of a matrix.
We also investigate multi-pseudo-deterministic algorithms: algorithms which with high probability output one of a few options. We show the first lower bounds for such algorithms. This implies that there are streaming problems such that every low space algorithm for the problem must have inputs where there are many valid outputs, all with a significant probability of being outputted.
Submitted 26 November, 2019;
originally announced November 2019.
-
Strategy-Stealing is Non-Constructive
Authors:
Greg Bodwin,
Ofer Grossman
Abstract:
In many combinatorial games, one can prove that the first player wins under best play using a simple but non-constructive argument called strategy-stealing. This work is about the complexity behind these proofs: how hard is it to actually find a winning move in a game, when you know by strategy-stealing that one exists? We prove that this problem is PSPACE-hard already for Minimum Poset Games and Symmetric Maker-Maker Games, which are simple classes of games that capture two of the main types of strategy-stealing arguments in the current literature.
Submitted 15 November, 2019;
originally announced November 2019.
-
Broadcast Congested Clique: Planted Cliques and Pseudorandom Generators
Authors:
Lijie Chen,
Ofer Grossman
Abstract:
We develop techniques to prove lower bounds for the BCAST(log n) Broadcast Congested Clique model (a distributed message passing model where in each round, each processor can broadcast an O(log n)-sized message to all other processors). Our techniques are built to prove bounds for natural input distributions. So far, all lower bounds for problems in the model relied on constructing specifically tailored graph families for the specific problem at hand, resulting in lower bounds for artificially constructed inputs, instead of natural input distributions.
One of our results is a lower bound for the directed planted clique problem. In this problem, an input graph is either a random directed graph (each directed edge is included with probability 1/2), or a random graph with a planted clique of size k. That is, k randomly chosen vertices have all of the edges between them included, and all other edges in the graph appear with probability 1/2. The goal is to determine whether a clique exists. We show that when k = n^(1/4 - eps), this problem requires a number of rounds polynomial in n.
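The two input distributions described above can be sampled with a short sketch like the one below. It only illustrates the inputs, not the Broadcast Congested Clique model or the lower-bound argument; the function name `directed_planted_clique` and the adjacency-matrix representation are assumptions made for this example.

```python
import random


def directed_planted_clique(n, k, planted, rng):
    """Sample a directed graph on n vertices (hypothetical helper).

    Returns (adj, clique), where adj[u][v] is True when the directed edge
    u -> v is present and clique is the sorted list of planted vertices
    (empty when planted is False).
    """
    adj = [[False] * n for _ in range(n)]
    # Every directed edge is included independently with probability 1/2.
    for u in range(n):
        for v in range(n):
            if u != v:
                adj[u][v] = rng.random() < 0.5

    clique = []
    if planted:
        # Choose k random vertices and force all edges between them.
        clique = sorted(rng.sample(range(n), k))
        for u in clique:
            for v in clique:
                if u != v:
                    adj[u][v] = True
    return adj, clique
```

A call such as `directed_planted_clique(30, 5, True, random.Random(0))` yields one sample from the planted distribution; passing `planted=False` yields a purely random directed graph. The detection task is to tell these two cases apart.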
Additionally, we construct a pseudo-random generator which fools the Broadcast Congested Clique. This allows us to show that every k round randomized algorithm in which each processor uses up to n random bits can be efficiently transformed into an O(k)-round randomized algorithm in which each processor uses only up to O(k log n) random bits, while maintaining a high success probability. The pseudo-random generator is simple to describe, computationally very cheap, and its seed size is optimal up to constant factors. However, the analysis is quite involved, and is based on the new technique for proving lower bounds in the model.
The technique also allows us to prove the first average case lower bound for the Broadcast Congested Clique, as well as an average-case time hierarchy.
Submitted 19 May, 2019;
originally announced May 2019.
-
Algorithms for Noisy Broadcast under Erasures
Authors:
Ofer Grossman,
Bernhard Haeupler,
Sidhanth Mohanty
Abstract:
The noisy broadcast model was first studied in [Gallager, TranInf'88] where an $n$-character input is distributed among $n$ processors, so that each processor receives one input bit. Computation proceeds in rounds, where in each round each processor broadcasts a single character, and each reception is corrupted independently at random with some probability $p$. [Gallager, TranInf'88] gave an algorithm for all processors to learn the input in $O(\log\log n)$ rounds with high probability. Later, a matching lower bound of $Ω(\log\log n)$ was given in [Goyal, Kindler, Saks; SICOMP'08].
We study a relaxed version of this model where each reception is erased and replaced with a `?' independently with probability $p$. In this relaxed model, we break past the lower bound of [Goyal, Kindler, Saks; SICOMP'08] and obtain an $O(\log^* n)$-round algorithm for all processors to learn the input with high probability. We also show an $O(1)$-round algorithm for the same problem when the alphabet size is $Ω(\mathrm{poly}(n))$.
Submitted 2 August, 2018;
originally announced August 2018.
-
Reproducibility and Pseudo-Determinism in Log-Space
Authors:
Ofer Grossman,
Yang P. Liu
Abstract:
A curious property of randomized log-space search algorithms is that their outputs are often longer than their workspace. This leads to the question: how can we reproduce the results of a randomized log space computation without storing the output or randomness verbatim? Running the algorithm again with new random bits may result in a new (and potentially different) output.
We show that every problem in search-RL has a randomized log-space algorithm where the output can be reproduced. Specifically, we show that for every problem in search-RL, there is a pair of log-space randomized algorithms A and B where for every input x, A will output some string t_x of size O(log n), such that B when running on (x, t_x) will be pseudo-deterministic: that is, running B multiple times on the same input (x, t_x) will result in the same output on all executions with high probability. Thus, by storing only O(log n) bits in memory, it is possible to reproduce the output of a randomized log-space algorithm.
An algorithm is reproducible without storing any bits in memory (i.e., |t_x|=0) if and only if it is pseudo-deterministic. We show pseudo-deterministic algorithms for finding paths in undirected graphs and Eulerian graphs using logarithmic space. Our algorithms are substantially faster than the best known deterministic algorithms for finding paths in such graphs in log-space.
The algorithm for search-RL has the additional property that its output, when viewed as a random variable depending on the randomness used by the algorithm, has entropy O(log n).
Submitted 11 March, 2018;
originally announced March 2018.
-
Improved Deterministic Distributed Construction of Spanners
Authors:
Ofer Grossman,
Merav Parter
Abstract:
Graph spanners are fundamental graph structures with a wide range of applications in distributed networks. We consider a standard synchronous message passing model where in each round $O(\log n)$ bits can be transmitted over every edge (the CONGEST model).
The state of the art of deterministic distributed spanner constructions suffers from large messages. The only exception is the work of Derbel et al. '10, which computes an optimal-sized $(2k-1)$-spanner but uses $O(n^{1-1/k})$ rounds.
In this paper, we significantly improve this bound. We present a deterministic distributed algorithm that given an unweighted $n$-vertex graph $G = (V, E)$ and a parameter $k > 2$, constructs a $(2k-1)$-spanner with $O(k \cdot n^{1+1/k})$ edges within $O(2^{k} \cdot n^{1/2 - 1/k})$ rounds for every even $k$. For odd $k$, the number of rounds is $O(2^{k} \cdot n^{1/2 - 1/(2k)})$. For the weighted case, we provide the first deterministic construction of a $3$-spanner with $O(n^{3/2})$ edges that uses $O(\log n)$-size messages and $\widetilde{O}(1)$ rounds. If the nodes have IDs in $[1, Θ(n)]$, then the algorithm works in only $2$ rounds!
Submitted 12 August, 2017; v1 submitted 3 August, 2017;
originally announced August 2017.
-
Pseudo-deterministic Proofs
Authors:
Shafi Goldwasser,
Ofer Grossman,
Dhiraj Holden
Abstract:
We introduce pseudo-deterministic interactive proofs (psdAM): interactive proof systems for search problems where the verifier is guaranteed with high probability to output the same output on different executions. As in the case with classical interactive proofs, the verifier is a probabilistic polynomial time algorithm interacting with an untrusted powerful prover.
We view pseudo-deterministic interactive proofs as an extension of the study of pseudo-deterministic randomized polynomial time algorithms: the goal of the latter is to find canonical solutions to search problems whereas the goal of the former is to prove that a solution to a search problem is canonical to a probabilistic polynomial time verifier. Alternatively, one may think of the powerful prover as aiding the probabilistic polynomial time verifier to find canonical solutions to search problems, with high probability over the randomness of the verifier. The challenge is that pseudo-determinism should hold not only with respect to the randomness, but also with respect to the prover: a malicious prover should not be able to cause the verifier to output a solution other than the unique canonical one.
Submitted 14 June, 2017;
originally announced June 2017.
-
Amplification and Derandomization Without Slowdown
Authors:
Ofer Grossman,
Dana Moshkovitz
Abstract:
We present techniques for decreasing the error probability of randomized algorithms and for converting randomized algorithms to deterministic (non-uniform) algorithms. Unlike most existing techniques that involve repetition of the randomized algorithm and hence a slowdown, our techniques produce algorithms with a similar run-time to the original randomized algorithms. The amplification technique is related to a certain stochastic multi-armed bandit problem. The derandomization technique - which is the main contribution of this work - points to an intriguing connection between derandomization and sketching/sparsification.
We demonstrate the techniques by showing applications to Max-Cut on dense graphs, approximate clique on graphs that contain a large clique, constraint satisfaction problems on dense bipartite graphs and the list decoding to unique decoding problem for the Reed-Muller code.
Submitted 27 September, 2015;
originally announced September 2015.