We typically have seminars on Wednesdays at noon in Malone 228. All seminar announcements will be sent to the theory mailing list.
Speaker: Samson Zhou
Affiliation: Purdue University
Title: Password Hashing and Graph Pebbling
Abstract: Although the passwords of users are no longer being stored, we show an offline attacker is compelled to crack all stolen passwords under current security recommendations. Memory hard functions have been proposed as moderately expensive cryptographic tools for password hashing. The cryptanalysis of these functions has focused on the cumulative memory complexity and the energy complexity of the function. The first metric measures how much memory users must commit to evaluating a function, while the second metric measures how much energy users must commit. We prove these evaluations reduce to pebbling games on graphs but show that a tool for exact cryptanalysis of functions is unlikely to exist. Nevertheless, we provide asymptotic upper and lower bounds on several families of functions, including Argon2i, the winner of the password hashing competition that is currently being considered for standardization by the Cryptography Form Research Group of the Internet Research Task Force.
Joint work with Jeremiah Blocki, Ben Harsha, Ling Ren
BIO:
Samson is a PhD candidate in the Department of Computer Science at Purdue University, under the supervision of Greg Frederickson and Elena Grigorescu. He received his undergraduate education at MIT, where he obtained a Bachelor’s in math and computer science, as well as a Master’s in computer science. He is a member of the Theory Group at Purdue and a winner of the Sigma Xi Research Awards Competition for graduate students in engineering. His current research interests are sublinear and approximation algorithms, with an emphasis on streaming algorithms.
Speaker: Sai Lakshmi Bhavana Obbattu
Affiliation: IISC Bangalore, India
Title: Privacy Amplification from Non-malleable Codes
The goal of a Privacy Amplification (PA) protocol is to allow two parties, who start out sharing a non-uniform secret ‘w’, to agree on a uniform secret ‘k’, in the presence of a computationally unbounded man-in-the-middle adversary. An interactive PA protocol is rated based on three parameters: 1) Number of rounds, 2) Entropy loss (entropy of w – |k|), and 3) Min-entropy requirement for w, while the asymptotically optimal parameters are 2, O(s) and O(s+log n) respectively (where s is the security parameter and n =|w|). There have been two popular approaches to solve this problem: one using use bit authentication protocols and the other using non-malleable extractors, but none of the prior protocols using these approaches had all asymptotically optimal parameters.
We give an alternate approach to solve the problem using Non-malleable Codes (NMCs). This approach results in a 8-round protocol with min-entropy requirement O(s+log n) and an entropy loss of O(s log s). Augmented NMCs with better parameters would result in optimal entropy loss of O(s). Our result is one of the first information theoretic applications of NMCs. In this talk, I will introduce NMCs and show connection of NMCs to PA.
In a concurrent and independent work, Xin Li gives a protocol with asymptotically optimal parameters based on non-malleable extractors. Because all known approaches have large hidden constants, exploring alternatives is necessary if we hope to get practical concrete parameters
The talk is based on:
Eshan Chattopadhyay, Bhavana Kanukurthi, Sai Lakshmi Bhavana Obbattu and Sruthi Sekar. Privacy Amplification from Non-malleable Codes. (eprint.iacr.org/2018/293)
Bhavana Kanukurthi, Sai Lakshmi Bhavana Obbattu and Sruthi Sekar. Non-malleable Randomness Encoders and their Applications (Eurocrypt 2018)
Bio: Sai Lakshmi Bhavana Obbattu is a doctoral student at Indian Institute of Science(IISc), Bangalore, advised by Dr. Bhavana Kanukurthi. Her publication venues include the Theory of Cryptography Conference (TCC) and Eurocrypt. Her TCC publication on Four-state Non-malleable Codes was invited to the Journal of Cryptology. She received her Integrated Dual Degree (B.Tech and M.Tech) from IIT(BHU), Varanasi. Her research interests include Non-malleable codes, Privacy Amplification and Applied Multi-party computation.
Speaker: Akash Kumar
Affiliation: Purdue University
Location: Malone 338 (note change of location)
Title:
Finding Minors in Sublinear time in Bounded degree graphs with (almost) optimal one-sided query complexity.
Abstract:
Let G be an undirected, bounded degree graph with n vertices. Fix a finite graph H, and suppose one must remove \varepsilon n edges from G to make it H-minor free (for some small constant \varepsilon > 0). We give an n^{1/2+o(1)}-time randomized procedure that, with high probability, finds an H-minor in such a graph. For an example application, suppose one must remove \varepsilon n edges from a bounded degree graph G to make it planar. This result implies an algorithm, with the same running time, that produces a K_{3,3} or K_5 minor in G. No sublinear time bound was known for this problem, prior to this result.
By the graph minor theorem, we get an analogous result for any minor-closed property. Up to n^{o(1)} factors, this resolves a conjecture of Benjamini-Schramm-Shapira (STOC 2008) on the existence of one-sided property testers for minor-closed properties. Furthermore, our algorithm is nearly optimal, by an \Omega(\sqrt{n}) lower bound of Czumaj et al (RSA 2014).
Prior to this work, the only graphs H for which non-trivial property testers were known for H-minor freeness are the following: H being a forest or a cycle (Czumaj et al, RSA 2014), K_{2,k}, (k\times 2)-grid, and the k-circus (Fichtenberger et al, Arxiv 2017).
(Joint work with C. Seshadhri and Andrew Stolman).
Welcome and Introductions
Speaker: Zhengzhong Jin
Affiliation: JHU
Title: Deterministic Document Exchange Protocols, and Almost Optimal Binary Codes for Edit Errors
Abstract:
We study two basic problems regarding edit error, i.e. document exchange and error correcting codes for edit errors (insdel codes). For message length n and edit error upper bound k, it is known that in both problems the optimal sketch size or the optimal number of redundant bits is Θ(k log n/k). However, known constructions are far from achieving these bounds.
We significantly improve previous results on both problems. For document exchange, we give an efficient deterministic protocol with sketch size O(k log^2 n/k). This significantly improves the previous best known deterministic protocol, which has sketch size O(k^2+k log^2 n). For binary insdel codes, we obtain the following results:
1. An explicit binary insdel code which encodes an n-bit message x against k errors with redundancy O(k log^2 n/k). In particular this implies an explicit family of binary insdel codes that can correct ε fraction of insertions and deletions with rate 1−O(ε log^2(1/ε))=1−\tilde {O}(ε).
2. An explicit binary insdel code which encodes an n-bit message x against k errors with redundancy O(k log n). This is the first explicit construction of binary insdel codes that has optimal redundancy for a wide range of error parameters k, and this brings our understanding of binary insdel codes much closer to that of standard binary error correcting codes.
In obtaining our results we introduce the notion of ε-self matching hash functions and ε-synchronization hash functions. We believe our techniques can have further applications in the literature.
Speaker: Marius Zimand
Affiliation: Towson University
Title: An operational characterization of mutual information in algorithmic information theory
Abstract: An operational interpretation of the concept of mutual information in the framework of Kolmogorov complexity has been elusive till now. We show that the mutual information of any pair of strings x and y is equal, up to logarithmic precision, to the length of the longest shared secret key that two parties, one having x and the complexity profile of the pair and the other one having y and the complexity profile of the pair, can establish via a probabilistic protocol with interaction on a public channel. We establish the communication complexity of secret key agreement protocols that produce a secret key of maximal length, for protocols with public randomness. We show that if the communication complexity drops below the established threshold then only very short secret keys can be obtained.
This is joint work with Andrei Romashchenko.
Speaker: Yasamin Nazari
Affiliation: JHU
Title: Distributed Distance-Bounded Network Design Through Distributed Convex Programming
Abstract:
Solving linear programs is often a challenging task in distributed settings. While there are good algorithms for solving packing and covering linear programs in a distributed manner, this is essentially the only class of linear programs for which such an algorithm is known. In this work we provide a distributed algorithm for solving a different class of convex programs which we call “distance-bounded network design convex programs”. These can be thought of as relaxations of network design problems in which the connectivity requirement includes a distance constraint (most notably, graph spanners). Our algorithm runs in O((D/ϵ)logn) rounds in the LOCAL model and finds a (1+ϵ)-approximation to the optimal LP solution for any 0<ϵ≤1, where D is the largest distance constraint. While solving linear programs in a distributed setting is interesting in its own right, this class of convex programs is particularly important because solving them is often a crucial step when designing approximation algorithms. Hence we almost immediately obtain new and improved distributed approximation algorithms for a variety of network design problems, including Basic 3- and 4-Spanner, Directed k-Spanner, Lowest Degree k-Spanner, and Shallow-Light Steiner Network Design with a spanning demand graph. Our algorithms do not require any "heavy" computation and essentially match the best-known centralized approximation algorithms, while previous approaches which do not use heavy computation give approximations which are worse than the best-known centralized bounds.
Speaker: Karthik Abinav Sankararaman
Affiliation: University of Maryland
Title: Adversarial Bandits with Knapsacks
Abstract: In this talk we will discuss the multi-armed bandits problem with resource constraints under the adversarial setting. In this problem, we have an interactive and repeated game between the algorithm and an adversary. Given T time-steps, d resources, m actions and budgets B1, B2, .. Bd, the algorithm chooses one of the m actions at each time-step. An adversary then reveals a reward and consumption for each of the d resources corresponding to this action. The time-step at which the algorithm runs out of the d resources (i.e., the total consumption for resource j > Bj), the game stops and the total reward is the sum of rewards obtained until the stopping time. The goal is to maximize the competitive ratio; the ratio of the total reward of the algorithm to the expected reward of a fixed distribution that knows all the rewards and consumption ahead of time. We give an algorithm for this problem whose competitive ratio is tight (matches the lower-bound). Moreover the algorithmic tools extends in an (almost) black-box fashion to also give an algorithm for the stochastic setting thus giving a “best-of-both-worlds” algorithm where the algorithm need not know a-priori if the input is adversarial or i.i.d. Finally we conclude with applications and special cases including the Dynamic Pricing problem.
This talk is based on a recent working paper with Nicole Immorlica, Rob Schapire and Alex Slivkins.
Speaker: Nithin Varma
Affiliation: Boston University
Title: Separating erasures and errors in property testing using local list decoding
Abstract:
Corruption in data can be in the form of erasures (missing data) or errors (wrong data). Erasure-resilient property testing (Dixit, Raskhodnikova, Thakurta, Varma ’16) and tolerant property testing (Parnas, Ron, Rubinfeld ’06) are two formal models of sublinear algorithms that account for the presence of erasures and errors in input data, respectively.
We first show that there exists a property P that has an erasure-resilient tester whose query complexity is independent of the input size n, whereas every tolerant tester for P has query complexity that depends on n. We obtain this result by designing a local list decoder for the Hadamard code that works in the presence of erasures, thereby proving an analog of the famous Goldreich-Levin Theorem. We also show a strengthened separation by proving that there exists another property R such that R has a constant-query erasure-resilient tester, whereas every tolerant tester for R requires n^{Omega(1)} queries. The main tool used in proving the strengthened separation is an approximate variant of a locally list decodable code that works against erasures.
Joint work with Sofya Raskhodnikova and Noga Ron-Zewi.
Speaker: Jalaj Upadhyay
Affiliation: JHU
Title: Differentially Private Spectral Sparsification of Graphs
Abstract:
In this talk, we will discuss differentially private spectral sparsification of graphs. We argue that traditional spectral sparsification where the output graph is a subgraph of the input graph is not possible with differential privacy. This motivates us to define a relaxed version of spectral sparsification of graphs.
We consider edge-level privacy, i.e., neighboring graphs differs in one edge with weight one. We give efficient $(\alpha,\beta)$-differentially private algorithms that, on input a dense graph G, construct a spectral sparsification of G. Our output graphs has $ O(n/\eps^2)$ weighted edges, which matches the best known non-private algorithms.
We can use our private sparse graph to solve various combinatorial and learning problems on graphs efficiently while preserving differential privacy. Some examples include all possible cut queries, Max-Cut, Sparse-Cut, Edge-Expansion, Laplacian eigenmaps, etc.
This talk is based on a joint work with Raman Arora and Vladimir Braverman.
Speaker: Ke Wu
Affiliation: Johns Hopkins University
Title: Synchronization Strings: Efficient and Fast Deterministic Constructions over Small Alphabets
Abstract:
Synchronization strings are recently introduced by Haeupler and Shahrasbi (STOC 2017) in the study of codes for correcting insertion and deletion errors (insdel codes). They showed that for any parameter ε>0, synchronization strings of arbitrary length exist over an alphabet whose size depends only on ε. Specifically, they obtained an alphabet size of O(ε^{−4}), which left an open question on where the minimal size of such alphabets lies between Ω(ε^{1}) and O(ε^{−4}). In this work, we partially bridge this gap by providing an improved lower bound of Ω(ε^{−3/2}), and an improved upper bound of O(ε^{−2}). We also provide fast explicit constructions of synchronization strings over small alphabets.
Further, along the lines of previous work on similar combinatorial objects, we study the extremal question of the smallest possible alphabet size over which synchronization strings can exist for some constant ε<1. We show that one can construct ε-synchronization strings over alphabets of size four while no such string exists over binary alphabets. This reduces the extremal question to whether synchronization strings exist over ternary alphabets.
Speaker: Sami Davies
Affiliation: University of Washington
Title: A Tale of Santa Claus, Hypergraphs, and Matroids
Abstract:
A well-known problem in scheduling and approximation algorithms is the Santa Claus problem. Suppose that Santa Claus has a set of gifts, and he wants to distribute them among a set of children so that the least happy child is made as happy as possible. Here, the value that a child i has for a present j is of the form p_{ij} \in \{0,p_j\}. A polynomial time algorithm by Annamalai et al. gives a 12.33-approximation algorithm and is based on a modification of Haxell’s hypergraph matching argument.
In this paper, we introduce a matroid version of the Santa Claus problem. Our algorithm is also based on Haxell’s augmentation tree, but with the introduction of the matroid structure we solve a more general problem with cleaner methods. Our result can then be used as a blackbox to obtain a (4 +\varepsilon)-approximation for Santa Claus. This factor also compares against a natural, compact LP for Santa Claus.
Speaker: Jalaj Upadhyay
Affiliation: Johns Hopkins Universit
Title: Towards Robust and Scalable Private Data Analysis
Abstract:
In the current age of big data, we are constantly creating new data which is analyzed by various platforms to improve service and user’s experience. Given the sensitive and confidential nature of these data, there are obvious security and privacy concerns while storing and analyzing such data. In this talk, I will discuss the fundamental challenges in providing robust security and privacy guarantee while storing and analyzing large data. I will also give a brief overview of my contributions and future plans towards addressing these challenges.
To give a glimpse of these challenges in providing a robust privacy guarantee known as differential privacy, I will use spectral sparsification of graphs as an example. Given the ubiquitous nature of graphs, differentially private analysis on graphs has gained a lot of interest. However, existing algorithms for these analyses are tailored made for the task at hand making them infeasible in practice. In this talk, I will present a novel differentially private algorithm that outputs a spectral sparsification of the input graph. At the core of this algorithm is a method to privately estimate the importance of an edge in the graph. Prior to this work, there was no known privacy preserving method that provides such an estimate or spectral sparsification of graphs.
Since many graph properties are defined by the spectrum of the graph, this work has many analytical as well as learning theoretic applications. To demonstrate some applications, I will show more efficient and accurate analysis of various combinatorial problems on graphs and the first technique to perform privacy preserving manifold learning on graphs.
Speaker: Martin Farach-Colton
Affiliation: Rutgers University
Title: TBA
Abstract: TBA
Speaker: Xue Chen
Affiliation: Northwestern University
Title: Active Regression via Linear-Sample Sparsification
Abstract:
E[||X \wt{\beta} – X\beta^*||_2^2] \leq \eps ||X \beta^* – y||_2^2.
This improves on the best previous result of O(d \log d + d/\eps) from leverage score sampling. We also present results for the inductive setting, showing when \wt{\beta} will generalize to fresh samples; these apply to continuous settings such as polynomial regression. Finally, we show how the techniques yield improved results for the non-linear sparse Fourier transform setting.
Bio: Xue Chen is broadly interested in randomized algorithms and the use of randomness in computation. Specific areas include Fourier transform, learning theory and optimization, and pseudorandomness. He obtained his Ph.D. at the University of Texas at Austin, under the supervision of David Zuckerman. Currently, he is a postdoctoral fellow in Northwestern University.
Speaker: Rediet Abebe
Affiliation: Cornell University
Title: Using Search Queries to Understand Health Information Needs in Africa
Abstract:
Access to healthcare and health information is of major global
concern. The stark inequality in the availability of health data by
country, demographic groups, and socioeconomic status impedes the
identification of major public health concerns and implementation of
effective interventions. This data gap ranges from basic disease
statistics, such as disease prevalence rates, to more nuanced
information, such as public attitudes. A key challenge is
understanding health information needs of under-served and
marginalized communities. Without understanding people’s everyday
needs, concerns, and misconceptions, health organizations lack the
ability to effectively target education and programming efforts.
In this presentation, we focus on the lack of comprehensive,
high-quality data about information needs of individuals in developing
nations. We propose an approach that uses search data to uncover
health information needs of individuals in all 54 nations in Africa.
We analyze Bing searches related to HIV/AIDS, malaria, and
tuberculosis; these searches reveal diverse health information needs
that vary by demographic groups and geographic regions. We also shed
light on discrepancies in the quality of content returned by search
engines.
We conclude with a discussion on computationally-informed
interventions both on- and off-line in health and related domains and
the Mechanism Design for Social Good research initiative.
Bio:
Rediet Abebe is a computer scientist with a strong interest in the
promotion of equality and justice. Her research focuses on algorithms,
AI, and their applications to social good. As part of this research
agenda, she co-founded and co-organizes Mechanism Design for Social
Good (MD4SG), an interdisciplinary, multi-institutional research
initiative with over 300 individuals. She is also a co-founder and
co-organizer of Black in AI, an international network of over 1000
individuals focused on increasing the presence and inclusion of Black
and African researchers in AI. Her research is deeply influenced by
her upbringing in her hometown of Addis Ababa, Ethiopia, where she
lived until moving to the U.S. in 2009. Her work has been generously
supported by fellowships and scholarships through Facebook, Google,
the Cornell Graduate School, and the Harvard-Cambridge Fellowship.
Speaker: Grigory Yaroslavtsev
Affiliation: Indiana University, Bloomington
Title: Advances in Hierarchical Clustering for Vector Data
Abstract:
Compared to the highly successful flat clustering (e.g. k-means), despite its important role and applications in data analysis, hierarchical clustering has been lacking in rigorous algorithmic studies until late due to absence of rigorous objectives. Since 2016, a sequence of works has emerged and gave novel algorithms for this problem in the general metric setting. This was enabled by a breakthrough by Dasgupta, who introduced a formal objective into the study of hierarchical clustering.
In this talk I will give an overview of our recent progress on models and scalable algorithms for hierarchical clustering applicable specifically to high-dimensional vector data. I will first discuss various linkage-based algorithms (single-linkage, average-linkage) and their formal properties with respect to various objectives. I will then introduce a new projection-based approximation algorithm for vector data. The talk will be self-contained and doesn’t assume prior knowledge of clustering methods.
Based on joint works with Vadapalli (ICML’18) and Charikar, Chatziafratis and Niazadeh (AISTATS’19)
Speaker: Arka Rai Choudhury
Affiliation: Johns Hopkins University
Title: Finding a Nash Equilibrium is No Easier than Breaking Fiat-Shamir
Abstract:
The Fiat-Shamir heuristic transforms a public-coin interactive proof into a non-interactive argument, by replacing the verifier with a cryptographic hash function that is applied to the protocol’s transcript. Constructing hash functions for which this transformation is sound is a central and long-standing open question in cryptography.
We show that solving the END-OF-METERED-LINE problem is no easier than breaking the soundness of the Fiat-Shamir transformation when applied to the sumcheck protocol. In particular, if the transformed protocol is sound, then any hard problem in #P gives rise to a hard distribution in the class CLS, which is contained in PPAD. Our result opens up the possibility of sampling moderately-sized games for which it is hard to find a Nash equilibrium, by reducing the inversion of appropriately chosen one-way functions to #SAT.
Our main technical contribution is a stateful incrementally verifiable procedure that, given a SAT instance over n variables, counts the number of satisfying assignments. This is accomplished via an exponential sequence of small steps, each computable in time poly(n). Incremental verifiability means that each intermediate state includes a sumcheck-based proof of its correctness, and the proof can be updated and verified in time poly(n).
Joint work with Pavel Hubacek, Chethan Kamath, Krzysztof Pietrzak, Alon Rosen, and Guy Rothblum.
Speaker: Amitabh Basu
Affiliation: JHU
Title: Admissibility of solution estimators for stochastic optimization
Abstract:
We look at stochastic optimization problems through the lens of statistical decision theory. In particular, we address admissibility, in the statistical decision theory sense, of the natural sample average estimator for a stochastic optimization problem (which is also known as the empirical risk minimization (ERM) rule in learning literature). It is well known that for general stochastic optimization problems, the sample average estimator may not be admissible. This is known as Stein’s paradox in the statistics literature. We show in this paper that for optimizing stochastic linear functions over compact sets, the sample average estimator is admissible.