Today, computer systems need to cope with the explosive growth of data in the world. For instance, in data-center networks, monitoring systems are used to measure traffic statistics at high speed; and in financial technology companies, distributed processing systems are deployed to support graph analytics. To fulfill the requirements of handling such large datasets, we build efficient networked systems in a distributed manner most of the time. Ideally, we expect the systems to meet service-level objectives (SLOs) using least amount of resource. However, existing systems constructed with conventional in-memory algorithms face the following challenges: (1) excessive resource requirements (e.g., CPU, ASIC, and memory) with high cost; (2) infeasibility in a larger scale; (3) processing the data too slowly to meet the objectives.
To address these challenges, we propose sketching techniques as a tool to build more efficient networked systems. Sketching algorithms aim to process the data with one or several passes in an online, streaming fashion (e.g., a stream of network packets), and compute highly accurate results. With sketching, we only maintain a compact summary of the entire data and provide theoretical guarantees on error bounds.
This dissertation argues for a sketching based design for large-scale networked systems, and demonstrate the benefits in three application contexts: (i) Network monitoring: we build generic monitoring frameworks that support a range of applications on both software and hardware with universal sketches. (ii) Graph pattern mining: we develop a swift, approximate graph pattern miner that scales to very large graphs by leveraging graph sketching techniques. (iii) Halo finding in astronomy simulations: we design scalable halo finders on CPU and GPU by leveraging sketch-based heavy hitter algorithms.
Speaker Biography
Zaoxing (Alan) Liu is a Ph.D. candidate in the Department of Computer Science at Johns Hopkins University, working with Prof. Vladimir Braverman. He received his master degree in Computer Science from JHU in 2013 and enrolled in the Ph.D. program in 2014. He is working in the areas of systems and algorithms with sublinear time/space requirements. His research aims to build efficient networked systems with strong theoretical guarantees. Liu’s research papers have been published in top venues, including ACM SIGCOMM and USENIX OSDI. He will be joining CMU/Harvard to continue exploring the problems in this field, as a postdoc researcher.