Fall 2013
September 19, 2013
The primary challenge in digital forensics today is uncovering not the right answer, but the right question. As in any scientific discipline, the formation of viable hypotheses that ultimately uncover meaning in available evidence is a central problem in digital forensics. Such hypothesis formation, based on intuition and experience, involves an underlying mental process that can be substantially aided by computers. This seminar delves into the cognitive science of investigative reasoning, and how research in artificial intelligence can help humans find the right questions in large quantities of data. The implications of this work for digital identity and privacy, as well as its potential uses in other areas, such as medical diagnosis and virtual learning environments, are also discussed.
Speaker Biography: Eoghan Casey is an internationally recognized expert in digital forensics and data breach investigations. For over a decade, he has dedicated himself to advancing the field of digital forensics. He wrote the foundational book Digital Evidence and Computer Crime, now in its third edition, and he created advanced smartphone forensics courses taught worldwide. He has also coauthored several advanced technical books including Malware Forensics, and is Editor-in-Chief of Digital Investigation: The International Journal of Digital Forensics and Incident Response. Dr. Casey received his Ph.D. from University College Dublin, and has taught digital forensics at the Johns Hopkins University Information Security Institute.
Dr. Casey has worked as R&D Team Lead at the Defense Cyber Crime Center (DC3), helping to enhance its operational capabilities and develop new techniques and tools. He has also helped organizations handle security breaches and has analyzed digital evidence in a wide range of investigations, including network intrusions with international scope. He has testified in civil and criminal cases, and has submitted expert reports and prepared trial exhibits for computer forensic and cyber-crime cases.
Student Seminar
September 19, 2013
Surgeons performing highly skilled microsurgical tasks can benefit from information and manual assistance that overcome technological and physiological limitations, making surgery safer, more efficient, and more successful. Vitreoretinal surgery is particularly difficult due to the inherent micro-scale and fragility of human eye anatomy. Additionally, surgeons are challenged by physiological hand tremor, poor visualization, lack of force sensing, and significant cognitive load while executing high-risk procedures inside the eye, such as epiretinal membrane peeling. This dissertation presents the architecture and the design principles for a Surgical Augmentation Environment, which is used to develop innovative functionality to address these fundamental limitations in vitreoretinal surgery. It is an inherently information-driven modular system incorporating robotics, sensors, and multimedia components. The integrated nature of the system is leveraged to create intuitive and relevant human-machine interfaces and to generate system behaviors that provide active physical assistance and present relevant sensory information to the surgeon. These include basic manipulation assistance, audio-visual and haptic feedback, and intraoperative imaging and force sensing. The resulting functionality, as well as the proposed architecture and design methods, generalizes to other microsurgical procedures. The system’s performance is demonstrated and evaluated using phantoms and in-vivo experiments.
Speaker Biography: Marcin Balicki received a BSc in Interdisciplinary Engineering (2001) and a Master’s in Mechanical Engineering (2004) from Cooper Union, where he was also an Adjunct Professor, teaching courses in engineering design and prototyping, engineering graphics, and product development. While teaching, he became interested in medical devices and joined the Minimally Invasive Surgery Lab, NYU School of Medicine. There, he led the software and hardware development of a novel navigation system for knee replacement surgery. For his PhD at The Johns Hopkins University, he has been working with a bioengineering research team under the direction of Russell H. Taylor to develop a breakthrough microsurgical system that incorporates new robotic manipulators, intra-ocular sensing devices, and visualization techniques to address the extreme challenges of vitreoretinal surgery. The goal of this system is to enable surgeons to perform currently impossible treatments, while also improving the safety and success rates of existing vitreoretinal procedures. Its ultimate benefit will be the millions of patients who suffer from blindness and difficult-to-treat eye conditions.
September 24, 2013
What effect does language have on people, and what effect do people have on language? You might say in response, “Who are you to discuss these problems?” and you would be right to do so; these are Major Questions that science has been tackling for many years. But as a field, I think natural language processing and computational linguistics have much to contribute to the conversation, and I hope to encourage the community to further address these issues. To this end, I’ll describe two efforts I’ve been involved in.

The first project provides evidence that in group discussions, power differentials between participants are subtly revealed by how much one individual immediately echoes the linguistic style of the person they are responding to. We consider multiple types of power: status differences (which are relatively static), and dependence (a more “situational” relationship). Using a precise probabilistic formulation of the notion of linguistic coordination, we study how conversational behavior can reveal power relationships in two very different settings: discussions among Wikipedians and arguments before the U.S. Supreme Court.

Our second project is motivated by the question of what information achieves widespread public awareness. We consider whether, and how, the way in which the information is phrased — the choice of words and sentence structure — can affect this process. We introduce an experimental paradigm that seeks to separate contextual from language effects, using movie quotes as our test case. We find that there are significant differences between memorable and non-memorable quotes in several key dimensions, even after controlling for situational and contextual factors. One example is lexical distinctiveness: in aggregate, memorable quotes use less common word choices (as measured by statistical language models), but at the same time are built upon a scaffolding of common syntactic patterns.

Joint work with Justin Cheng, Cristian Danescu-Niculescu-Mizil, Jon Kleinberg, and Bo Pang.
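To make the notion of lexical distinctiveness concrete, the toy sketch below scores a quote by its average negative log-probability under a smoothed background unigram language model, so that higher scores correspond to less common word choices. The corpus, quotes, and smoothing choice are placeholders, not the data or models used in the work described above.

```python
# A minimal sketch (not the authors' actual pipeline) of one idea from the
# second project: scoring the lexical distinctiveness of a quote as its
# per-word cross-entropy under a background unigram language model.
# The corpus and quotes below are toy placeholders.
from collections import Counter
import math

background_corpus = (
    "the movie was good and the story was fine and the people liked it "
    "we went home after the show and talked about the plot"
).split()

counts = Counter(background_corpus)
total = sum(counts.values())
vocab = len(counts)

def unigram_logprob(word, alpha=1.0):
    """Add-alpha smoothed log-probability of a word under the background model."""
    return math.log((counts[word] + alpha) / (total + alpha * vocab))

def lexical_distinctiveness(quote):
    """Average negative log-probability per word; higher = less common word choices."""
    words = quote.lower().split()
    return -sum(unigram_logprob(w) for w in words) / len(words)

for q in ["the story was good", "frankly my dear I don't give a damn"]:
    print(f"{q!r}: {lexical_distinctiveness(q):.2f}")
```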
Speaker Biography: Lillian Lee is a professor of computer science at Cornell University. Her research interests include natural language processing, information retrieval, and machine learning. She is the recipient of the inaugural Best Paper Award at HLT-NAACL 2004 (joint with Regina Barzilay), a citation in “Top Picks: Technology Research Advances of 2004” by Technology Research News (also joint with Regina Barzilay), and an Alfred P. Sloan Research Fellowship; and in 2013, she was named a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI). Her group’s work has received several mentions in the popular press, including The New York Times, NPR’s All Things Considered, and NBC’s The Today Show.
September 24, 2013
The field of reinforcement learning is concerned with the problem of learning efficient behavior from experience. In real-life applications, gathering this experience is time-consuming and possibly costly, so it is critical to derive algorithms that can learn effective behavior with bounds on the experience necessary to do so. This talk presents our successful efforts to create such algorithms via a framework we call KWIK (Knows What It Knows) learning. I’ll summarize the framework, our algorithms, their formal validations, and their empirical evaluations in robotic and videogame testbeds. This approach holds promise for attacking challenging problems in a number of application domains.
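As a rough illustration of the KWIK protocol (not of the talk’s algorithms), the sketch below implements the simplest possible KWIK learner, one that memorizes a deterministic labeling: it either predicts correctly or answers “I don’t know” and receives the label, and the number of “I don’t know” responses is bounded by the number of distinct inputs.

```python
# A minimal sketch of the KWIK ("Knows What It Knows") protocol, using a simple
# memorization learner for a deterministic function on a finite input set.
# This is illustrative only; the talk's algorithms are far more sophisticated.
I_DONT_KNOW = None  # the KWIK "bottom" output

class MemorizationKWIK:
    """Predicts a label only once it has observed it; otherwise admits ignorance.

    For a deterministic target over n distinct inputs, the number of
    I_DONT_KNOW responses is at most n (the KWIK bound for this learner).
    """
    def __init__(self):
        self.memory = {}

    def predict(self, x):
        return self.memory.get(x, I_DONT_KNOW)

    def observe(self, x, y):
        self.memory[x] = y

# Toy environment: the true (unknown) labeling function.
truth = {"a": 1, "b": 0, "c": 1}

learner = MemorizationKWIK()
dont_knows = 0
for x in ["a", "b", "a", "c", "b", "a"]:
    y_hat = learner.predict(x)
    if y_hat is I_DONT_KNOW:
        dont_knows += 1
        learner.observe(x, truth[x])   # label revealed only after admitting ignorance
    else:
        assert y_hat == truth[x]       # KWIK: any prediction made must be correct
print("I-don't-know responses:", dont_knows)  # at most len(truth) == 3
```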
Speaker Biography: Michael Littman joined Brown University’s Computer Science Department as a full professor after ten years (including three as department chair) at Rutgers University. His research in machine learning examines algorithms for decision making under uncertainty. Littman has earned multiple awards for his teaching and research. He has served on the editorial boards of the Journal of Machine Learning Research and the Journal of Artificial Intelligence Research. In 2013, he was general chair of the International Conference on Machine Learning (ICML) and program co-chair of the Association for the Advancement of Artificial Intelligence Conference; he also served as program co-chair of ICML 2009.
October 1, 2013
Image registration has been an active multidisciplinary research area with a variety of applications, including alignment of satellite photography, video-based inspection on manufacturing lines, and medical imaging. The goal of image registration is not only to align two datasets acquired in different coordinate systems, but also to incorporate knowledge acquired from one dataset into the other to enhance information. Prior knowledge acquired from various sources, including patient-specific preoperative imaging, knowledge of the shape and materials of prosthetics, and statistical information on human organs, can augment, via registration, information acquired by a variety of measurements such as optical motion capture, video cameras, x-ray projection, and low-dose (noisy) CT acquisition. In this seminar, we present our registration framework for incorporating prior knowledge and its application to a wide range of real-world problems, including biomechanical human motion analysis, intraoperative guidance, and CT reconstruction. The future perspective of this work toward more sophisticated data assimilation, such as statistical atlas-based registration, deformable tool tracking, and endoscope video-based localization, is also discussed.
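As a generic illustration of the basic alignment step that registration builds on (not of the speaker’s prior-knowledge framework), the sketch below rigidly aligns corresponding 3-D points measured in two coordinate systems using the standard Kabsch/Procrustes solution.

```python
# A minimal sketch of the basic alignment step underlying registration: rigid
# (rotation + translation) alignment of corresponding 3-D points measured in two
# coordinate systems, via the Kabsch/Procrustes method. This is a generic
# illustration, not the speaker's prior-knowledge framework.
import numpy as np

def rigid_register(P, Q):
    """Find R, t minimizing sum ||R @ P_i + t - Q_i||^2 over corresponding points."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)          # centroids
    H = (P - cP).T @ (Q - cQ)                        # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflection
    R = Vt.T @ D @ U.T
    t = cQ - R @ cP
    return R, t

# Toy example: points in one coordinate system and their (noiseless) counterparts
# after an unknown rotation and translation.
rng = np.random.default_rng(0)
P = rng.normal(size=(10, 3))
theta = 0.4
R_true = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0,              0,             1]])
t_true = np.array([1.0, -2.0, 0.5])
Q = P @ R_true.T + t_true

R, t = rigid_register(P, Q)
print(np.allclose(P @ R.T + t, Q))  # True: the two datasets are aligned
```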
Speaker Biography: Yoshito Otake received the B.Sc., M.Sc. and Ph.D. degrees from the Department of Electronics Information and Communication Engineering, Waseda University (Japan) in 2000, 2002 and 2007. From 2002 to 2008, he was an Assistant Professor at the Institute for High Dimensional Medical Imaging in Jikei University School of Medicine (Japan). Since 2008, he has been working in the Department of Computer Science and Biomedical Engineering at the Johns Hopkins University. He has won several awards, including a best paper award at ICPRAM 2013 and poster awards at SPIE 2010-2013. He received a JSPS postdoctoral fellowship for research abroad during 2009-2011 and was selected as an attendee of the Lindau Nobel Prize Laureate Meetings in 2011. His research interests include medical image registration, image-guided surgery, human motion analysis and computational anatomy.
October 3, 2013
Alternative splicing is an essential property of eukaryotic genes: a single stretch of RNA is spliced into multiple variants, each capable of producing a protein with different functionality. The splice graph has emerged as a natural representation of a gene and its splice variants. However, distinguishing the true variants from among the thousands, if not millions, of combinations mathematically encoded in the graph presents significant algorithmic challenges. I will present splice graph-based algorithms and software tools we developed to determine genes and their splice variants from high-throughput sequencing data, either conventional (Sanger) or produced with the more recent next-generation sequencing technologies. Our methods take into account intrinsic properties of the sequence data to select a subset of candidate variants that captures the genes and their alternative splicing variations with high accuracy.
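To illustrate why the splice graph encodes so many combinations, the toy sketch below represents a gene as a small directed acyclic graph of exons and enumerates every source-to-sink path as a candidate transcript; the exon names and graph are invented, and the speaker’s tools add evidence-based selection on top of a representation like this.

```python
# A minimal sketch of a splice graph: nodes are exons, directed edges are splice
# junctions, and every source-to-sink path is a candidate transcript (splice
# variant). The exon names and graph are made up for illustration.
splice_graph = {
    "e1": ["e2", "e3"],   # exon e1 can be spliced to e2, or skip to e3
    "e2": ["e3"],
    "e3": ["e4", "e5"],   # alternative final exons
    "e4": [],
    "e5": [],
}

def candidate_transcripts(graph, node, path=None):
    """Depth-first enumeration of all paths from `node` to a sink (no outgoing edges)."""
    path = (path or []) + [node]
    if not graph[node]:                      # sink exon: one complete candidate
        return [path]
    transcripts = []
    for nxt in graph[node]:
        transcripts.extend(candidate_transcripts(graph, nxt, path))
    return transcripts

for t in candidate_transcripts(splice_graph, "e1"):
    print("-".join(t))
# e1-e2-e3-e4, e1-e2-e3-e5, e1-e3-e4, e1-e3-e5 -> 4 candidates from 5 exons;
# the count grows combinatorially for real genes, hence the need to select a
# small evidence-supported subset.
```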
Speaker Biography: Liliana Florea is an Assistant Professor with the McKusick-Nathans Institute of Genetic Medicine at the Johns Hopkins University School of Medicine, where she develops algorithms and software tools for analyzing biological data. She previously held faculty positions at the University of Maryland and at the George Washington University. Even earlier, she was a member of the team of scientists at Celera Genomics that sequenced and assembled the first human genome sequence. Dr. Florea was a recipient of a Sloan Foundation Research Fellowship in Computational and Evolutionary Molecular Biology and a finalist for the Microsoft Research Faculty Fellowship. She received her PhD in Computer Science and Engineering from the Pennsylvania State University in 2000.
October 10, 2013
The label “Cloud” attracts considerable hype and also sprawls across the immense ecosystem of servers, clients, and mobile and cyber-physical computational, communication and storage elements. Despite this fuzziness, fundamentally it is data, and especially the integrity of that data across this spectrum of disparate Cloud elements, that drives the value of the Cloud. While data consistency is indeed a key driver, this desired consistency also needs to be achieved efficiently (low-latency wait-freedom, high availability) and in a robust manner, resilient to operational and deliberate disruptions (both crashes and attacks are a given at the ever-expanding scale of the Cloud!). This talk describes approaches for efficient and robust data consistency for Cloud servers and storage, and extensions for achieving consistency in mobile and cyber-physical medical scenarios.
Speaker Biography: Neeraj Suri is a Chair Professor of “Dependable Systems and Software” at TU Darmstadt, Germany and is affiliated with the Univ. of Texas at Austin and Microsoft Research. Following his PhD at the Univ. of Massachusetts at Amherst, he has held both industry and academic positions at Allied-Signal/Honeywell Research, Boston Univ., and the Saab Endowed Chair at Chalmers in Sweden, receiving trans-national funding from the EC, German DFG, NSF/DARPA/ONR/AFOSR, NASA, Microsoft, IBM, Hitachi, GM and others. He is a recipient of the NSF CAREER award, as well as Microsoft and IBM Faculty Awards. Suri’s professional services span Associate Editor-in-Chief for the IEEE Trans. on Dependable and Secure Computing, and editorial boards for the IEEE Trans. on Software Engineering, IEEE TPDS, ACM Computing Surveys, IEEE Security & Privacy and many others. He serves on advisory boards for Microsoft (Trustworthy Computing Academic Advisory Board, Strategy Advisor for MSR-ATL’s) and multiple other US/EU/Asia industry and university advisory boards. Suri chaired the IEEE Technical Committee on Dependability and Fault Tolerance, and its Steering Committee.
October 15, 2013
The applications we use every day deal with privacy-sensitive data that come from different sources and entities, hence creating a tension between more functionality and privacy. Secure Multiparty Computation (MPC), a fundamental problem in cryptography and distributed computing, tries to resolve this tension by achieving the best of both worlds. But despite important and classic results, the practice of secure computation lags behind its theory by a wide margin.
In this talk, I discuss my work on a promising approach to making secure computation practical, namely server-aided MPC. This approach allows one to tap the resources of an untrusted cloud service to design more efficient and scalable privacy-preserving protocols. I discuss several variants of the server-aided model, our general- and special-purpose constructions for these variants, and the experimental results obtained from our implementations.
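For readers unfamiliar with MPC, the toy sketch below shows the basic idea with additive secret sharing: three semi-honest, non-colluding parties learn the sum of their private inputs and nothing else. It is only meant to make the goal concrete and is not one of the server-aided constructions discussed in the talk.

```python
# A toy illustration of the MPC idea with additive secret sharing: three parties
# compute the sum of their private inputs without revealing them, assuming
# semi-honest parties that do not collude. Not one of the talk's constructions.
import secrets

MOD = 2**61 - 1  # arithmetic is done modulo a fixed prime

def share(value, n_parties=3):
    """Split `value` into n additive shares that sum to it modulo MOD."""
    shares = [secrets.randbelow(MOD) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MOD)
    return shares

# Each party secret-shares its private input and sends one share to each party.
inputs = [12, 30, 7]                       # private values of parties 0, 1, 2
all_shares = [share(v) for v in inputs]    # all_shares[i][j] = party i's share for party j

# Party j locally adds the shares it received; no single share reveals anything.
partial_sums = [sum(all_shares[i][j] for i in range(3)) % MOD for j in range(3)]

# Publishing only the partial sums reveals the total, and nothing more.
print(sum(partial_sums) % MOD)             # 49 == 12 + 30 + 7
```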
Speaker Biography: Payman Mohassel received his Ph.D. in Computer Science from the University of California, Davis in 2009 (with Matthew Franklin). Since then, he has held an assistant professor position at the University of Calgary, where he is currently employed. His research is in cryptography and information security with a focus on bridging the gap between the theory and practice of privacy-preserving computation.
October 22, 2013
Breaches of databases involving millions of passwords are becoming a commonplace threat to consumer security. Compromised passwords are also a feature of sophisticated targeted attacks, as the New York Times, for instance, reported of its own network intrusions early this year. The most common defense is password hashing. Hashing is the transformation of stored passwords using one-way functions that make verification of incoming passwords easy, but extraction of stored ones hard. “Hard,” though, often isn’t hard enough: Password cracking tools (such as “John the Ripper”) recover many hashed passwords quite effectively. I’ll describe a new, complementary approach called honeywords (an amalgam of “honeypots” and “passwords”). Honeywords are decoys designed to be indistinguishable from legitimate passwords. When seeded in a password database, honeywords pose a challenge even to an adversary that compromises the database and cracks its hashed passwords. The adversary must still guess which passwords are legitimate, and is very likely to pick a honeyword instead. The adversary’s submission of a honeyword is detectable in a backend system, which can raise an alarm to signal a breach. I’ll also briefly discuss a related idea, called honey-encryption, which creates ciphertexts that decrypt under incorrect keys to seemingly valid (decoy) messages.
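A minimal sketch of the honeyword check described above, with invented account data and a fast hash standing in for a proper slow password hash, might look like this:

```python
# A minimal, in-memory sketch of the honeyword scheme described above. Per
# account, the login server stores hashes of k "sweetwords" (the real password
# plus k-1 decoys); a separate backend checker stores only the index of the real
# one. SHA-256 stands in here for a proper slow password hash
# (bcrypt/scrypt/argon2), and all names and passwords are made up.
import hashlib

def h(pw: str) -> str:
    return hashlib.sha256(pw.encode()).hexdigest()

# Login server's view: hashed sweetwords per user (order randomized in practice).
sweetword_hashes = {
    "alice": [h("blue42horse"), h("correct-horse-battery"), h("p@ssw0rd2013")],
}

# Backend checker's view (kept on a separate hardened system): only the index.
honeychecker_index = {"alice": 1}   # the real password is the second sweetword

def check_login(user: str, submitted: str) -> str:
    try:
        i = sweetword_hashes[user].index(h(submitted))
    except ValueError:
        return "reject"                       # not a sweetword at all
    if honeychecker_index[user] == i:
        return "accept"                       # the legitimate password
    return "ALARM: honeyword submitted"       # likely the password file was cracked

print(check_login("alice", "correct-horse-battery"))  # accept
print(check_login("alice", "blue42horse"))            # ALARM: honeyword submitted
print(check_login("alice", "wrong"))                  # reject
```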
Honeywords and honey-encryption represent some of the first steps toward the principled use of decoys, a time-honored and increasingly important defense in a world of frequent, sophisticated, and damaging security breaches.
Honeywords and honey-encryption are joint work, respectively, with Ron Rivest (MIT) and Tom Ristenpart (U. Wisc.).
Speaker Biography: Ari Juels is a roving computer security specialist. He was previously Chief Scientist of RSA, The Security Division of EMC, where he worked from 1996-2013. His recent areas of interest include “big data” security analytics, cybersecurity, user authentication, cloud security, privacy, biometric security, and RFID / NFC security. As an industry scientist, Dr. Juels has helped incubate innovative new product features and products and advised on the science behind security-industry strategy. He is also a frequent public speaker, and has published highly cited scientific papers on many topics in computer security.
In 2004, MIT’s Technology Review Magazine named Dr. Juels one of the world’s top 100 technology innovators under the age of 35. Computerworld honored him in its “40 Under 40” list of young industry leaders in 2007. He has received other distinctions, but sadly no recent ones acknowledging his youth.
Distinguished Lecturer
October 29, 2013
Cyber-security today is focused largely on defending against known attacks. We learn about the latest attack and find a patch to defend against it. Our defenses thus improve only after they have been successfully penetrated. This is a recipe to ensure some attackers succeed—not a recipe for achieving system trustworthiness. We must move beyond reacting to yesterday’s attacks and instead start building systems whose trustworthiness derives from first principles. Yet, today we lack such a science base for cybersecurity. That science of security would have to include attacks, defense mechanisms, and security properties; its laws would characterize how these relate. This talk will discuss examples of such laws and suggest avenues for future exploration.
Speaker Biography: Fred Schneider is the Samuel B. Eckert Professor of Computer Science at Cornell and also serves as the Chief Scientist for the NSF-funded TRUST Science and Technology Center, which brings together researchers at U.C. Berkeley, Carnegie-Mellon, Cornell, Stanford, and Vanderbilt. He is a fellow of AAAS, ACM, and IEEE, was awarded a Doctor of Science honoris causa by the University of Newcastle-upon-Tyne, and received the 2012 IEEE Emanuel R. Piore Award for “contributions to trustworthy computing through novel approaches to security, fault-tolerance and formal methods for concurrent and distributed systems”. The U.S. National Academy of Engineering elected Schneider to membership in 2011, and the Norges Tekniske Vitenskapsakademi (Norwegian Academy of Technological Sciences) named him a foreign member in 2010.
November 7, 2013
Cluster management is the term that Google uses to describe how we control the computing infrastructure in our datacenters that supports almost all of our external services. It includes allocating resources to different applications on our fleet of computers, looking after software installations and hardware, monitoring, and many other things. My goal is to present an overview of some of these systems, introduce Omega, the new cluster-manager tool we are building, and present some of the challenges that we’re facing along the way. Many of these challenges represent research opportunities, so I’ll spend the majority of the time discussing those.
Speaker Biography: John Wilkes has been at Google since 2008, where he is working on cluster management and infrastructure services. Before that, he spent a long time at HP Labs, becoming an HP and ACM Fellow in 2002. He is interested in far too many aspects of distributed systems, but a recurring theme has been technologies that allow systems to manage themselves. In his spare time he continues, stubbornly, trying to learn how to blow glass.
November 14, 2013
One of the key problems we face with the accumulation of massive datasets (such as electronic health records and stock market data) is the transformation of data into actionable knowledge. In order to use the information gained from analyzing these data to intervene (say, to treat patients or to create new fiscal policies), we need to know that the relationships we have inferred are causal. Further, we need to know the time over which the relationship takes place in order to know when to intervene. In this talk, I discuss recent methods for finding causal relationships and their timing from uncertain data with minimal background knowledge, and their applications to observational health data.
Speaker Biography: Samantha Kleinberg is an Assistant Professor of Computer Science at Stevens Institute of Technology. She received her PhD in Computer Science from New York University in 2010 and was a Computing Innovation Fellow at Columbia University in the Department of Biomedical Informatics from 2010-2012. Her research centers on developing methods for analyzing large-scale, complex, time-series data. In particular, her work develops methods for finding causes and automatically generating explanations for events, facilitating decision-making using massive datasets. She is the author of Causality, Probability, and Time (Cambridge University Press, 2012), and PI of an R01 from the National Library of Medicine.
Distinguished Lecturer
November 20, 2013
We are at the cusp of a major transformation in higher education. In the past year, we have seen the advent of MOOCs (massive open online courses): top-quality courses from the best universities offered for free. These courses exploit technology to provide a real course experience to students, including video content, interactive exercises with meaningful feedback (using both auto-grading and peer-grading), and rich peer-to-peer interaction around the course materials. We now see MOOCs from dozens of top universities, offering courses to millions of students from every country in the world. The courses range from bridge/gateway courses all the way through graduate courses, and span a range of topics including computer science, business, medicine, science, humanities, social sciences, and more. In this talk, I’ll discuss this far-reaching experiment in education, including some examples and preliminary analytics. I’ll also discuss why we believe this model can support an improved learning experience for on-campus students, via blended learning, and provide unprecedented access to education to millions of students around the world.
Speaker Biography: Daphne Koller is the Rajeev Motwani Professor of Computer Science at Stanford University and the co-founder and co-CEO of Coursera, a social entrepreneurship company that works with the best universities to connect anyone around the world with the best education, for free. Coursera is the leading MOOC (Massive Open Online Course) platform, and has partnered with dozens of the world’s top universities to offer hundreds of courses in a broad range of disciplines to millions of students, spanning every country in the world. In her research life, she works in the area of machine learning and probabilistic modeling, with applications to systems biology and personalized medicine. She is the author of over 200 refereed publications in venues that span a range of disciplines, and has given over 15 keynote talks at major conferences. She is the recipient of many awards, which include the Presidential Early Career Award for Scientists and Engineers (PECASE), the MacArthur Foundation Fellowship, the ACM/Infosys award, and membership in the US National Academy of Engineering. She was recently recognized as one of Time Magazine’s 100 Most Influential People for 2012. She is also an award-winning teacher who pioneered in her Stanford class many of the ideas that underlie the Coursera user experience. She received her BSc and MSc from the Hebrew University of Jerusalem, and her PhD from Stanford in 1994.
Distinguished Lecturer
November 21, 2013
The Fast Fourier Transform (FFT) is one of the most fundamental numerical algorithms. It computes the Discrete Fourier Transform (DFT) of an n-dimensional signal in O(n log n) time. The algorithm plays an important role in many areas. It is not known whether its running time can be improved. However, in many applications, most of the Fourier coefficients of a signal are “small” or equal to zero, i.e., the output of the transform is (approximately) sparse. In this case, it is known that one can compute the set of non-zero coefficients faster than in O(n log n) time.
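The sparsity premise is easy to see numerically: for a signal made of a few tones, nearly all Fourier coefficients are negligible, and keeping only the largest ones reconstructs the signal almost exactly. The sketch below illustrates this with NumPy; note that it still computes the full O(n log n) FFT, whereas the algorithms in the talk find the large coefficients without ever doing so.

```python
# A small illustration of the sparsity premise: for a signal made of k dominant
# tones, almost all Fourier coefficients are negligible, and keeping only the
# largest reconstructs the signal almost exactly. This sketch still computes the
# full O(n log n) FFT; the sparse Fourier algorithms in the talk instead find
# those coefficients in O(k log n) time without computing the full transform.
import numpy as np

n, k = 1024, 4
rng = np.random.default_rng(1)
freqs = rng.choice(n, size=k, replace=False)
t = np.arange(n)
signal = sum(np.cos(2 * np.pi * f * t / n) for f in freqs)
signal += 0.01 * rng.normal(size=n)          # a little noise

coeffs = np.fft.fft(signal)
top = np.argsort(np.abs(coeffs))[-2 * k:]    # 2k slots: each real tone uses +/- f
sparse = np.zeros_like(coeffs)
sparse[top] = coeffs[top]
reconstruction = np.fft.ifft(sparse).real

err = np.linalg.norm(reconstruction - signal) / np.linalg.norm(signal)
print(f"kept {2 * k} of {n} coefficients, relative error = {err:.3f}")
```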
In this talk, I will describe a set of efficient algorithms for sparse Fourier Transform. One of the algorithms has the running time of O(k log n), where k is the number of non-zero Fourier coefficients of the signal. This improves over the runtime of the FFT for any k = o(n). If time allows, I will also describe some of the applications, to spectrum sensing and GPS locking, as well as mention a few outstanding open problems. The talk will cover the material from the joint papers with Fadel Adib, Badih Ghazi, Haitham Hassanieh, Dina Katabi, Eric Price and Lixin Shi. The papers are available at http://groups.csail.mit.edu/netmit/sFFT/.
Speaker Biography: Piotr Indyk is a Professor of Electrical Engineering and Computer Science at MIT. He joined MIT in 2000, after earning his PhD from Stanford University. Earlier, he received a Magister degree from Uniwersytet Warszawski in 1995. Piotr’s research interests lie in the design and analysis of efficient algorithms. Specific interests include: high-dimensional computational geometry, sketching and streaming algorithms, sparse recovery and compressive sensing. He has received the Sloan Fellowship (2003), the Packard Fellowship (2003) and the Simons Investigator Award (2013). His work on sparse Fourier sampling was named to Technology Review’s “TR10” list in 2012, while his work on locality-sensitive hashing received the 2012 ACM Kanellakis Theory and Practice Award.
Student Seminar
November 25, 2013
Wireless sensor networks for environmental monitoring present a set of domain-specific challenges. Their spatially heterogeneous structure and susceptibility to hardware and wireless connectivity failure demand a focused approach.
In this seminar, I will present the “Breakfast” suite of hardware, which enables domain scientists to allocate sensing, communication, and storage functionality in a flexible and natural manner. I will show how one can improve data delivery in the face of node and link failures by using synchronized, non-destructive packet transmissions by a set of nodes rather than single-path routes.
This seminar will explain the challenges of this application domain, present a unified hardware suite that addresses these challenges, and characterize the benefits of our multi-tiered, multi-transmitter networking approach. This work improves on the throughput and energy usage of naive multi-transmitter flooding while maintaining reliability in the face of high levels of node failure and out-of-date link quality information.
Speaker Biography: Doug Carlson is a Ph.D. candidate in the Computer Science Department at Johns Hopkins University, where he is advised by Dr. Andreas Terzis. He received a B.S. in Computer Science from Duke University in 2004 and spent several years as an IT consultant before joining JHU in 2008. His research focuses on energy-efficient medium access and networking protocols, as well as end-to-end system design for wireless sensor networks.
December 3, 2013
Several results are given for the problem of identifying the set of faulty processors in a multiprocessor system on the basis of a given collection of results from tests performed by the processors of the system on one another.
For the general case of bounded combinations of permanent and intermittent faults, known as hybrid fault situations, necessary and sufficient conditions are given for identifying a processor as faulty in spite of unapplied tests and intermittencies. Based on this approach, a design for intermittent/transient-upset tolerant systems is given. For the special case of all permanent faults, a class of systems is characterized in which the set of faulty processors can be identified in a straightforward manner based on any given collection of test results. Finally, it is shown that the classic t-diagnosable systems, introduced in the 1960s by Preparata, Metze and Chien, possess heretofore unknown graph-theoretic properties relative to minimum vertex cover sets and maximum matchings. An O(n^2.5) algorithm is given which exploits these properties to identify the set of faulty processors in a t-diagnosable system.
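As a minimal illustration of the diagnosis problem (not of the O(n^2.5) algorithm itself), the sketch below brute-forces the fault sets consistent with a syndrome under the classic test model of Preparata, Metze and Chien, in which a fault-free tester reports the tested processor’s status correctly and a faulty tester’s report is arbitrary.

```python
# A minimal brute-force sketch of system-level fault diagnosis under the classic
# PMC test model for permanent faults: a fault-free tester reports the tested
# processor's status correctly (0 = fault-free, 1 = faulty); a faulty tester's
# report is arbitrary. Given the test assignment and the syndrome (all test
# outcomes), we search for fault sets of size <= t consistent with the syndrome.
# The talk's O(n^2.5) matching-based algorithm avoids this exponential search.
from itertools import combinations

def consistent(fault_set, syndrome):
    """Is `fault_set` consistent with every test outcome in `syndrome`?"""
    for (tester, tested), outcome in syndrome.items():
        if tester in fault_set:
            continue                       # faulty tester: any outcome allowed
        if outcome != (1 if tested in fault_set else 0):
            return False                   # fault-free tester must report the truth
    return True

def diagnose(n, syndrome, t):
    """Return all fault sets of size <= t consistent with the syndrome."""
    return [set(c) for size in range(t + 1)
            for c in combinations(range(n), size)
            if consistent(set(c), syndrome)]

# Toy 4-processor system, each processor tests the next one (a cycle), t = 1.
# Processor 2 is actually faulty: tester 1 reports it as faulty, and faulty
# processor 2 happens to (falsely) report processor 3 as faulty.
syndrome = {(0, 1): 0, (1, 2): 1, (2, 3): 1, (3, 0): 0}
print(diagnose(4, syndrome, t=1))   # [{2}] -- the unique consistent diagnosis
```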
Note: This talk is a 30th-anniversary reprise of my PhD dissertation defense.
Speaker Biography: Anton (Tony) Dahbura received the BSEE, MSEE, and PhD in Electrical Engineering and Computer Science from the Johns Hopkins University in 1981, 1982, and 1984, respectively. From 1983 until 1996 he was a researcher at AT&T Bell Laboratories, was an Invited Lecturer in the Department of Computer Science at Princeton University, and served as Research Director of the Motorola Cambridge Research Center in Cambridge, Massachusetts. From 1996-1999 he was a consultant to Digital Equipment Corporation’s (now HP) Cambridge Research Laboratory where he pioneered research and development in mobile, wireless, and wearable computing. From 1996-2012 he served at Hub Labels, Inc. as Corporate Vice President. In January, 2012 he was named Interim Executive Director of the Johns Hopkins University Information Security Institute in Baltimore. From 2000-2002 he served as Chair of the Johns Hopkins University Engineering Alumni and in 2004 was the recipient of the Johns Hopkins Heritage Award for his service to the University. He chaired The Johns Hopkins Computer Science Department Advisory Board from 1998 until 2012 and also served on the Johns Hopkins University Whiting School of Engineering National Advisory Council during that time.
Student Seminar
December 9, 2013