2019 CIS Seminar Series
Ian Cosden, Princeton University, HPC Software Engineer and Performance Tuning
Gore Hall 103 at 11 a.m.
2018-19 Distinguished Speaker Series
“Collaborate.org – building a worldwide cloud infrastructure for everyone in the world to work together to solve the world’s problems”
November 5th, 2018 | Brown Lab 101 | 12:00PM
Abstract: As we look to the future, humankind faces challenges that are increasing, while our resources are declining. To survive and thrive we need to work together – no one individual or organization, agency or even government can address these global challenges all by themselves.
Collaborate.org was born from hundreds of interactions across government, academia, industry, nonprofits and social sectors. Across all these groups we kept hearing the same themes: to access and integrate all kinds of data to understand what is happening, to work across organizations, to derive a shared vision, and to coordinate action, share resources, and achieve more than any organization or entity can achieve alone.
Collaborate.org is an open global online community of people working together and sharing resources, expertise and enthusiasm, empowered with advanced technologies including collaboration and visualization tools, and all the world’s geospatial data at your fingertips: real-time sensor data, GIS and database information, news/RSS and social media, and satellite and aerial imagery. It currently hosts over 2.5M layers (over 5PB), with the ability for users to upload and share their own data with those they wish. It provides a comprehensive suite of collaboration tools (videoconferencing/messaging, file sharing, forums/blogs, calendars/task lists, etc.) over a secure worldwide cloud computing infrastructure with the goal of enabling everyone to work together, connect with others, and together make an extraordinary, exponential positive impact on the world.
This talk will focus on the rationale behind global collaboration, but will also dive into the technology behind hosting such a service: worldwide cloud computing, massive multimodal data integration at scale, extreme security, analytics and visualization. It will also discuss several projects currently using the platform to make significant impact, and explore business models and techniques to incentivize engagement and mass social change on an exponential scale. If not to do good with technology, then what are we building it all for?
Bio: Dr. Kevin Montgomery is the founder of Collaborate.org, an award-winning site for global impact. He was previously a Senior Researcher at the Center for Innovation in Global Health at Stanford University, and cofounder of the National Biocomputation Center at Stanford University, which developed technologies for computer-based surgical planning, augmented reality surgery, surgical simulators, anatomical atlases, as well as wireless telemedicine and telemetry.
Before joining Stanford, he led teams at the NASA Ames Research Center to develop systems for 3D reconstruction and visualization of biomedical imaging data and at the Hewlett-Packard Company on networking protocol design and implementation. He earned his PhD in Computer Engineering from the University of California.
Dr. Montgomery was awarded the Entrepreneur of the Year by the Strategic News Service, and an Edison Innovation Award for his work on Collaborate.org.
He regularly serves on several study/review sections for DoD, NIH, NSF, and other granting agencies, and has been a serial entrepreneur, leading and advising several high-tech companies in Silicon Valley.
2017-18 Distinguished Speaker Series
Baylor College of Medicine and Rice University
“Parallel Processing of the Genomes, by the Genomes and for the Genomes”
October 17, 2017 | Center for the Arts, Gore Recital Hall | 10:15 AM
Abstract: The human genome is a sequence of 3 billion chemical letters inscribed in a molecule called DNA. Famously, short stretches (~10 letters, or base pairs) of DNA fold into a double helix. But what about longer pieces? How does a 2 meter long macromolecule, the genome, fold up inside a 6 micrometer wide nucleus? And, once packed, how does the information contained in this ultra-dense structure remain accessible to the cell? This talk will discuss how the human genome folds in three dimensions, a configuration that enables the cell to access and process massive quantities of information in parallel. To probe how genomes fold, we developed Hi-C, a method that can determine not only the genome’s 1D sequence, but its 3D fold. Hi-C maps collisions between pairs of DNA sequences as they fluctuate inside the nucleus (Lieberman Aiden et al., Science, 2009; Rao & Huntley et al., Cell, 2014). To reconstruct the underlying folds from the billions of collisions we record, we, too, must engage in massively parallel computation. Working together with IBM, NVIDIA, Mellanox, and Edico Genome, we built a specialized hardware platform integrating graphical processing units and field programmable gate arrays to accelerate our research. In one recent application, we determined the genome sequence of Aedes aegypti, the mosquito that carries Zika virus, using a new methodology that exploits folding patterns to assemble genome sequences more easily (Dudchenko et al., Science, 2017). Assembling this genome had been an urgent biomedical challenge that had been highlighted, only a few months before, on the front page of the New York Times.
Bio: Erez Lieberman Aiden received his PhD from Harvard and MIT in 2010. After several years at Harvard’s Society of Fellows and at Google as Visiting Faculty, he became Assistant Professor of Genetics at Baylor College of Medicine and of Computer Science and Applied Mathematics at Rice University. Dr. Aiden’s inventions include the Hi-C method for three-dimensional DNA sequencing, which enables scientists to examine how the two-meter long human genome folds up inside the tiny space of the cell nucleus. In 2014, his laboratory reported the first comprehensive map of loops across the human genome, mapping their anchors with single-base-pair resolution. In 2015, his lab showed that these loops form by extrusion, and that it is possible to add and remove loops and domains in a predictable fashion using targeted mutations as short as a single base pair. Together with Jean-Baptiste Michel, Dr. Aiden also developed the Google Ngram Viewer, a tool for probing cultural change by exploring the frequency of words and phrases in books over the centuries. The Ngram Viewer is used every day by millions of users worldwide. Dr. Aiden’s research has won numerous awards, including recognition for one of the top 20 “Biotech Breakthroughs that will Change Medicine” by Popular Mechanics; membership in Technology Review’s 2009 TR35, recognizing the top 35 innovators under 35; and in Cell’s 2014 40 Under 40. His work has been featured on the front page of the New York Times, the Boston Globe, the Wall Street Journal, and the Houston Chronicle. One of his talks has been viewed over 1 million times at TED.com. Three of his research papers have appeared on the cover of Nature and Science. In 2012, he received the Presidential Early Career Award for Scientists and Engineers, the highest government honor for young scientists, from Barack Obama. In 2014, Fast Company called him “America’s brightest young academic.” In 2015, his laboratory was recognized on the floor of the US House of Representatives for its discoveries about the structure of DNA.
University of Illinois at Urbana-Champaign
“The Science of Computational Reproducibility”
April 6, 2018 | Center for the Arts, Gore Recital Hall | 10:15 AM
Abstract: The rate of production, collection, and analysis of data, and the speed at which computational infrastructure is changing (e.g. technologies for cloud computing, network capabilities, and high performance computing systems) imply a need for extreme agility in computationally-enabled research. I will outline a research agenda for the science of reproducibility that responds to the opportunities created by this rapid evolution in research environments, addressing, for example, reliability and robustness of machine learning discoveries, quantification of the effect of variability in data and cyberinfrastructure on scientific findings, and new facets of the research pipeline that impact our ability to generalize and use the products of scientific research.
Bio: Victoria Stodden is an associate professor in the School of Information Sciences at the University of Illinois at Urbana-Champaign, with affiliate appointments in the School of Law, the Department of Computer Science, the Department of Statistics, the Coordinated Science Laboratory, and the National Center for Supercomputing Applications. In addition, she is an affiliate scholar with the Center for Internet and Society at Stanford Law School, a faculty affiliate of the Meta-Research Innovation Center at Stanford (METRICS), and a visiting scholar at the Social and Decision Analytics Laboratory at the Biocomplexity Institute at Virginia Tech. Victoria completed both her PhD in statistics and her law degree at Stanford University. Her research centers on the multifaceted problem of enabling reproducibility in computational science. This includes studying adequacy and robustness in replicated results, designing and implementing validation systems, developing standards of openness for data and code sharing, and resolving legal and policy barriers to disseminating reproducible research. Victoria created the “Reproducible Research Standard,” a suite of open licensing recommendations for the dissemination of computational results, and winner of the Kaltura Prize for Access to Knowledge Writing.
2016-17 Distinguished Speaker Series
Francine Berman, Ph.D.
Edward P. Hamilton Distinguished Professor in Computer Science, Rensselaer Polytechnic Institute
“Got Data? Building a Sustainable Data Ecosystem”
December 6, 2016 | Center for the Arts, Gore Recital Hall | 2:00PM
Abstract: Innovation in a digital world presupposes that the data will be there when people need it, but will it? Without sufficient data infrastructure and attention to the stewardship and preservation of digital data, data may become inaccessible or lost. This is particularly problematic for data generated by sponsored research projects where the focus is on innovation rather than infrastructure, and support for stewardship and preservation may be short term and ad hoc.
In this presentation, Berman will discuss sustainability, infrastructure and data, and will explore the opportunities and challenges of creating a viable ecosystem for the data on which current and future research and innovation increasingly depend.
Bio: Francine Berman is the Edward P. Hamilton Distinguished Professor in Computer Science at Rensselaer Polytechnic Institute. She is U.S. lead of the Research Data Alliance, a community-driven international organization created to accelerate research data sharing worldwide. Berman is a fellow of the Association for Computing Machinery (ACM), the Institute of Electrical and Electronics Engineers (IEEE), and the American Association for the Advancement of Science (AAAS). In 2009, she was the inaugural recipient of the ACM/IEEE-CS Ken Kennedy Award for “influential leadership in the design, development, and deployment of national-scale cyberinfrastructure.” In 2015, she was nominated by President Barack Obama and confirmed by the U.S. Senate to become a member of the National Council on the Humanities. For her accomplishments, leadership, and vision, Berman was recognized by the Library of Congress as a “Digital Preservation Pioneer,” as one of the top women in technology by BusinessWeek and Newsweek, and as one of the top technologists by IEEE Spectrum.
Mark Giesbrecht, Ph.D.
Professor and Director, David R. Cheriton School of Computer Science, University of Waterloo
“Eigenvalues, elimination, and random integer matrices, and some speculative applications to computing with sparse matrices”
April 7, 2017 | Center for the Arts, Gore Recital Hall | 10:30AM
Abstract: Integer matrices are typically characterized by the “lattice” of combinations of their rows or columns. This is captured nicely by the Smith canonical form, a diagonal matrix of “invariant factors,” to which any integer matrix can be transformed through left and right multiplication by unimodular matrices.
But integer matrices can also be viewed as complex matrices, with eigenvalues and eigenvectors, and every such matrix is similar to a unique one in Jordan canonical form.
It would seem a priori that the invariant factors and the eigenvalues would have little to do with each other. Yet we will show that for “almost all” matrices the invariant factors and the eigenvalues are equal under a p-adic valuation, in a very precise sense.
A much-hoped-for link is explored for fast computation of Smith forms of sparse integer matrices, via the better understood algorithms for computing eigenvalues. All the methods are elementary and no particular background beyond linear algebra will be assumed.
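As a small illustration of the claim in the abstract above (this example is not taken from the talk; the matrix A is chosen here purely for illustration), one can compare the 2-adic valuations of the invariant factors and of the eigenvalues of a 2×2 integer matrix. Here v_2 denotes the 2-adic valuation, and the eigenvalue valuations are read off the Newton polygon of the characteristic polynomial:

```latex
\[
\begin{aligned}
A &= \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \qquad
\gcd(1,2,3,4) = 1, \quad \det A = -2
\;\Longrightarrow\; \text{Smith form } S = \operatorname{diag}(1,\,2), \\[4pt]
\chi_A(x) &= x^2 - 5x - 2, \qquad
v_2(1) = 0, \quad v_2(-5) = 0, \quad v_2(-2) = 1, \\[4pt]
&\text{Newton polygon through } (0,1),\,(1,0),\,(2,0)
\text{ has slopes } -1 \text{ and } 0, \\[4pt]
&\text{so the eigenvalues have 2-adic valuations } \{1,\,0\}
 = \{v_2(2),\, v_2(1)\}.
\end{aligned}
\]
```

In this instance the valuations of the invariant factors (1 and 2) and of the two eigenvalues agree at p = 2, and at every odd prime all four valuations are zero.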
Bio: Dr. Mark Giesbrecht is a Professor and Director of the David R. Cheriton School of Computer Science at the University of Waterloo. He received a B.Sc. from the University of British Columbia in 1986, and a Ph.D. from the University of Toronto in 1993. He is an ACM Distinguished Scientist, and former Chair of ACM SIGSAM (Special Interest Group on Symbolic and Algebraic Manipulation) and the International Symposium on Symbolic and Algebraic Computation (ISSAC) Steering Committees, as well as serving as ISSAC Program Committee Chair. His research interests are in symbolic computation and computer algebra, as well as computational linear algebra.
2015-16 Distinguished Speaker Series
Professor, Department of Computer Science, Columbia University
“Next Generation Surgical Robotics and Deformable Object Recognition and Manipulation”
November 13, 2015 | Trabant Theater | 1:00PM
Abstract: Single-port and NOTES surgery are emerging as important new thrusts in minimally invasive medicine. Both of these models require a new generation of robotic devices, sensors, controls and interfaces. In this talk, we first present an in-vivo stereoscopic camera system that can serve as the imaging component of such a platform. The system is equipped with pan/tilt axes and can be inserted through a single incision and mounted on the inside of the body. Second, we have built an Insertable Robotic Effector Platform (IREP) that combines the imaging device with two continuum robot snake-like arms, creating a single port robotic surgery platform. Finally, we will describe a Surgical Structured Light (SSL) system that can recover 3D models of in-vivo anatomy using standard laparoscopes. We will present results from ex-vivo, animal, and human experiments.
We will also present some recent work from our lab on an end-to-end robotic system that can be used to recognize, unfold, place flat, iron, and fold deformable objects such as shirts and pants. The system is based on predictive thin-shell models in simulation to understand the properties of physical clothing, and is shown to work on a variety of different garments.
Bio: Peter Allen is Professor of Computer Science at Columbia University where he heads the Columbia Robotics Laboratory. He received the A.B. degree from Brown University in Mathematics-Economics, the M.S. in Computer Science from the University of Oregon and the Ph.D. in Computer Science from the University of Pennsylvania, where he was the recipient of the CBS Foundation Fellowship, Army Research Office fellowship and the Rubinoff Award for innovative uses of computers. His current research interests include robotic grasping, 3-D vision and modeling, and medical robotics.
Terry Gaasterland, Ph.D.
Professor of Computational Biology and Genomics; Director, Scripps Genome Center
“Tying Genome Variation in Regulatory Regions to Neurodegenerative Phenotypes”
April 7, 2016 | Willard Hall 007 | 2:00PM
Abstract: The use of exome sequencing (genome-wide sequencing enriched to cover exons) to identify disease-causing genomic changes has focused to date on the interpretation of variants within protein coding regions. We have developed computational and statistical methods to use exome data to examine introns, promoters and untranslated regions (UTRs) in the context of disease cohorts compared with general populations including the 1000 Genomes (1000G), Exome Sequencing Project (ESP), and the Exome Aggregation Consortium (ExAC) and control populations in the Alzheimer’s Disease Sequencing Project. These methods were applied to 333 patients of European descent and 12 patients of African descent with progressive optic nerve degeneration due to primary open angle glaucoma (POAG), 13 patients of Taiwanese descent with migraine, and 32 patients of mixed descent born with congenital glaucoma. All patients analyzed have first-degree relatives with disease, thus increasing the likelihood of finding genetic explanations for disease. Relationships between sequenced individuals, if any, are known. The genetics of POAG, migraine, and congenital glaucoma are complex; to date, no single causative genomic variant has been established as causing any of these diseases. Our analysis of regulatory regions provides new insights for further functional study through targeted experiments. In the case of POAG, genome-wide sequencing of exons from protein coding and non-coding genes in 285 patients revealed ~50 associated SNP sites within ~30 genes. Of these, two-thirds were located in introns or untranslated regions (UTR). To rank and prioritize genes and generate hypotheses about molecular mechanisms disrupted by associated variant sites, mRNA and small RNA (microRNA) were sequenced from ocular tissues relevant to the disease. Intronic SNPs were assessed for impact on alternative splice isoforms, and UTR SNPs were assessed for impact on microRNA binding.
An additional cohort of associated SNPs appears between genes and, in follow-up analysis, may implicate enhancers or promoters in disease processes. Analysis protocols and techniques for integrated data interpretation to construct putative regulatory networks underlying disease will be discussed. The data collection and analysis methods are generally applicable beyond glaucoma to other chronic, progressive diseases associated with aging.
Bio: Dr. Terry Gaasterland is a computational molecular biologist trained as a computer scientist. She earned her undergraduate degree in Computer Science and Russian with a minor in Chemistry from Duke University as an A.B. Duke Scholar, and her Ph.D. in Computer Science from University of Maryland. As an Enrico Fermi Fellow at the Department of Energy’s Argonne National Laboratory and then as an Assistant Professor of Computer Science at the University of Chicago, she applied techniques from her work in “cooperative answering”, natural language processing, and deductive database research to the interpretation of the first three DOE-funded microbial genomes and a fourth Canadian-funded archaeal genome. During seven years as a Head of Lab at Rockefeller University, Dr. Gaasterland focused on the integration of gene expression data and genome sequence data analysis in human and model eukaryotic organisms. Ten years ago, Dr. Gaasterland moved her Laboratory of Computational Genomics to UCSD to establish the Scripps Genome Center, a UCSD resource based at the Scripps Institution of Oceanography in the Marine Biology Division, with bioinformatics hardware and software housed at the San Diego Supercomputer Center. At UCSD, she is now Professor of Computational Biology and Genomics at SIO and a faculty member in UCSD’s Institute for Genomic Medicine. Since receiving the Presidential Early Career Award for Scientists and Engineers (PECASE) in 2000, she has been continuously funded by the National Science Foundation and the National Institutes of Health to develop and use methods in computational genomics. Her accomplishments in computational molecular biology as well as her early career work in deductive databases are reflected in over 90 refereed publications, with over 80 indexed in PubMed. A member of the NEIGHBOR Consortium to study primary open angle glaucoma (POAG) and the NHGRI Medical Sequencing program, Dr. Gaasterland is sequencing and analyzing variation in transcribed exons genome-wide for 400 primary open angle glaucoma cases and controls. Her work aims to address the general question of how regulation of transcription and translation modulates and affects cell state changes.
Olga and Alberico Pompa Professor of Engineering and Applied Science, University of Pennsylvania
“The Path(s) to a 10 Tbps Workstation”
April 15, 2016 | Trabant Theater | 1:00PM
2014-15 Distinguished Speaker Series
Computer Scientist and Program Manager, ASCR, Department of Energy’s Office of Science
“Science at Extreme Scale: Big Data and Big Compute”
October 2, 2014 | Trabant Theater | 10:30AM
Abstract: Management, analysis and visualization of extreme-scale scientific data will undergo radical change during the coming decade. Coupled with changes in the hardware architecture of next-generation supercomputers, explosive growth in the volume of scientific data presents a host of challenges to researchers in computer science, mathematics and statistics, and application sciences. Failure to develop new data management, analysis and visualization technologies that operate effectively on the changing architecture will cripple scientific discovery and put national security at risk. Using examples from climate science, Dr. Lucy Nowell will explore the technical and scientific drivers and opportunities for data science research funded by the Advanced Scientific Computing Research program in the Department of Energy’s Office of Science.
Bio: Dr. Lucy Nowell is a computer scientist and program manager in the Office of Advanced Scientific Computing Research (ASCR) within the Department of Energy’s Office of Science. She manages a broad spectrum of ASCR-funded computer science research, with a particular emphasis on scientific data management and analysis. Dr. Nowell moved to ASCR in the spring of 2009 from Pacific Northwest National Laboratory (PNNL), where she was a Chief Scientist in the Information Analytics group. Before coming to ASCR, Dr. Nowell was on temporary duty, serving for two years as a Program Director in the area of Data, Data Analysis and Visualization with the Office of Cyberinfrastructure at National Science Foundation (NSF). She also served for four years as a research Program Manager for the Department of Defense, managing a variety of projects related to information analysis and visualization.
Dr. Nowell joined PNNL in August 1998 after a career as a professor at Lynchburg College in Virginia, where she taught a wide variety of courses in Computer Science and Theatre. She also headed the Theatre program and later chaired the Computer Science Department. While pursuing her Master of Science and doctorate in computer science at Virginia Tech in Blacksburg, Virginia, she worked as a Research Scientist in the Digital Libraries Research Laboratory and also interned with the Information Access team at IBM’s T. J. Watson Research Laboratories in Hawthorne, NY. Her B.A. and M.A. in Theatre are from the University of Alabama, Tuscaloosa, and her Master of Fine Arts degree is from the University of New Orleans. She is also a graduate of an accredited life-coach training program.
Joel Saltz MD, Ph.D.
Cherith Professor and Founding Chair, Biomedical Informatics, Stony Brook University
“Integrative Multi-scale Analysis in Biomedical Informatics”
February 17, 2015 | Trabant Theater | 10:30AM
Abstract: Integrative analyses of large scale spatio-temporal datasets play increasingly important roles in many areas of science and engineering. The recent work in this area by Dr. Saltz and team is motivated by application scenarios involving complementary digital microscopy, radiology and “omic” analyses in cancer research. In these scenarios, the objective is to use a coordinated set of image analysis, feature extraction and machine learning methods to predict disease progression and to aid in targeting new therapies.
Dr. Saltz will describe methods his group has developed for extraction, management, and analysis of features along with the systems software methods for optimizing execution on high end CPU/GPU platforms. He will also describe biomedical results obtained from these studies along with extensions of the computational methods to broader application areas.
Bio: Joel Saltz MD, PhD, is the Cherith Professor and Founding Chair of the Department of Biomedical Informatics at Stony Brook University. He is also the Vice President for Clinical Informatics for Stony Brook Medicine and Associate Director of the Stony Brook University Cancer Center. Dr. Saltz is a leader in research on advanced information technologies for large scale data analytics and biomedical and scientific research. He has developed innovative clinical informatics systems including the first published whole slide virtual microscope system and leading edge clinical data warehouse frameworks. He has spearheaded several multi-disciplinary efforts creating cutting-edge tools and middleware components for the management, analysis, and integration of heterogeneous biomedical data. Dr. Saltz broke new ground with middleware systems that target distributed and high-end systems including the filter-stream based DataCutter system, the map-reduce style Active Data Repository and the inspector-executor runtime compiler framework.
Dr. Saltz served at Emory from 2008 until joining Stony Brook in 2013. At Emory he was founding Chair of the Department of Biomedical Informatics; Professor in the School of Medicine, Department of Pathology and Laboratory Medicine; the College of Arts and Sciences, Department of Mathematics and Computer Science; and the School of Public Health, Department of Biostatistics and Bioinformatics.
From 2001 to 2008, Dr. Saltz served as Professor and Founding Chair of the Department of Biomedical Informatics at The Ohio State University College of Medicine. He was also Associate Vice President for Health Sciences for Informatics, and he played important leadership roles in the Cancer Center, Heart Institute and Department of Pathology.
Dr. Saltz received his Bachelor’s and Master’s of Science degrees in Mathematics at the University of Michigan and then entered the MD/PhD program at Duke University, with his PhD studies performed in the Department of Computer Sciences. He began his academic career in Computer Science at Yale, the Institute for Computer Applications in Science and Engineering at NASA Langley and the University of Maryland College Park. He completed his residency in Clinical Pathology at Johns Hopkins School of Medicine and served as Professor with a dual appointment at the University of Maryland and Johns Hopkins, serving in the University of Maryland Department of Computer Science and Institute for Advanced Computer Studies, and the Johns Hopkins Department of Pathology. Dr. Saltz is a fellow of the American College of Medical Informatics.
Srinivas Aluru, Ph.D.
Professor, School of Computational Science and Engineering, Georgia Institute of Technology
“Genomes Galore: Big Data Challenges in the Life Sciences”
April 30, 2015 | Gore Recital Hall | 10:30AM
Abstract: In just a little over a decade, the cost of sequencing a complex organism such as a human has dwindled from the $100 million range to the sub-$1000 range. This rapid decline was brought about by the advent of a number of high-throughput sequencing technologies, collectively known as next generation sequencing. Their usage has become ubiquitous, enabling single investigators with limited budgets to carry out what could only be accomplished by a network of major sequencing centers just a decade ago. This is leading to an explosive growth in the number of organisms sequenced, and in the number of individuals sequenced in search of important genetic variations. Next-gen sequencers enable diverse applications, each requiring its own class of supporting algorithms. This talk will highlight some of the big data challenges arising from these developments in the context of microbial communities, agricultural biotechnology, and human health.
Bio: Srinivas Aluru is a professor in the School of Computational Science and Engineering within the College of Computing at Georgia Institute of Technology. Earlier, he held faculty positions at Iowa State University, Indian Institute of Technology, New Mexico State University, and Syracuse University. He conducts research in high performance computing, bioinformatics and systems biology, combinatorial scientific computing, and applied algorithms. He pioneered the development of parallel methods in computational biology, and contributed to the assembly and analysis of complex plant genomes. Aluru is a recipient of the NSF CAREER award, IBM faculty award, Swarnajayanti Fellowship from the Government of India, and the mid-career and outstanding research achievement awards from Iowa State University. He is a Fellow of the American Association for the Advancement of Science (AAAS) and the Institute of Electrical and Electronics Engineers (IEEE).
2013-14 Distinguished Speaker Series
Professor, Department of Computer Science, Dartmouth College
April 22, 2013 | Trabant Theater | 11:15AM
Abstract: From the tabloid magazines to mainstream media outlets, political campaigns, courtrooms, and the photo hoaxes that land in our email, doctored photographs are appearing with growing frequency and sophistication. The resulting lack of trust is impacting law enforcement, national security, the media, e-commerce, and more. The field of photo forensics has emerged to help return some trust in photography. In the absence of any digital watermark or signature, we work on the assumption that most forms of tampering will disturb some statistical or geometric property of an image. To the extent that these perturbations can be quantified and detected, they can be used to invalidate a photo. I will provide an overview of the field of photo forensics and describe several case studies of the application of these techniques.
Bio: Hany Farid received his undergraduate degree in Computer Science and Applied Mathematics from the University of Rochester in 1989. He received his Ph.D. in Computer Science from the University of Pennsylvania in 1997. Following a two year post-doctoral position in Brain and Cognitive Sciences at MIT, he joined the faculty at Dartmouth in 1999, where he is currently a Professor of Computer Science. Hany is the recipient of an NSF CAREER award, a Sloan Fellowship and a Guggenheim Fellowship.
The speaker series is free and open to the public.
Professor, School of Interactive Computing, University of Maryland
Efficiently Discovering High-Coverage Configurations Using Interaction Trees
April 22, 2013 | Trabant Theater | 11:00AM
Abstract: Modern software systems are increasingly configurable. While this has many benefits, it also makes some software engineering tasks, such as software testing, much harder. This is because, in theory, unique errors could be hiding in any configuration, and, therefore, every configuration may need to undergo expensive testing. As this is generally infeasible, developers need cost-effective techniques for selecting which specific configurations they will test. One popular selection approach is combinatorial interaction testing (CIT), where the developer selects a strength t and then computes a covering array (a set of configurations) in which all t-way combinations of configuration option settings appear at least once.
In prior work, we demonstrated several limitations of the CIT approach. In particular, we found that a given system’s effective configuration space (the minimal set of configurations needed to achieve a specific goal) may comprise only a tiny subset of the system’s full configuration space. We also found that this effective configuration space may not be well approximated by t-way covering arrays. Based on these insights we have developed an algorithm called interaction tree discovery (iTree).
iTree is an iterative learning algorithm that efficiently searches for a small set of configurations that closely approximates a system’s effective configuration space. On each iteration iTree tests the system on a small sample of carefully chosen configurations, monitors the system’s behaviors, and then applies machine learning techniques to discover which combinations of option settings are potentially responsible for any newly observed behaviors. This information is used in the next iteration to pick a new sample of configurations that are likely to reveal further new behaviors.
We have evaluated the iTree algorithm by comparing the coverage it achieves versus that of covering arrays and randomly generated configuration sets. We have also evaluated its scalability by using it to test MySQL, a 1M+ LOC database system. Our results strongly suggest that the iTree algorithm is highly scalable and can identify a high-coverage test set of configurations more effectively than existing methods.
(Joint work with Charles Song and Jeffrey S. Foster)
Bio: Adam Porter is a professor in the Department of Computer Science, University of Maryland and at the University of Maryland Institute for Advanced Computing Studies (UMIACS). He specializes in software development with interests in programmer productivity, quality assurance, and mobile application design (particularly mobile learning tools). He has been honored as Mobile Learning Fellow, 2010, UMD OIT/Center for Teaching Excellence and is an IEEE and ACM Senior Member. His PhD is from the University of California, Irvine, 1991.
Professor, Department of Computer Science and Lewis-Sigler Institute for Integrative Genomics, Princeton University
“Data-driven approaches for uncovering and understanding biological networks”
April 23, 2013 | 130 Sharp Laboratory | 2:00PM
2012-13 Distinguished Speaker Series
Principal Engineer and Manager, Qualcomm Research
“System Software for Cloud Computing”
February 27, 2013 | Trabant Theater | 11:00AM
Abstract: Cloud computing went from “the IT fad of the moment” to being recognized as a revolutionary and effective approach to deliver computing services. In this talk we analyze cloud computing from the perspective of system software, exploring how this new model impacts current practices in operating systems and distributed computing. We identify a set of exciting research opportunities in resource management for cloud computing and discuss how cloud computing itself affects the way we carry out research projects.
Bio: Dilma da Silva is a Principal Engineer and Manager at Qualcomm Research in Santa Clara, California. Her prior work experience includes IBM T. J. Watson Research Center, in New York (2000-2012, where she managed the Advanced Operating Systems group and was Principal Investigator in the Exascale Collaboratory at IBM Dublin Research Lab) and University of Sao Paulo in Brazil (1996-2000 as an Assistant Professor). Her research activities have been around scalable and adaptive system software, more recently focusing on cloud computing. She received her Ph.D. in Computer Science from Georgia Tech in 1997. She has published more than 70 technical papers. Dilma is an ACM Distinguished Scientist, an ACM Distinguished Speaker, a member of the board of CRA-W (Computing Research Association’s Committee on the Status of Women in Computing Research), of the CDC (Coalition for Diversifying Computing) board, a co-founder of the Latinas in Computing group, and treasurer/secretary for ACM SIGOPS.
Professor, Department of Computer Science, Princeton University
“Frenetic: A Programming Language for Software Defined Networks”
March 14, 2013 | Trabant Theater | 11:00AM
Abstract: Today’s network administrators must manage their networks through closed and proprietary interfaces to heterogeneous devices, such as routers, switches, firewalls, and load balancers. During the past several years, the emergence of Software Defined Networking (SDN) has led to cleaner interfaces between network devices and the software that controls them. In particular, many commercial switches support the OpenFlow protocol, and a number of campus, data-center, and backbone networks have deployed the new technology. Yet, while SDN makes it possible to program the network, it does not make it easy.
In our Frenetic project, we are raising the level of abstraction for programming OpenFlow networks. Frenetic supports seamless composition of multiple tasks, such as routing, access control, and traffic monitoring, by automatically generating switch-level rules that enforce all of the policies. A simple “see every packet” abstraction shields programmers from reasoning about asynchronous events and complex timing issues in the underlying network, while the run-time system ensures data packets stay in the “fast path” through the underlying switches. Frenetic also transitions the switches from one network-wide policy to another, while ensuring consistent handling of all traffic during the change. These abstractions enable programmers to write OpenFlow programs that are short, simple, and efficient.
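The idea of composing independent tasks into one policy can be illustrated with a toy model. The sketch below is plain Python and is not Frenetic's syntax or API: the names `route`, `monitor`, `parallel`, and `restrict`, the action strings, and the addresses are all hypothetical. It models a policy as a function from a packet to a set of actions and does not attempt to generate switch-level rules.

```python
from typing import Callable, Dict, FrozenSet

Packet = Dict[str, object]
Policy = Callable[[Packet], FrozenSet[str]]


def route(pkt: Packet) -> FrozenSet[str]:
    """Routing task: choose an output port from the destination address."""
    port = 2 if pkt["dst"] == "10.0.0.2" else 1
    return frozenset({f"fwd:{port}"})


def monitor(pkt: Packet) -> FrozenSet[str]:
    """Monitoring task: count web traffic, ignore everything else."""
    return frozenset({"count:web"}) if pkt["dport"] == 80 else frozenset()


def parallel(*policies: Policy) -> Policy:
    """Compose policies side by side: the packet receives the union of their actions."""
    def combined(pkt: Packet) -> FrozenSet[str]:
        actions: FrozenSet[str] = frozenset()
        for policy in policies:
            actions |= policy(pkt)
        return actions
    return combined


def restrict(allowed: Callable[[Packet], bool], policy: Policy) -> Policy:
    """Access control: apply the policy only to packets the predicate allows."""
    return lambda pkt: policy(pkt) if allowed(pkt) else frozenset()


# Each task is written on its own, then combined into one network-wide policy.
network_policy = restrict(
    lambda pkt: pkt["src"] != "10.0.0.66",  # hypothetical blocked host
    parallel(route, monitor),
)
```

The point of the sketch is only that routing, monitoring, and access control are written separately and combined by small operators, rather than hand-merged into a single rule table.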
Frenetic (http://www.frenetic-lang.org) is a joint project with Nate Foster (Cornell), Dave Walker (Princeton), Michael Freedman (Princeton), Rob Harrison (US Military Academy), Chris Monsanto (Princeton), Josh Reich (Princeton), Mark Reitblatt (Cornell), Cole Schlesinger (Princeton), and Alec Story (Cornell).
Annie I. Anton
Professor and Department Chair, School of Interactive Computing, Georgia Institute of Technology
“Designing Software Systems that Comply with Privacy and Security Regulations”
April 22, 2013 | Trabant Theater | 11:00AM
Abstract: Properly protecting information is in all our best interests, but it is a complex undertaking. The fact that regulation is often written by non-technologists introduces additional challenges and obstacles. Moreover, those who design systems that collect, store, and maintain sensitive information have an obligation to design systems holistically within this broader context of regulatory and legal compliance.
There are questions that should be asked when developing new requirements for information systems. For example, how do we build systems to handle data that must be kept secure and private when relevant regulations tie your hands? When building a system that maintains health or financial records for a large number of people, what do we need to do to protect the information against theft and abuse, keep the information private, AND at the same time, satisfy all governing privacy/security laws and restrictions? Moreover, how do we know that we’ve satisfied those laws? How do we monitor for compliance while ensuring that we’re monitoring the right things? And, how do you accomplish all this in a way that can be expressed clearly to end-users and legislators (or auditors) so they can be confident you are doing the right things?
We’ve been working on technologies to make these tasks simpler, and in some senses, automatic. In this talk, I will describe some of the research that we have been conducting to address these problems.
Bio: Dr. Annie I. Anton is a Professor in and Chair of the School of Interactive Computing at the Georgia Institute of Technology in Atlanta. She has served the national defense and intelligence communities in a number of roles since being selected for the IDA/DARPA Defense Science Study Group in 2005-2006. Her current research focuses on the specification of complete, correct behavior of software systems that must comply with federal privacy and security regulations. She is founder and director of ThePrivacyPlace.org. Anton currently serves on various boards, including: the U.S. DHS Data Privacy and Integrity Advisory Committee, an Intel Corporation Advisory Board, and the Future of Privacy Forum Advisory Board. She is a former member of the CRA Board of Directors, the NSF Computer & Information Science & Engineering Directorate Advisory Council, the Distinguished External Advisory Board for the TRUST Research Center at U.C. Berkeley, the DARPA ISAT Study Group, the USACM Public Council, the Advisory Board for the Electronic Privacy Information Center in Washington, DC, the Georgia Tech Alumni Association Board of Trustees, the Microsoft Research University Relations Faculty Advisory Board, the CRA-W, and the Georgia Tech Advisory Board (GTAB). Prior to joining the faculty at Georgia Tech, she was a Professor of Computer Science in the College of Engineering at the North Carolina State University. Anton is a three-time graduate of the College of Computing at the Georgia Institute of Technology, receiving a Ph.D. in 1997 with a minor in Management & Public Policy, an M.S. in 1992, and a B.S. in 1990 with a minor in Technical and Business Communication.