Most information is now published in complex, structured, evolving datasets or databases. As a result, there is increasing demand that this digital information be treated in the same way as conventional publications and cited appropriately. While principles and standards have been developed for data citation, they are unlikely to be used unless we can couple the process of extracting information with that of providing a citation for it. I will discuss the problem of automatically generating citations for data in a database given how the data was obtained (the query) as well as the content (the data), and show how the problem of generating a citation is related to two well-studied problems in databases: query rewriting using views and provenance.
Speaker Biography
Susan B. Davidson received the B.A. degree in Mathematics from Cornell University in 1978, and the M.A. and Ph.D. degrees in Electrical Engineering and Computer Science from Princeton University in 1980 and 1982, respectively. Dr. Davidson is the Weiss Professor of Computer and Information Science (CIS) at the University of Pennsylvania, where she has been since 1982, and currently serves as Chair of the Board of the Computing Research Association.
Dr. Davidson’s research interests include database and web-based systems, scientific data management, provenance, crowdsourcing, and data citation.
Dr. Davidson was the founding co-director of the Penn Center for Bioinformatics from 1997 to 2003, and the founding co-director of the Greater Philadelphia Bioinformatics Alliance. She served as Deputy Dean of the School of Engineering and Applied Science from 2005 to 2007 and as Chair of CIS from 2008 to 2013. She is an ACM Fellow and a Corresponding Fellow of the Royal Society of Edinburgh, and received a Fulbright Scholarship and Hitachi Chair in 2004. Her awards include the 2017 IEEE TCDE Impact Award for “expanding the reach of data engineering within scientific disciplines”, and the 2015 Trustees’ Council of Penn Women/Provost Award for her work on advancing women in engineering.