Jillian Wallis is a PhD student in the Graduate School of Education & Information Studies at UCLA. Her research addresses the data practices of researchers at the Center for Embedded Network Sensing and the development of systems for the effective distribution and use of sensor data. She will be teaching a class for Library Juice Academy next month, titled “Data Management.” Jillian agreed to do an interview here, to give people a better idea of what the course is about and what it might do for them, as well as a bit more about herself.
Jillian, thanks for agreeing to do this interview. I’d like to start by asking, why a data management course for librarians?
There is ever-increasing policy pressure on researchers who produce data to make them available for reuse. In order to be reusable, data need to be findable, intact, and interpretable. Mainly these come down to being well preserved with thoroughly applied metadata, two things with which librarians have quite a bit of experience! Librarians may not have all the subject knowledge to assist researchers in applying necessary metadata, but they are in a good position to find applicable metadata schemas, or data repositories which will keep data safe. Assisting with data management and even the writing of data management plans are new ways that librarians can help their patrons. Librarians all over the world are working on open access policies, data management training modules, data management planning tools, as well as data citation and identification standards. We definitely have something to bring to the table!
So how will your course prepare people to work as data managers in their academic libraries? What does the class cover?
Research data are a different animal from most library materials, and they need to be handled differently as a result. The course is designed to introduce librarians to data and the complexities of managing, sharing, and reusing these materials. We will focus on social and technical aspects of data management, from the incentives and disincentives researchers face to the standards for data identification. Students will get a chance to draft a data management plan like those submitted by a researcher to their funder and think critically about their own institution’s data management policy. Knowing what the researchers are going through will make librarians sympathetic to the researchers’ needs and allow them to find new opportunities to assist in the data management process.
To what extent are librarians called on presently to handle a data management role? Is it something that many librarians are called on to do, or is it more of an opportunity for potentially expanding the role of the library on campus?
For the most part, institutions are retooling the jobs of science librarians and other subject specialists to assist in this area. But a few institutions are making new positions to handle data management help requests. Just today a position at Cornell, Research Data & Environmental Sciences Librarian, was posted, and the availability of positions like this is likely to increase with demand.
Is there a tie-in between this kind of data management role and things like “big data” and data-mining? Can librarians play a role in that field?
The ability to bring together vast arrays of data to answer new questions is one of the reasons why policies at so many different levels are calling for the management of research data. Not every dataset is going to be useful for data-driven science, but getting as much out there with good quality metadata, persistent identifiers, and whatnot will add to the pool of possibly usable data. Other reasons are scientific reproducibility, building comparable datasets that are broad through time as well as space, and getting the most out of our publicly-funded research.
How much of a stretch is it for a typical academic librarian to turn into a competent data manager? Will this class be enough to put a librarian in that position, or is it more about pointing them toward further training and research?
Data management, or at least the long-term preservation part of it, will happen at institutional and domain repositories (like GenBank or Dryad). Librarians will not need to be able to perform that level of data management in order to make a difference in their community. Upon completion of this course, an academic librarian could assist patrons with data management plans, provide them with a variety of long-term options, and perform educational outreach to the data producers in their community. This course will function as a first step for those who eventually want to end up at an institution that is really pioneering data management services, like the UC System’s California Digital Library or the Johns Hopkins Data Management Services group.
Thanks, that gives a clear idea of what the course is about and what it will prepare participants to do. I’d like to close by shifting gears and asking a fun question. If you could teach any other class for Library Juice Academy, what would it be?
I developed a workshop for the Library of Congress’s continuing education program a few years back on controlled vocabulary design, which was always really fun to teach. The whole thing was very hands-on, with a ton of group work and presentation. And by developing their own controlled vocabularies, students came out of the experience empowered to think critically about the controlled vocabularies they worked with. I still haven’t quite figured out how to translate this to an online teaching environment, but there is hope!
Well, that sounds very interesting. We can talk about possibly doing that down the road.
Thanks for the interview. I’m glad for the opportunity to let people get to know a bit more about your class.
Thanks for the opportunity and I am looking forward to the course!