Please join us for the next Research IT Reading Group, on Thursday 11 August, as we discuss the Data Acquisition and Access Program (DAAP), a joint effort by the Library and the D-Lab.
When: Thursday, August 11 from 12 - 1pm
The Data Acquisition and Access Program (DAAP) pilot project, completed in the Spring 2016 semester, sought to create a data acquisition and access process via a collaborative partnership between the library and the D-Lab. The program was inspired by requests from across the University for access to unique restricted data, and was modeled after similar efforts at other universities. This partnership is grounded in a division of responsibility: the library is responsible for marketing the program, licensing and acquiring data, and supporting discovery of datasets, while the D-Lab is responsible for storage and access auditing (e.g. user agreements). Please join the reading group to hear more about these types of programs at other institutions and discuss the potential role of this program at UCB.
Prior to the meeting, please review the following:
Presenting: Erik Mitchell, Associate University Librarian for Digital Initiatives and Collaborative Services
Jean McKenzie, Acting Associate University Librarian for Collections
Facilitating: Chris Hoffman, Research IT
Aaron Culich, Research IT
Alex Ivanoff, BIDS undergrad
Aron Roberts, Research IT
Barbara Gilson, SAIT
Camille Crittenden, CITRIS
Claudia Von Vacano, D-Lab and Digital Humanities
Elliot Smith, Biosciences and Chemistry Library
Jamie Wittenberg, Library and Research IT
Jason Christopher, Research IT
Jim Church, Library
Mark Hemhauser, Library
Maurice Manning, Research IT
Nico Tripcevich, Archaeology
Patrick Schmitz, Research IT
Ron Sprouse, Linguistics
Scott Peterson, Doe Library
Steve Masover, Research IT
Steven Carrier, School of Education
[see slides (PDF)]
MARC is not a good standard for describing data. Dataverse represents data very well, but is heavyweight in terms of record creation. So the data sets are cataloged in OSKICAT but are easier to find through D-Lab.
Pilot project has five data sets, ranging in licensing cost from approximately $1000 to $8000. Guideline of $5000 as a top cost; not a hard cap, but Library might ask for co-investment for data sets on the expensive end of the spectrum -- e.g., where a grant or a department pays for some part of the licensing.
Idea (PLS): offer researchers in-kind support for grants that request funding for a data set that otherwise meets the Library's acquisition criteria.
Jean: Postcards are on order describing this program to incoming students and faculty as the academic year kicks off. Not a lot of info, but a URL for next steps in making a request. Requests can be made anytime and are reviewed once per quarter (e.g., Aug 1, Nov 1).
Claudia: Would appreciate a one-pager that describes/clarifies how the program works.
Example: a data request that the instructor of a course says will be folded into a curriculum. Potential to be used by / benefit hundreds of students.
Future work: exploring discovery platforms we might adopt. Dataverse is one possibility but, again, perhaps too elaborate; DASH is for open data. Library does want to make these data sets more discoverable than they are through OSKICAT records.
Dataverse might be appropriate if there were interest in cataloging existing data sets that can be made available more widely than to the single researcher or group that acquired or generated them.
Elliot: Accelerated review possible when deadlines are faced (in instructional or grant-writing process)? (Erik and Jean: good, interesting suggestion)
Claudia: To Elliot's point -- yes; but also defined periods of review are helpful. Possible partnership with DH program, encouraging humanists to acquire data sets.
David: How will researchers on campus find data sets available on campus?
o this is the big question
o D-Lab is working to improve discovery of its data sets, but those aren't all the sets
o Alex described a proposal to leave it to domains, and appropriate subject librarians, rather than an all-encompassing catalog. Let subject librarians be the links to help campus people navigate the likely data sets and/or likely
o Campus people who find data sets on Google will discover an ability to purchase them before they discover that the campus has already purchased them.
Aaron: Data in the wild not often ready for use in a research context. There's a curation need here. It would be good to engage with the library to transform data in ways that are necessary as a precursor to research.
David: What if I am a researcher who wants to publish my data, to comply with regulatory requirements or otherwise? What services does campus have to help with that?
o Camille: CITRIS generates an enormous amount of data...
o Jamie: depends on who you are, why you want to publish, whether there's a domain-centric site to publish, etc.
Patrick: Licensing beyond one campus? Systemwide opportunities?
Jean: No conceptual reason why not, but haven't gotten to that use-case yet.
Patrick: What about licenses that require a subscription, annual payments or similar?
Jean: Some happen like that. DAAP guidelines are to pay only the first year in these cases.
Jim: [gave an overview of the types of licenses and licensors in play]
Chris: Will already-acquired data sets be moved into the platform(s) that support DAAP?
Mark: No single answer, depends on licensor / terms.
Jamie: Are we assuming that all data requested will be digital [network-transmitted]? What if it comes in on tapes, or CD?
Mark: Haven't dealt with a need to transform to different format
Erik: If we receive data on physical media we have to pay sales tax on it...
Erik: Welcome additional questions, ideas, requests!
Alex: offering "modules" -- data sets and a weeklong orientation to them, to which any faculty member on campus can bring students.