When: Thursday, February 11 from noon - 1pm
Where: 200C Warren Hall, 2195 Hearst St (see building access instructions on parent page).
Event format: The reading group is a brown bag lunch (bring your own) with a short <20 min talk followed by ~40 min group discussion.
Presenter: John Lowe, Research IT
Facilitator: Chris Hoffman, Research IT
Museum data is sensitive in many of the same ways that financial and medical data is sensitive. When this data is stored in digital form, it is vulnerable in the same ways and may require the same safeguards. The value in making museum info available (either without restriction, i.e. publicly, or with restrictions, i.e. to researchers or via licensing) must be balanced against the interests and obligations of wider communities and the caretaking institutions. This presentation lays out a "taxonomy" of sensitive data types, with concrete examples mostly from UCB institutions (museums) using the CollectionSpace collection management system.
The types presented include:
Please review the following prior to our 2/11 meeting:
Presenting: John Lowe, Research IT
David Baxter, University & Jepson Herbaria
Michael Campos-Quinn, PFA
Steven Carrier, School of Education
Jason Christopher, Research IT
Dav Clark, BIDS/D-Lab
Aaron Culich, Research IT
Celia Emmelhainz, Anthropology Library
Barbara Gilson, IST-Admin Systems
David Greenbaum, Research IT
Jordan Jacobs, PAHMA
Rick Jaffe, Research IT
Steve Masover, Research IT
Scott Peterson, Doe Library
Aron Roberts, Research IT
Kelly Rowland, Nuclear Engineering / Research IT
Patrick Schmitz, Research IT
Nico Tripcevich, Archaeology
John White, LBNL-HPCS / Research IT
Jamie Wittenberg, Research IT
[opened with Smithsonian Museum video on the "ruby slippers" worn by Judy Garland in the movie "Wizard of Oz" – slides to be added soon]
Informal / anecdotal discussion about loosely organized categories of museum data that require security. "Admiring the problem" rather than proposing solutions or discussing procedures.
In general, the question of making museum metadata public is a new problem, in its infancy ... this is something museums are only now beginning to do.
Qualities of data that can render it sensitive and appropriate to secure from public view: financial, intellectual property, physical location, provenance (in the sense of field collection place, as in archaeological site, plant collection site), cultural information, mistakes / provisional information
CollectionSpace: ETL makes two [partial] copies of the transactional database for search/display: (a) one for a museum's public interface (portal); and (b) one for users who are authenticated and granted permission to view sensitive data in addition to public data.
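As a rough illustration of the two-copy idea, the sketch below filters one set of records into a public copy and a restricted copy. The field names and the `SENSITIVE_FIELDS` set are hypothetical, chosen only for this example; they are not CollectionSpace's actual schema, and this is not the ETL code discussed in the session.

```python
# Hypothetical field names; a real collection schema would differ.
SENSITIVE_FIELDS = {"valuation", "storage_location", "field_collection_site"}


def make_public_copy(records):
    """Return copies of the records with sensitive fields removed."""
    return [
        {key: value for key, value in record.items() if key not in SENSITIVE_FIELDS}
        for record in records
    ]


def make_restricted_copy(records):
    """Return full copies; access control is assumed to happen in the portal."""
    return [dict(record) for record in records]


records = [
    {
        "object_id": "obj-1",
        "title": "Basket",
        "valuation": 1200,
        "storage_location": "Shelf 4B",
    },
]

public_copy = make_public_copy(records)
restricted_copy = make_restricted_copy(records)
```

An allow-list of public fields, rather than a deny-list of sensitive ones, would fail closed when new fields are added; the deny-list is used here only to keep the sketch short.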
IP issue: artist or someone else may retain copyright on digital reproductions of a work that is owned by a museum. "Fair use" of such images constrained by size/resolution.
Emeryville Shell Mound example vis-a-vis geolocation obfuscation: if an algorithm obfuscates the location of every object independently, the result is a cloud of locations clustered around the real one (thus revealing it). Lesson: one must obfuscate the site once and apply that same obfuscated location to all related objects, not "jitter" the location of each object individually.
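The difference between the two approaches can be sketched in a few lines. This is a minimal illustration under stated assumptions: the site coordinates are made up (not the real shell mound), the uniform jitter is a stand-in for whatever algorithm a museum might use, and all function names are hypothetical.

```python
import random


def jitter(lat, lon, max_offset_deg, rng):
    """Shift a point by a random offset of at most max_offset_deg per axis."""
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )


def obfuscate_per_object(objects, sites, max_offset_deg, rng):
    """Anti-pattern: every object gets its own independent offset."""
    return {
        obj_id: jitter(*sites[site_id], max_offset_deg, rng)
        for obj_id, site_id in objects.items()
    }


def obfuscate_per_site(objects, sites, max_offset_deg, rng):
    """Shift each site once, then give every object its site's shifted point."""
    shifted = {
        site_id: jitter(lat, lon, max_offset_deg, rng)
        for site_id, (lat, lon) in sites.items()
    }
    return {obj_id: shifted[site_id] for obj_id, site_id in objects.items()}


def centroid(points):
    """Average a collection of (lat, lon) points."""
    points = list(points)
    return (
        sum(p[0] for p in points) / len(points),
        sum(p[1] for p in points) / len(points),
    )


# Made-up site and objects, for illustration only.
sites = {"site-A": (10.0, 20.0)}
objects = {f"obj-{i}": "site-A" for i in range(1000)}
rng = random.Random(0)

per_object = obfuscate_per_object(objects, sites, 0.1, rng)
per_site = obfuscate_per_site(objects, sites, 0.1, rng)

leaked = centroid(per_object.values())      # averages back toward the true site
published = set(per_site.values())          # a single shifted point
```

With per-object jitter, averaging the published points recovers something very close to the true site; with per-site obfuscation, all objects share one shifted point and averaging reveals nothing further.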
Rick: not just rules encoded in system, but people/behavior based rules as well -- do you grant access to volunteers? do you log into your CollectionSpace account when at a cafe? how do you train staff to implement rules properly?
David: Have there been breaches of museum collections (related to online information)? And if so, would a breach require notification of people whose information was being protected?
John Lowe: No breaches at Berkeley, AFAIK. Presumably a breach would be handled like any other sensitive-data incident the University has to deal with.
David: How does CollectionSpace facilitate implementation of the data security decisions museums must make?
John Lowe: Original scope presumed all data in the collection was secured. Within that context, roles and permissions could be assigned. Public access issues handled via ETL and on case by case basis at this time.
Jamie: Geospatial obfuscation. Are there international standards for how to obfuscate? How does one handle cases where 'insufficient' obfuscation does not obfuscate at all?
John Lowe: No international standards of which I'm aware.
David Baxter: Issue, e.g., if a geographic area has only one pond where a water lily might grow.
Nico: Can one see, for example, if the museum collection holds a lot of material from a geographical area, e.g., Marin Cty?
John Lowe: Yes. Example shown: PAHMA objects from Peru. Some records expose location only at the national level; others expose regional location or even lat/long.
Nico: OpenContext.org example. 20 km (minimum) grid display, color map.
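One way to picture grid-based display is to snap each coordinate to the center of a fixed-size cell, so that only the cell is revealed. The sketch below is an assumption-laden illustration, not OpenContext's actual method: it uses a rough constant of about 111.32 km per degree of latitude, sizes longitude steps by the cosine of the cell's latitude, and will misbehave near the poles. The function name and the sample coordinates are hypothetical.

```python
import math

KM_PER_DEG_LAT = 111.32  # approximate; treated as constant for this sketch


def snap_to_grid(lat, lon, cell_km=20.0):
    """Return the center of the roughly cell_km-sized cell containing a point."""
    lat_step = cell_km / KM_PER_DEG_LAT
    lat_center = (math.floor(lat / lat_step) + 0.5) * lat_step
    # A degree of longitude shrinks with latitude, so size the step
    # using the latitude of the cell center.
    lon_step = cell_km / (KM_PER_DEG_LAT * math.cos(math.radians(lat_center)))
    lon_center = (math.floor(lon / lon_step) + 0.5) * lon_step
    return lat_center, lon_center


# Two made-up nearby points fall in the same cell and become indistinguishable.
cell_a = snap_to_grid(-13.50, -72.00)
cell_b = snap_to_grid(-13.51, -72.01)
```

Because the cell is a deterministic function of the true coordinates, repeated publication does not leak more than the cell itself; the cell size sets the minimum level of obfuscation.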
Michael C-Q: APIs to CSpace exposed?
John Lowe: Yes, e.g., HackTheHearst. Only that one has been published to date, but others are likely to be published over time.
Nico: Delphi. Function to build up collection and share it. Analogy in CSpace?
John Lowe: Not at this time.
Patrick: We had some nice stories about people using Delphi function to build up a collection of items they wanted to view on a museum visit.
John White: Does obfuscation algorithm vary over time (and does that weaken obfuscation)?
John Lowe: No, the same obfuscation is applied via ETL each night.
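A common way to keep an obfuscation stable across repeated runs is to derive the offset deterministically from the site identifier and a secret key, rather than drawing a fresh random offset each time. The sketch below shows that general technique; it is an assumption about how stability could be achieved, not a description of the ETL discussed here, and the key, site IDs, and function name are hypothetical.

```python
import hashlib
import hmac


def stable_offset(site_id, secret_key, max_offset_deg):
    """Derive a repeatable (lat, lon) offset for a site from a keyed hash."""
    digest = hmac.new(secret_key, site_id.encode("utf-8"), hashlib.sha256).digest()
    # Turn two 4-byte chunks into fractions in [0, 1), then scale each
    # to the range [-max_offset_deg, +max_offset_deg).
    frac_lat = int.from_bytes(digest[0:4], "big") / 2**32
    frac_lon = int.from_bytes(digest[4:8], "big") / 2**32
    return (
        (frac_lat * 2 - 1) * max_offset_deg,
        (frac_lon * 2 - 1) * max_offset_deg,
    )


# Hypothetical key and site IDs, for illustration only.
key = b"example-secret-key"
first_run = stable_offset("site-A", key, 0.1)
second_run = stable_offset("site-A", key, 0.1)
other_site = stable_offset("site-B", key, 0.1)
```

If the offset changed on every run, someone collecting the published locations over many nights could average them and approach the true location; a stable offset avoids that. Note that a derived offset can happen to be close to zero, so a real scheme would likely also enforce a minimum displacement.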
David Baxter: Accession numbers might reveal to a sharp observer information that is meant to be obscured: item 101 is at this location on a trail, 102 is secure, 103 is 0.5 miles further along the trail. Inferences can be drawn.