This wiki space contains archival documentation of Project Bamboo, April 2008 - March 2013.


Store, Archive and Preserve


This theme covers a spectrum of scholarly activities undertaken against research materials: simple digital or analog storage, use of repositories and other library or museum services, aspects of "data" curation, and longer-term archiving and preservation of objects of scholarly interest.  Many other activities assume the presence of a storage capability, so that Gathered materials persist for the duration of the research project and possibly beyond.  In fact, some scholarly activity has as its end-goal the creation of a curated store of materials that can then be Shared or Presented.  Storage is therefore fundamental to the use of research materials.

Some storage is temporary, such as one's Zotero collection.  The drive towards digitizing materials and creating new "born-digital" works provides new challenges for Archiving.  Archiving brings the long-term preservation aspect that Store may or may not include; see the OAIS Reference Model for details.  Preserving objects of cultural interest to arts and humanities scholars has been a central activity of libraries, museums, and collectors for centuries.  Capability to preserve cultural objects in digital formats -- addressing storage capacity; accessibility; and frequent churn in digital formats, media, and tools that turn bits into humanly-recognizable artifacts -- is a core requirement of digital scholarship.

This theme is a merger of Store, Archiving, and Preserve objects of scholarly interest.




Proposed/originated by:

Jim Muehlenberg ("Store")

Univ. of Wisconsin, Madison

Proposed/originated by:

Quinn Dombrowski ("Archiving")

University of Chicago

Proposed/originated by:

Steve Masover ("Preserve objects of scholarly interest")

UC Berkeley

Current facilitator(s)



Back to Identify Themes page...

What tools, standards, organizations, or efforts exist in this area of scholarly practice?


Description - what is it?

URL or other reference


"DSpace captures your data in any format - in text, video, audio, and data. It distributes it over the web. It indexes your work, so users can search and retrieve your items. It preserves your digital work over the long term." (quoted by Steve Masover)

Fedora Commons

"Fedora Commons provides sustainable technologies to create, manage, publish, share and preserve digital content as a basis for intellectual, organizational, scientific and cultural heritage [...]" (quoted by Steve Masover)


"Nuxeo 5 is a robust, extensible, global, standards-based Enterprise Content Management (ECM) solution available as Open Source Software (OSS). Nuxeo 5 is based on other open source software, notably from the Apache Foundation, the JBoss Group (a division of Red Hat) and the Eclipse Foundation, and developed in a truly open manner." (quoted by Steve Masover on referral from Patrick Schmitz)


An OASIS Charter Proposal for Content Management Interoperability Services (CMIS), submitted on 10 Sep 2008, is intended to allow content management systems from different vendors to interact, providing greater flexibility for enterprise customers. Proposers included folks from EMC, IBM, Oracle, OpenText, Alfresco, and Microsoft. In the current scope it is not intended to cover Digital Asset Management use cases, but is probably worth paying attention to. (posted by Steve Masover on referral from Patrick Schmitz)

CMIS Charter Proposal


"In consultation with large data producers and managers, DANS laid down what those requirements need to be in its Datakeurmerk (Data Seal of Approval)" (Steve Masover, from referral by Chad Kainz)

datakeurmerk (


Preservation towards storage and access. Standardised Practices for Audiovisual Contents Archiving in Europe. / The objective of the project is to provide technical devices and systems for digital preservation of all types of audio-visual collections.

What tools, standards, organizations, or efforts are missing from this area of scholarly practice?


Description - what is it?

URL or other reference

sound_byte_name_or_description (your_name)

summary_description (your_name)

What part of this area of scholarly practice is within Project Bamboo scope, and why?


Description - what is it?

Why is it in scope?

sound_byte_name_or_description (your_name)

summary_description (your_name)

explanation_of_why_in_scope (your_name)

What part of this area of scholarly practice is outside Project Bamboo scope, and why?


Description - what is it?

Why is it out of scope?

Building repository infrastructure (Steve Masover)

Building repository systems that perform the basic functions of storing digital objects.

This is a well-populated area of work (cf. DSpace and Fedora Commons) that Bamboo doesn't need to replicate.


References (e.g., material from Workshop 1 notes or flipcharts)


  • "scholarship in hum. is seen as cumulative.  tool building and use which is not cumulative is problematic" (ex. 3 scribe notes, 1d-F)
  • "publishing was in two forms, books and articles.  now there's creation of databases.  story of a colleague with db of 5000 women writers in china, with references to the writings.  new librarian decides 'we don't want to support this'.  what's the clearinghouse to support dissemination if a project loses support?  we need to solve the dissemination problem." (ex. 3 scribe notes, 1d-F)
  • "Finding a secure, persistent place for storing resources [...] Should be like Library of Congress" (ex. 2 scribe notes, 1a-B)
  • "How can PB manage/refere[e] ownership/responsibility issues for the sustainability of the enterprise?" (ex 1 scribe notes, 1b-B)
  • "we would like to hang on to everything for posterity, but that sticks posterity with the problem of picking out what's valuable ... letting a thousand flowers bloom vs. existence of evaluative criteria" (ex. 5 scribe notes, 1b-B)

Steve Masover

Cathy Marshall's paper From Writing and Analysis to the Repository: Taking the Scholars' Perspective on Scholarly Archiving "focuses on the kinds of artifacts the researchers create in the process of writing a paper, how they exchange and store materials over the short term, how they handle references and bibliographic resources, and the strategies they use to guarantee the long term safety of their scholarly materials. The findings reveal: (1) the adoption of a new CIM infrastructure relies crucially on whether it compares favorably to email along six critical dimensions; (2) personal scholarly archives should be maintained as a side-effect of collaboration and the role of ancillary material such as datasets remains to be worked out; and (3) it is vital to consider agency when we talk about depositing new types of scholarly materials into disciplinary repositories."

Steve Masover

Steve Masover

  • Finding a secure, persistent place for storing resources.  Should be like Library of Congress Minerva project. Academic presses are thinking about this as well.  (Ex. 2 scribe notes, 1a-B)
  • Archives of conference papers, electronic working papers like the sciences do. Registry of research activities as part of this social networking activity -- pre-print, awareness of area of work and particular argument, approach, methodology.  A 10-page searchable "visiting card". ... Store and retrieve personal research, being able to find what you've already digested somewhat and make sense of it. ...  Store recordings into repository. ...  Collating previous work.  Has both a mechanical and archival quality as well as a summary and reflective side. ... Updating current dossier of work. (Ex. 2 scribe notes, 1a-C)
  • B9: Submit what one has done to an archive so others can see it. Not just in a large, structured appendix.  One's university library might not take these extra materials, depending on policy. But it's hard to put one's research materials in coherent order for presentation to a library ... there's no credit in doing this work ... the library might not know how to take care of it ... what's a suitable archive in which materials can reside?  (Ex. 2 scribe notes, 1b-C)
  • E1:  most of what we do are e-repositories. ... E2:  Informed ingestion of primary materials into a repository - allowing faculty to produce e.g. scholarly editions w/o the need to learn e.g. TEI. ... E2:  a repository of "dead ends".  journal of digital disasters. (Unsworth).  (Ex. 2 scribe notes, 1b-F)
  • 10 - the mind is the data store.  (Ex. 2 scribe notes, 1d-E)
  • Hold research in live repository.  (Ex. 2-3 flipcharts, 1c-E)
  • Digital curation.  (Ex. 2-3 flipcharts, 1d-C)
  • Exit strategy. Archiving/migratable - ex. TEI - designed for preservation. ...  Ex. Data Curation at Hopkins with astrophysicists.  Bamboo - could organize data curation communities.  (Ex. 2-3 flipcharts, 1d-K)
  • People used to have mentality of as things came across their desk, they'd decide keep/not keep. Now people want to keep everything so they can later do whatever.  Need metadata, huge storage department - people want huge e-mail quotas.  (Ex. 6a scribe notes, 1c-C)

Jim Muehlenberg

  • conference in may on supplementary materials for journal articles, talk of trying to get data sets out of cancer researchers, faced resistance about turning over data sets.  journals should require turnover of datasets for publication.  code of best practices proposed.  as editor, do i have to edit datasets now? (Ex. 6b scribe notes, Ex 1d-D)
  • publishing was in two forms, books and articles.  now there's creation of databases.  story of a colleague with db of 5000 women writers in china, with references to the writings.  new librarian decides "we don't want to support this".  what's the clearinghouse to support dissemination if a project loses support?  we need to solve the dissemination problem. (Ex. 3 scribe notes, 1d-F)

Steve Masover

  •  Very clear that once content goes digital, distinction is magnified. Once in digital form, items continue to have meaning, but the meaning may be altered. One can argue that the narrative form was optimized for codex, but in electronic form approaches change. Search, index. (Ex. 1 scribe notes, 1c-B)
  • Ultimately [project] may well end with archiving, preserving, and distributing the end results. (Ex. 2 scribe notes, 1c-A)
  • Putting archives online is new work. (Ex. 4 scribe notes, 1b-E)
  • they have this problem too in ling; there is one ontology becoming more and more accepted; the task of the researcher isn't to reinvent ideology but to provide a mapping; and another problem is search: approaching a situation where no one has to throw anything away, so how do you find the valuable pieces of information when you archive everything? (Ex. 4 scribe notes, 1b-B)
  • web archiving: policy constraints and restrictions (Ex. 5 scribe notes, 1b-B)
  • "Linguists talk about archiving stuff, when they mean slapping it up on a website, and that's not archiving, in my view." (Ex. 5 scribe notes, 1b-B)
  • stuff like reviews moving from journals to the web; will that make book reviews less considered for tenure and promotion questions? And are they being archived by libraries? (Ex. 5 scribe notes, 1b-B)

Quinn Dombrowski


1 Comment

  1. Unknown User (nls36)

    There are many excellent tools and approaches identified here, but there are some concerns about who will provide leadership, stewardship and co-ordination of preservation strategies for the project.

    Subject to existing metadata standards, which help to support the implementation work, how can we work towards developing some specific co-ordinated data management strategies which address the digital preservation needs of Arts and Humanities research?

    Will the expectation be that local digital preservation and hosting strategies be implemented at partner level, or should we not be working to have these co-ordinated at national/international scale? What are the overheads for local solutions?