This wiki space contains archival documentation of Project Bamboo, April 2008 - March 2013.

This use case was developed by Bruce Barton, in consultation with Bridget Almas, Steve Masover, and Travis Brown.

 

Summary

Improve the quality of holdings in the Perseus Collection by allowing a scholar to retrieve an object through the Bamboo CI HUB, edit the object, and then return the object to the Perseus Collection through the Bamboo CI HUB.

Notes

Scholars use common desktop tools to emend and annotate the object. They may also use morphological analysis tools to decorate the object with part-of-speech (POS) or other markup. Annotations may include assertions about the relationship between an object and other objects.

Perseus will accept and store curated versions of an object, annotations to a curated object, and curation logs that describe what changes were made to the object, who made them, when they were made, and so on.

The Perseus Collection repository does not lock or check out an object to the scholar when the object is retrieved from the Perseus Collection repository. Other scholars may have simultaneously retrieved, curated, and returned revisions of the same object to the Collection. We include the version of the object originally retrieved when we transmit the curated object to the Perseus Collection repository so that the repository can reconcile versions and manage change conflicts.
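The reconciliation check described above can be illustrated with a minimal sketch. It assumes a version is identified by a SHA-256 digest of its text; the names `Deposit`, `Repository`, and `submit` are hypothetical and are not part of any Perseus or Bamboo API. The repository accepts a deposit outright only when the deposit's base version is still the current version; otherwise it flags the deposit for reconciliation.

```python
import hashlib
from dataclasses import dataclass


def version_id(text: str) -> str:
    """Hypothetical version identifier: SHA-256 digest of the text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class Deposit:
    base_version: str  # id of the version the scholar originally retrieved
    revised_text: str  # the curated text being returned


class Repository:
    """Toy stand-in for a repository that never locks objects."""

    def __init__(self, text: str) -> None:
        self.text = text

    def current_version(self) -> str:
        return version_id(self.text)

    def submit(self, deposit: Deposit) -> str:
        # If nobody else changed the object since retrieval, accept directly.
        if deposit.base_version == self.current_version():
            self.text = deposit.revised_text
            return "accepted"
        # Otherwise another revision arrived first; the repository must merge.
        return "needs-reconciliation"


if __name__ == "__main__":
    repo = Repository("arma virumque cano")
    base = repo.current_version()  # two scholars retrieve the same version
    print(repo.submit(Deposit(base, "Arma virumque cano")))
    print(repo.submit(Deposit(base, "arma virumque cano,")))
```

The first deposit is accepted; the second carries a stale base version and is routed to reconciliation. How conflicts are actually merged is left to the repository.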

We assume that the work of preparing improved versions of texts for Perseus is a loosely collaborative affair. As scholars curate objects, they seek the advice of colleagues or advisors; they reach points in their work at which they solicit wider review and comment.

Workflow Steps

Each step in the following representative workflow marks an action initiated by a user.

  1. Create a (Zotero bookmark or other) reference to an object in the Perseus repository. The user will have identified the object using whatever discovery tools the Perseus Collection supports or will have obtained the reference from another source such as a scholarly publication. Bamboo does not provide a discovery interface for the Perseus Collection or for any other collection.
  2. Authenticate to TextShop through a third-party Identity Provider (IdP).
  3. Submit the reference from step 1 to the CI Hub in order to retrieve an object from Perseus and place it into the CI Hub cache, where it can be inspected.
    1. Provenance metadata are created and attached to the object as it is copied to the CI Hub and normalized to the Bamboo Text Object model.
  4. Create a location in the TextShop local object store in which to place the object from Perseus and copy the object from the CI HUB to the local object store. Assign permissions to the object in the local object store in order to share the object with collaborators who use this instance of TextShop to participate in the scholar's working group.
    1. Provenance metadata are created and then stored in datastreams on the object as the object is copied to the local object store.
  5. Perform any transformations on the object to bring it into compliance with a representation standard and/or to prepare the object for editing and annotation. This step creates new derivative objects and associates provenance and transformation log data, if any, with them.
  6. Place the editing target version of the object in a local Git version control system for desktop access. This step initiates a check out process.
  7. Check out the desired version or derivative of an object to a desktop using a Git client.
  8. Edit/curate or annotate the object with the scholar's tools of choice, e.g. oXygen, ImageJ.
  9. Commit changed versions of an object to the Git repository as desired.
  10. Add and commit standoff annotations as desired to the Git repository.
  11. Tag a version of the document and annotations in the Git version control system. Tagging an object triggers a commit to the local object store, generates change logs based on version diffs, and updates triple stores in a summative fashion. The new versions, associated annotations, provenance metadata, curation logs, and metadata can now be made available to a broader community of scholars, perhaps to invite comments or feedback in the form of further annotations on the object. The versions stored in the local object store can serve as stable, addressable targets for annotations and references. Scholars may iterate steps 8 to 11 several times before the object is considered ready for redeposit in the Perseus collection.
  12. Create a deposit content package that includes the original version retrieved from the repository, the revised version, change logs, annotations, and provenance metadata, all of which are recorded in a manifest.
  13. Copy the deposit content package to the CI HUB.
  14. Commit the deposit to trigger placing the object back into the Perseus repository. (Although out of scope for Bamboo, Perseus unpacks the content package to extract items it can mount in its collection, reconciles change conflicts, and makes the improved text, annotations, logs, etc. available through its collection interface.)
  15. Retrieve the deposit processing report and store locally with the object's provenance metadata.
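Step 12 can be sketched as follows. This is a minimal illustration only: the use case does not specify a package format, so the ZIP container, the `manifest.json` name, the manifest fields, and the function names `build_deposit_package` and `verify_deposit_package` are all assumptions made for the example.

```python
import hashlib
import io
import json
import zipfile
from typing import Mapping


def build_deposit_package(parts: Mapping[str, bytes]) -> bytes:
    """Bundle the given items into a ZIP with a manifest listing each one.

    `parts` maps a path inside the package (hypothetical layout) to its bytes.
    """
    manifest = {
        "items": [
            {
                "path": path,
                "sha256": hashlib.sha256(data).hexdigest(),
                "size": len(data),
            }
            for path, data in sorted(parts.items())
        ]
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        for path, data in sorted(parts.items()):
            package.writestr(path, data)
        package.writestr("manifest.json", json.dumps(manifest, indent=2))
    return buffer.getvalue()


def verify_deposit_package(package_bytes: bytes) -> bool:
    """Check that the manifest lists exactly the packaged items, unmodified."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        manifest = json.loads(package.read("manifest.json"))
        listed = {item["path"] for item in manifest["items"]}
        actual = set(package.namelist()) - {"manifest.json"}
        if listed != actual:
            return False
        return all(
            hashlib.sha256(package.read(item["path"])).hexdigest() == item["sha256"]
            for item in manifest["items"]
        )


if __name__ == "__main__":
    package = build_deposit_package(
        {
            "original/text.xml": b"<TEI>arma virumque cano</TEI>",
            "revised/text.xml": b"<TEI>Arma virumque cano</TEI>",
            "logs/curation.json": b"[]",
            "annotations/pos.rdf": b"",
            "provenance/provenance.xml": b"<provenance/>",
        }
    )
    print(verify_deposit_package(package))
```

Recording a digest per item lets the receiving repository confirm that the package arrived intact before it unpacks anything.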

Bamboo phase two capabilities supporting the use case

Version control on locally held objects

We will support version control in the local Fedora object store.

Marshaling objects to and from the desktop

We also postulate here that the Git source code repository will be used as a means of marshaling objects to the desktop and facilitating collaborative work on the objects along the lines of Son of Suda Online (SoSOL). We retain CMIS/Fedora for compound/complex object modeling of the sort implied by the Bamboo Text Object. We speculate that version tagging in Git will provide a suitable point or hook for triggering updates to Fedora. In effect, we are distinguishing between a working store (Git) where components of objects, i.e. TEI XML documents or images, are organized and conveniently placed for desktop access, and a publication store (CMIS/Fedora), where object versions are modeled as Bamboo Text Objects and stably addressable.
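One way to picture the tagging hook speculated about above is a Git `post-receive` hook, which receives one line per updated ref on standard input in the form `<old-sha> <new-sha> <ref-name>`. The sketch below only picks out newly created or moved tags; the function name `tags_to_publish` is hypothetical, and the actual update to Fedora is deliberately left out because the use case does not define it.

```python
from typing import List, Tuple

# Git reports an all-zero object name for a ref that does not exist
# (the old value of a new ref, or the new value of a deleted ref).
ZERO_SHA = "0" * 40
TAG_PREFIX = "refs/tags/"


def tags_to_publish(hook_input: str) -> List[Tuple[str, str]]:
    """Return (tag name, new object id) for each tag created or moved.

    `hook_input` is the text a post-receive hook reads from stdin.
    For an annotated tag the object id names the tag object, not the commit.
    """
    tags = []
    for line in hook_input.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # ignore blank or malformed lines
        _old, new, ref = fields
        if ref.startswith(TAG_PREFIX) and new != ZERO_SHA:
            tags.append((ref[len(TAG_PREFIX):], new))
    return tags


if __name__ == "__main__":
    import sys

    for tag, object_id in tags_to_publish(sys.stdin.read()):
        # Placeholder: a real hook would push this version to the
        # publication store here.
        print(f"would publish tag {tag} at {object_id}")
```

Branch updates and tag deletions are ignored, so ordinary commits to the working store never touch the publication store.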

Annotation

Objects and annotations are stored and exposed for querying. As we noted in step 11 above, the local object store provides a means of storing stable targets for annotations and references. Policies around the duration of an object's storage in the local object store and around the level of access to it by scholars or the public are driven by the local needs of the work groups using the instance of TextShop and the resources made available to them through their sponsoring institutions. Bamboo does not control these policies.
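To make the idea of stable annotation targets concrete, here is a minimal in-memory sketch of standoff annotations that point at a version URI and can be queried by target. The `Annotation` fields, the `AnnotationStore` class, and the example URIs are hypothetical; a real deployment would presumably use the triple stores mentioned in step 11.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Annotation:
    target: str   # stable URI of an object version in the local object store
    body: str     # e.g. a part-of-speech tag or a comment
    creator: str  # who asserted the annotation


class AnnotationStore:
    """Toy store that indexes standoff annotations by their target URI."""

    def __init__(self) -> None:
        self._by_target: Dict[str, List[Annotation]] = defaultdict(list)

    def add(self, annotation: Annotation) -> None:
        self._by_target[annotation.target].append(annotation)

    def for_target(self, target: str) -> List[Annotation]:
        # Return a copy so callers cannot mutate the index.
        return list(self._by_target.get(target, []))


if __name__ == "__main__":
    store = AnnotationStore()
    # Hypothetical URI for one tagged version of a text.
    version_uri = "http://example.org/objects/aeneid-1/versions/v1-review"
    store.add(Annotation(version_uri + "#word-1", "pos=noun", "scholar-a"))
    store.add(Annotation(version_uri + "#word-1", "Check the reading here.", "scholar-b"))
    for annotation in store.for_target(version_uri + "#word-1"):
        print(annotation.creator, annotation.body)
```

Because each annotation names a specific version, later edits to the text do not silently change what an existing annotation refers to.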

Modeling

  • content package: This is a bundle of resources with a manifest for transmitting a curated object to a source repository for (re)deposit.
  • provenance: Retains references to objects and tracks provenance between revisions.
  • annotation bodies: Structure bodies to support mining.
  • curation logs: Structure logs to support mining.
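As an illustration of a curation log structured for mining, the sketch below derives log entries from a line-level diff between two versions, in the spirit of the change logs generated in step 11. It uses Python's standard `difflib`; the entry fields and the function name `curation_log` are assumptions, not a Bamboo or Perseus schema.

```python
import difflib
from datetime import datetime, timezone
from typing import Dict, List, Optional


def curation_log(
    original: str,
    revised: str,
    editor: str,
    when: Optional[datetime] = None,
) -> List[Dict[str, object]]:
    """Return one structured entry per changed run of lines."""
    when = when or datetime.now(timezone.utc)
    old_lines = original.splitlines()
    new_lines = revised.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    entries = []
    for operation, i1, i2, j1, j2 in matcher.get_opcodes():
        if operation == "equal":
            continue  # unchanged lines are not logged
        entries.append(
            {
                "operation": operation,      # "replace", "delete", or "insert"
                "original_span": [i1, i2],   # zero-based, end-exclusive line range
                "before": old_lines[i1:i2],
                "after": new_lines[j1:j2],
                "editor": editor,
                "timestamp": when.isoformat(),
            }
        )
    return entries


if __name__ == "__main__":
    log = curation_log(
        "arma virumque cano\nTroiae qui primus ab oris",
        "Arma virumque cano\nTroiae qui primus ab oris",
        editor="scholar-a",
    )
    for entry in log:
        print(entry["operation"], entry["before"], "->", entry["after"])
```

Keeping each change as a separate record with editor and timestamp makes it straightforward to filter or aggregate the log later.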

CI HUB Flow reversal

We are postulating that this is accomplished through the transmission of a content package.

N-tier IAM

Let's imagine that the repository receiving a content package, which contains an improved version of a text, enforces AuthN and manages AuthZ by user. Then, if the Perseus Collection were to join the Bamboo Trust Federation, it could in principle be possible for the Perseus Collection to verify directly the identity of the user on whose behalf the content package is being deposited and to enforce whatever local policies it may have vis-à-vis that user. Perhaps some users enjoy a higher level of trust as curators, with the practical consequence that their submissions receive a different level of scrutiny by the Perseus curators.

Discussion

In what senses is this a narrow use case? And how does being narrow in these senses reduce scope?

  1. Minimal tool integration in the RE. The heavy lifting in editing and annotation construction occurs outside of TextShop, either in desktop tools or in remote tools such as the Alpheios Treebank Editor. Text POS decoration is provided through backend services delivered through the Shared Services Platform. Workflow UI/UX elements to invoke the backend services from within the RE are very simple.
  2. Single repository for redeposit of curated objects. Because a Bamboo partner controls the repository, we believe that we can rapidly experiment and iterate on content package modeling and transmit/ingest development.
  3. Limited range of annotations to model, store, and express. POS annotations for the Alpheios tools have already been modeled in standoff OAC RDF. Annotation targets and bodies have been modeled and are presently in use. This fact allows us to focus on the mechanics of annotation stores and transmission without having to first undertake a significant annotation modeling effort.

Because much of the subject-specific work has been accomplished and because tool integration is very lightweight, we can focus on the infrastructure for scholarly data management. We don't have to undertake an expensive and time-consuming tool integration before scholars can be productive.

Building on the use case

To apply the work we've done to other use cases, we add client tools where needed. These may be annotation, editing, or analytical tools. Having bridged the gap between the Bamboo Ecosystem and the desktop, it may be that little or no change is required to broaden the application of the Bamboo capabilities to another area of study beyond deploying suitable facilitating transformations to prepare data for consumption by the new client tools.

We are breaking new ground with content packaging for revised object transmission. It may well be that other repositories will find that using the packaging we develop for Perseus is the path of least resistance.
