This wiki space contains archival documentation of Project Bamboo, April 2008 - March 2013.

1. Introduction

Purpose of this document  

This document describes the proposed scope of work for inclusion in the "Bamboo Technology Proposal" to the Research in Information Technology (RIT) division of the Andrew W. Mellon Foundation, and is intended to solicit input from institutions and organizations that have participated in the Bamboo planning process. The material contained herein will be incorporated into the complete "Bamboo Technology Proposal" that will be submitted to the Andrew W. Mellon Foundation in early January 2010.

Relationship of "Bamboo Technology Proposal" to other Bamboo documents  

To begin, it may be helpful to clarify the relationship between this document and the work that has been done over the last several months. The Bamboo program staff has now shifted to writing the "Bamboo Technology Proposal" for submission to the RIT Division of the Mellon Foundation. This will be a one-year proposal that focuses on core enabling technologies for the humanities community and the Bamboo program. As many of you know, we had been drafting the "Bamboo Implementation Proposal (BIP)," a two-year proposal for both the technology and community components of Project Bamboo (the last published version of this was 0.6 on August 3, 2009). We have stopped work on the BIP to focus effort on developing the proposal to fund the core enabling technologies for Bamboo. We plan to rework the BIP so that it can become an overarching "Bamboo Program: Phase I" plan for the first three-year cycle of work, of which the Bamboo Technology Proposal will be a core element.

As a core piece of the Phase I program, the Bamboo Technology Proposal enables several foundational capabilities that, when assembled, create opportunities to share and connect projects, people, and services to address research and pedagogical needs. One model of this assemblage can be found in the example of the "Bamboo Commons," a core idea that emerged from and was endorsed by the community involved in the Bamboo Planning Project. Realizing the vision of the Commons involves many elements, and in the Bamboo Technology Proposal we focus on those software tools, services, and platforms essential to enable local communities of practice to connect together and both share and reuse services, resources, and content within the context of a larger interrelated ecosystem. This initial technical effort fits within the broader Bamboo Program: Phase I scope where, over time, the community may focus specific effort to "grow and sustain" the Commons beyond its enabling technologies - that is, one-time and on-going contributions of both content and data by many scholars, technologists, librarians, and the like can make the Commons into a living source of valuable information for the humanities and higher education.

The Commons is merely one community-wide example of leveraging the core enabling technologies of the Bamboo Technology Proposal. These same technologies can be reassembled and combined with other services to create shared tools and capabilities for both individual projects as well as other communities of practice.

What is in this "Proposed Scope" document 

This document defines the major technology deliverables for Bamboo in its first year. It specifies who these deliverables are for; benefits and costs/risks associated with these deliverables; and how the deliverables relate to each other and to other technology initiatives in the humanities.

We have outlined three major categories of deliverables:

  1. Product Deliverables are software tools built for use by the community ("end-users") to accomplish day-to-day tasks in arts and humanities scholarship and collaboration.
  2. Core Services Deliverables are web services designed to accomplish key computational and processing tasks for use by and with other applications, tools, and environments.
  3. Platform Deliverables include creating and deploying a services framework to integrate and expose web services and to run the product deliverables, as well as expressing standards, articulating practices, and publishing software developer tool kits to enable the contribution, consumption, and reuse of Bamboo technologies.

Product Deliverables are designed for use by all the major communities who are part of Bamboo -- scholars, collection curators, librarians, technologists -- and are built as out-of-the-box, usable applications and capabilities. Core Services Deliverables and Platform Deliverables are designed for use by (but not limited to) researchers, software developers, and enterprise IT staff, and the tools built and run by these scholars and technologists.

Within the Platform Deliverables section we include an architectural diagram to help picture where these elements are situated and their interrelationships (cf. Section 2.3.2.1). Starting with "end-users" -- that is, from the top of this stack diagram down -- Product Deliverables are made up of end-user interfaces, include applications and core services, and run on the platform. Core Services Deliverables include those utility and essential (core) services and capabilities that provide key aspects of software functionality. They can (and will) be used to power Bamboo's products and, most importantly, be made available as web services for use by other tools and initiatives. Platform Deliverables sit at the bottom of this stack as they enable a community of developers to build robust and re-usable services that can be deployed by IT organizations and data centers. In the Platform Deliverable section we provide an initial strategy for deploying the Bamboo Services Platform. This strategy should allow for the rapid initial deployment of Bamboo's products while enabling the ability to define, explore and refine the best and most appropriate modalities for long-term platform deployment in heterogeneous campus environments.

2. Bamboo Deliverables: High Level Descriptions

2.1 Product Deliverables

2.1.1 Enabling Technologies to Support Communities of Practice

Enabling consideration of digital methods in the Arts and Humanities in collaborative, on-line communities of practice. Connecting local communities of practice to build a "Commons."

Primary Audience: Humanities Faculty, Librarians, Museum and Archival Curators, Information and Computer Scientists, Information Technologists

Year One Development: Faculty, librarians and museum curators, information scientists, and technologists need to be able to discover and share information about software services, tools, content sources, projects, and methods that can mutually inform and improve the use of technology to support scholarship. Information sharing of this kind often occurs in local "communities of practice" such as a department, a small group of related researchers, or a project team, but it does not easily move out and across institutions, disciplines, and project boundaries. Bamboo will build a set of software services and "widgets" (plug-ins) that will enable this information to be contributed in a variety of modes (discussion, review, rating, usage tracking, etc.); harvested from multiple sources where it already exists; aggregated; exposed within multiple, existing research and collaborative environments; cited via persistent URLs; and discovered through evolving sets of search and discovery services. A browser-based interface will enable broad and early participation in a "global" community of practice, for which the term "Bamboo Commons" was developed by participants in the Bamboo Planning workshops. At the same time, backing services and an initial set of integration "widgets" will enable evolution of campus and/or disciplinary groupings of scholars, technologists, and content stewards working together in communities of practice, which contribute to and harvest information from the global Bamboo Commons to degrees determined by each community and the individuals who participate in it. The services that enable this activity will also make possible formal recognition for engagement in digital scholarship by assigning and maintaining permanent, citation-friendly references to information-sharing activity ("persistent URLs").

Further Development: Deeper search and discovery services will evolve as supporting core services, such as those for Scholarly Profiles, are enriched. Services for community curation of ontologies, applied to a growing body of information contributed by communities of practice, will enable expanded semantic categorization, search, and discovery, both over the Bamboo Commons and within individual on-line communities of practice. An expanding set of harvesting services and integration widgets will broaden both ingest of information and exposure of information contributed to the Commons and on-line communities of practice. Integration of a formal Bamboo Services Registry with services supporting the Commons and communities of practice will extend the field of consideration from scholarly method to technology built to support it. Bamboo's services for consideration and usage tracking will facilitate distribution, citation, and aggregation of uptake metrics for new products of scholarship. This will become increasingly important as humanities scholars develop and contribute new scholarly research services that encapsulate innovative research methodologies (algorithms) that can be applied against multiple data sets and corpora. For example, social network visualization services first developed for a community of Near Eastern Studies scholars in the Berkeley Prosopography Services project might be applied to a broader set of corpora than the Hellenistic Babylonian legal texts (cuneiform tablets) for which the services were initially created - perhaps visualizing social relationships in James Joyce's Ulysses in ways analogous to the space-and-time visualization described by Ian Gunn and Mark Wright in their paper "Visualising Joyce" (Hypermedia Joyce Studies, Volume 7, Number 1).

Benefits

  • Provides a scholar-/domain-/task-specific perspective on technology; shows value of technology in a scholar's context
  • Broadens awareness of what tools exist and what they can do
  • Enables better informed resource selection (tool, content, etc.) based on trusted sources
  • Enables serendipity by exposing methodology of both digital and non-digital scholarly practice in a forum where these often-separate worlds overlap
  • Exposes and broadens awareness of technology when searching on non-technical terms/domains, e.g., links to relevant tools can be included in search results for prosopography
  • Enables education and/or dialog around technology in a bounded and trusted venue; this is not SourceForge; it is humanist-forge
  • Enables fruitful dialog and communication between humanists and technologists by providing humanities-specific use case narratives as reference points for discussion, deepening specification and making the process of specifying technology goals more efficient
  • Exposes the existence of digitized content to scholars who might not otherwise find it easily
  • Provides a venue for expert technical review regarding reliability and durability of software services, which can be located and relied upon by scholars seeking advice on technology appropriate to their needs
  • Citations for contributions, in the form of permanent URLs, enable credit for contributions to be included in CVs, Promotion & Tenure review, and publications

Costs / Risks

  • Contributing to this type of information sharing is time-intensive; multiple incentive mechanisms will have to be deployed, tested, refined, and supported. Incentives that are driven by faculty and other credit will be crucial. In general, incentive issues will be addressed in the Bamboo Program Document.
  • Without clear boundaries around what kinds of information are being sought, and what adds value to the body of contributed material, this effort will be difficult to define, market, and incent; the danger is that this could become another failed registry project
  • People may not feel safe discussing certain topics publicly, for fear of seeming ignorant
  • There is no process in place -- and no certainty that a widely acceptable process can be devised -- to value contribution through senior/expert faculty review, and to curate contribution respectfully and responsibly
  • Important partnership and scope relationships will need to be defined with other tools that provide related functionality.

2.1.2 Humanities Corpora and Curation Workspace

Flexible, extensible, customizable workspace and publication platform to manage, curate, and disseminate collections.

Primary Audience: Humanities Scholars; Museum, Library, Archival Communities; Principal Investigators and Developers of Related Tools

Secondary Audience: Central IT Groups concerned with Content and Collections Management Services

Year One Development: Atop the Bamboo Services Platform, a highly flexible, extensible environment to address corpora and collections management, curation, and scholarly dissemination will be built to provide a functional solution for scholars with corpora to preserve, consider, curate, and share. A services back-end will be accessed by a browser-based user interface built (or adapted) and maintained by Bamboo, but the architecture will permit alternate interfaces to evolve and/or be integrated with the backing services as need and resources permit, including the ability for projects to brand the curation workspace as their own. Persistence services will permit a user to direct that local or vended storage host primary or secondary (backup) copies of materials and metadata, allowing the convenience of a hosted application and the security of owner-governed storage. Many projects in the digital humanities have fundamental needs that will be addressed by this type of environment: a workspace to gather and build digital collections of primary source materials; to collaboratively annotate and curate these collections; to publish materials to the web; to apply innovative services to analyze and visualize the collection; and to pass content to publishing platforms and to repositories for storage and preservation. These needs are faced by scholars and students alike. Environments such as ARTFL, Perseus, and NINES represent different disciplinary approaches to this functional problem space. SEASR, Omeka, HUBzero, Fluid Engage, CollectionSpace and a host of open-source CMS and ECM offerings (Nuxeo, Alfresco, DuraSpace, Drupal, Joomla, to name several) address important aspects of this problem space in frameworks intended to support a variety of disciplinary, scholarly, or general use cases.
Bamboo's offering will build on available functionality, delivering back-end services that can be accessed through a browser-based UI tailored for broad applicability and uptake among humanists, but which may also be integrated into new and extant solutions that wish to augment or 'outsource' offered functionality through consumption of Bamboo-hosted services. Initial development will focus on building out the platform and service enabling components for this Corpora Workspace. Bamboo will borrow substantial architectural work, platform and services development, and design methodology from the Mellon-funded CollectionSpace project to jump start this activity. We also propose to work with a number of other core technology projects that are Bamboo partners, such as SEASR, Perseus, etc., to realize shared software components. Finally, we expect primary end user functionality to include basic corpora building, collaborative description and annotation, and web-site publishing.

Further Development: We expect that in later years more powerful and flexible corpora curation services will be added so that the workspace grows in value and benefits a wider range of disciplines and scholars who want to perform more complex tasks; that we will have substantially more functionality to connect and pass off data to other tools and platforms; that a number of projects will be in the midst of migrating functions from their existing tool set to the Bamboo Corpora Workspace; and that we will have defined boundaries, and in some cases conjoined project efforts, between the Bamboo Corpora and Curation Workspace and existing and new humanities initiatives, such as CollectionSpace, Sakai, and perhaps new publishing platforms being developed in partnership with academic presses.

Benefits

  • Provides an easy to use environment for humanities scholars, with varying levels of functionality that can be turned on and off
  • Provides a concrete, out-of-the-box product that Bamboo delivers to both scholars and to the IT / Libraries / Museums
  • Provides an easy to run service (or to pay for as a hosted service) for many campuses built on a services platform
  • Provides a vetted, community-designed, user experience-centric means of addressing a broadly-applicable set of use cases
  • Provides a sandbox, a tinkering place, for researchers
  • For those unable to establish appropriate relationships with their institution's IT division, or who don't otherwise have access to services that address collection and curation use cases, this deliverable provides a means of accomplishing core tasks in digital scholarship via services that are tied neither to institutional support nor to local, tech-savvy support staff
  • Scholars who have access to institutionally-based storage, or who can purchase vended storage, will benefit from a hosted, services-based curation environment without the risk of outsourcing responsibility for digital preservation
  • Can be built by reusing work done in a number of key projects, including CollectionSpace, SEASR, ARTFL, Perseus, Berkeley Prosopography Services, and others.

Costs/Risks

  • Barring ability to adopt heavily from an established set of services, this deliverable may be too much to accomplish in Bamboo's first year given other deliverables
  • If collection management services include provision of storage for collections, costs and sustainability may present significant risks
  • Where services like these are already being offered by institutions, Bamboo would siphon funds away from institutional libraries and/or central IT units, risking 'turf' issues
  • Given the variety of corpora curation, analytic, preservation, and/or publication platforms that already exist, Bamboo may be addressing a need that has sufficient coverage

2.2 Core Service Deliverables

In general, development of core services in the first year will be limited to those services that directly enhance or enable the creation of the Product Deliverables above. These core services will, however, be developed with close attention to their more general applicability in future development cycles, and to design and engineering requirements that will enable broader use.

2.2.1 Scholarly Profiles

Self-managed expression of academic interests and history to facilitate networking, discovery, and "publish once - disseminate widely" maintenance of biographical and bibliographic information.

Primary Audience: Faculty and other scholars

Secondary Audience: Institutions and Learned Societies

Year One Development: To discover and form rich on-line networks for collaborative consideration, individuals and groups must publish information that enables and refines the ability of participants to find and interact with each other based on reputation, trust, activities, commonalities, and differences. Profile models enabled by Bamboo services will focus on information that supports participation in communities of practice and the publication of humanities collections. Profile elements in this initial stage of evolution will include (but may not be limited to) research and teaching interests; institutional and organizational affiliations; publication citations; associations with other Bamboo groups, participants, and communities of practice; and inventories of contributions undertaken in/by communities of practice. Profile information will broaden and deepen over time to include attributes and structured data of specialized use to particular communities, such as higher education institutions or disciplinary societies.

Further Development: Scholars need tools and capabilities to simplify the publishing of information about academic activities and interests - publications, presentations, mentoring, etc. - and to share and/or expose this information to multiple environments and data sources (publish/maintain once; disseminate widely and seamlessly). Bamboo's long-term goals include development of a set of software services that enable a lightweight CV tool that can publish data into other tools / environments (e.g., learned societies, collaborative environments, social networks) and gather data from these and other sources. To this end, Bamboo will work with initiatives such as Sakai and BioSphere as well as those at member and partner campuses to identify and potentially define data models that are broadly useful and extensible; and to build services that can be integrated into a wide range of tools, sites, and environments, to disseminate and consume information as authorized by its owner. Harvesting services that enable Bamboo to ingest a scholar's records from locally-maintained CVs that conform to any of a number of common formatting standards will minimize the need for scholars to be burdened with 'double-entry' requirements. Interchange (import and export) with such sites, platforms, and locally-maintained data sources will be enabled through service APIs.

Benefits

  • This project helps to reduce the time and effort faculty spend on an important weekly or monthly activity. It reduces the "transaction costs" for faculty.
  • Automates update of CV/profile to multiple channels, broadening exposure
  • Allows profiles to be published in social network space
  • Disseminated CVs/profiles are more consistent and accurate across publication channels when built on same data source
  • High visibility: faculty are concerned about CV and similar representation of their work
  • Allows for rich content to be viewed/accessed alongside CV, e.g., presentations & links to publications
  • Permanent URL for profile data, independent of institutional affiliation
  • Allows for search in a scholarly environment and a bounded search-space: search terms and semantics will be known, a smaller pool of data (compared to the web) increases chance of being found.
  • Can strengthen Bamboo's connection to professional/disciplinary societies, due to their strong interest in profile data
  • Engages high percentage of faculty who may not be inclined toward digital engagement, and familiarizes them with Project Bamboo
  • Facilitates faculty response to frequent profile requests
  • Enables other projects to consider the incorporation of Faculty Event Tracking as a fundamental building block
  • Enriches contribution and usage tracking described in "Communities of Practice" and "Connector" deliverables by enabling aggregation of tracking data in cohorts based on deep description of scholar-participants
  • Supports institutional efforts to develop common Faculty Event tracking systems

Costs/Risks

  • It will be very hard to develop a data model, even for minimal profile tracking, that will please everyone; different disciplines and faculty have particular needs for how their activity is represented (as do learned societies and institutions)
  • There may not be large-scale interest for this service, as current practice is deeply ingrained: faculty are used to regularly updating their CV and emailing it to requestors, and are less careful to update their web presence
  • Requires significant adoption to be an effective pool of searchable information; search of a small pool of scholars is not useful
  • If faculty do not keep their profiles updated, low search precision (bad search results) will follow. This could easily cause users to come away with a bad impression of the amount and kind of data Bamboo has access to
  • People who want their CV to be visible most likely have a web presence already, and may not see Bamboo profiles as adding value
  • Bootstrapping problem: until Bamboo is known and widely adopted there is low incentive for someone to make extra effort to contribute their profile.

2.2.2 Tools and Content Connectors

Building interoperability through connector services that translate between widely-used and standards-compliant formats.

Primary Audience: Faculty and other scholars; tool developers

Secondary Audience:  Librarians, Museum and Archival Curators, Information Technologists

Year One Development: To enable interoperability, services will initially focus on connecting Communities of Practice supported by Bamboo technology; collections hosted by Bamboo's Humanities Corpora and Curation Workspace; and publication platforms (e.g., Omeka) that natively support standards-compliant protocols for content and metadata harvesting (e.g., OAI-ORE and OAI-PMH). Digital collections managed in Bamboo's Corpora Workspace, and others that serve metadata via supported protocols, will be seamlessly integrated for consideration by Communities of Practice, enabling collaborative exploration of how collections can be used, enhanced, and applied to available services and tools (e.g., analytical algorithms). Additionally, a set of analytical services deployed by Bamboo and/or by projects governed and served by Bamboo Partner institutions will be made interoperable as interchange formats and available content-transformation services evolve.
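
As an illustration only, metadata harvesting over OAI-PMH might be sketched as follows. The verb, argument names, and XML namespaces come from the OAI-PMH 2.0 and Dublin Core specifications; the base URL, the sample record, and the function names are assumptions invented for this sketch and do not describe any Bamboo service.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for a repository (hypothetical helper)."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return (records, resumption_token) parsed from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        titles = [t.text for t in rec.iterfind(".//dc:title", NS)]
        records.append({"identifier": identifier, "titles": titles})
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    return records, (token or None)


# Invented sample response, trimmed to the elements the parser reads.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Marriage contract, 1742</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("https://example.org/oai"))
    print(parse_records(SAMPLE))
```

A harvester built this way would keep requesting pages with the returned resumption token until the repository returns none.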

Further Development: A growing body of tools and content-oriented platforms exists for supporting different functions in humanities scholarship - SEASR, ARTFL, MONK, Zotero, Sakai, CollectionSpace, DuraSpace, Omeka and more. Some interoperate, but in general important tools are not connected as well as they might be, hindering end users (humanities scholars) who, while working in one environment, want to use data and/or functionality from another. Bamboo, in collaboration with key tool developers, will design a set of services to act as 'brokers' of content and functionality, transforming idiosyncratic input and output messages, where necessary, to standards-based intermediate formats; and, where possible, providing web-services APIs to hosted functionality in order to relieve scholars of individual and redundant responsibility for understanding and installing a bewildering variety of technologies on local machines. Brokering integration will yield a key ancillary benefit, in that usage tracking can be more consistently recorded; this will provide citable measures of technology's impact on scholarship that enable innovators of digital tools, services, and interoperability standards to claim formal credit for contributing to humanist inquiry.
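
The 'broker' idea can be pictured with a minimal sketch, offered under stated assumptions rather than as a design: each tool gets an adapter that maps its idiosyncratic message into one shared intermediate format, and the broker counts each translation as a usage event. The tool names, field layouts, and the `Broker` class are hypothetical and invented for illustration.

```python
from collections import Counter
from typing import Callable, Dict


# Hypothetical adapters: each maps one tool's message layout to a
# shared intermediate record with "title" and "creator" keys.
def from_tool_a(msg: dict) -> dict:
    return {"title": msg["Title"], "creator": msg["Author"]}


def from_tool_b(msg: dict) -> dict:
    return {"title": msg["name"], "creator": msg["by"]}


class Broker:
    """Translates tool-specific messages and records how often each tool is used."""

    def __init__(self) -> None:
        self._adapters: Dict[str, Callable[[dict], dict]] = {}
        self.usage: Counter = Counter()

    def register(self, tool: str, adapter: Callable[[dict], dict]) -> None:
        self._adapters[tool] = adapter

    def translate(self, tool: str, msg: dict) -> dict:
        adapter = self._adapters[tool]  # raises KeyError for unregistered tools
        self.usage[tool] += 1           # usage tracking as a side benefit
        return adapter(msg)


if __name__ == "__main__":
    broker = Broker()
    broker.register("tool_a", from_tool_a)
    broker.register("tool_b", from_tool_b)
    print(broker.translate("tool_a", {"Title": "Ulysses", "Author": "James Joyce"}))
    print(broker.usage)
```

Because every exchange passes through one place, usage counts fall out of the design without each tool having to report them separately.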

Benefits

  • For those scholars who are sufficiently motivated to use multiple tools without Bamboo's facilitation, this can save time, money, and energy by connecting tools together in ways that are easy and efficient for scholars
  • For those scholars who might not have expended the effort to utilize more than a single tool in the course of their work, this can enable deeper/newer kinds of research by removing a barrier to broader digital engagement
  • For those scholars who are not inclined to use digital tools at all, this provides an easier entry point because integration will be easier to effect directly and/or will be better documented for local support staff
  • As adoption grows, interoperability standards will acquire increasing gravity, thus further lowering interoperability barriers for future tools and services
  • As Bamboo connectors get built, standards become established between toolmakers
  • As resources become connected, more research is enabled that might not have been pursued in the past due to unacceptable time or effort requirements
  • Enables experimentation / new kinds of research by making it easier to apply the same tool to a range of content sources or a range of tools / services / algorithms to the same content source
  • Allows Bamboo to fill a transparent role by remaining agnostic as to the value of tool/service functionality: scholars decide for themselves
  • Allows for potential integration/interoperability of many Mellon-funded projects
  • Usage tracking enabled by brokerage of interoperability will surface uptake and usage patterns that will implicitly describe value of technology and explicitly give credit to innovators

Costs/Risks

  • It is necessary to start somewhere, and selection of initial partners for integration efforts might frustrate others who expected or wanted to be first
  • Data models required by different tools, or among multiple content stores, may be so incompatible that it will not be possible to connect services, tools, content, or environments that are identified as high-value integrations by Bamboo's community of scholarship
  • To effect integrations, it will be necessary to gain significant trust, permissions, and/or support from a variety of tool, service, content providers; it is not certain that working with Bamboo will be seen as adding value to potential partnership projects
  • Projects whose primary focus is on building "connectors" between many tools have struggled in the past to bring substantial value to end-users

2.2.3 Deep Search and Collections Interoperability

Exposing, generating, and sharing small-collections metadata to enable deep and semantic search, as well as to broaden domains for federated search and enhance opportunities for enabling interoperability between collections.

Primary Audience: Libraries, Museums, Archives; Scholars

Year One Development: Bamboo services will enable holders / owners of collections of digitized content to express access permissions to materials and metadata on a common platform, at levels of breadth and granularity that meet the diverse needs of scholar-collectors. Bamboo will then aggregate and make available "deep" indices of content offered for this purpose (i.e., will create searchable indices of information that is not exposed on the public web, but for which access requires subscription, database access, or other credentials). Content holders / owners will govern the degree of exposure to the index and depth of display in applications, such as federated or semantic search, by setting access permissions to permit exposure of only those aspects and elements of collections that they wish to make visible to a scholarly or general community, or to a tightly-defined group of trusted colleagues. This exposure will enable a ferment of discovery and scholarly analysis by providing rich fodder for aggregated and federated search services across both individually-contributed and institutionally-scaled repositories of material. In its first year of implementation work, Bamboo will focus effort in this area on aggregation of metadata from small collections hosted in its Humanities Corpora and Curation Workspace, collections hosted on platforms that offer standards-based interfaces for metadata harvesting, such as OAI-PMH, and contributions made in the context of Communities of Practice enabled by Bamboo services. An initial set of services to enable deep search across these collections will also be delivered in Bamboo's first year.
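
One way to picture owner-governed exposure is the sketch below, which is an assumption-laden illustration and not a proposed design. The three audience levels, the `Record` structure, and both function names are hypothetical; the point is only that the index built for a given audience contains terms from the fields the owner has opened to that audience and nothing else.

```python
from dataclasses import dataclass, field

# Hypothetical audience levels, ordered from most open to most restricted.
LEVELS = {"public": 0, "scholarly": 1, "trusted": 2}


@dataclass
class Record:
    identifier: str
    fields: dict  # field name -> value
    # field name -> audience level at which the owner exposes that field
    visibility: dict = field(default_factory=dict)


def visible_fields(record, audience, default="trusted"):
    """Return only the fields the owner has exposed to this audience.

    Fields with no explicit setting fall back to the most restricted level.
    """
    rank = LEVELS[audience]
    return {
        name: value
        for name, value in record.fields.items()
        if LEVELS[record.visibility.get(name, default)] <= rank
    }


def build_index(records, audience):
    """Build a term -> set of record identifiers index for one audience."""
    index = {}
    for rec in records:
        for value in visible_fields(rec, audience).values():
            for term in str(value).lower().split():
                index.setdefault(term, set()).add(rec.identifier)
    return index


if __name__ == "__main__":
    rec = Record(
        "coll-1/item-7",
        {"title": "Marriage contract 1742", "notes": "fragile original"},
        {"title": "public"},
    )
    print(sorted(build_index([rec], "public")))
    print(sorted(build_index([rec], "trusted")))
```

Defaulting unlabeled fields to the most restricted level is a deliberate choice in this sketch: nothing is exposed unless the owner says so.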

Further Development: Aggregated indices will enable applications that offer deep search among a broad range of small, scholar-maintained collections to be federated with search of large content repositories such as JSTOR and materials held by the HathiTrust, providing to humanists a more seamless field for discovery. Existing aggregated or federated search portals in this vein include, for example, Europeana, which provides access to "the cultural collections of Europe"; and WorldWideScience.org, which serves scientists interested in materials across a variety of ~60 international science portals. In the long-term, as curation and integration of community-generated ontologies is enabled, semantic analysis of deep collections is expected to enable novel means of searching across such collections. For example, corpora might be used as search terms (e.g., "find materials in the German language similar to my collection of 18th century French marriage contracts"). As Bamboo's aggregated indices come to include metadata harvested from collections maintained and published by scholars using platforms as varied as Omeka sites, CollectionSpace deployments, Zotero-hosted collections of bibliographic materials and annotation, ECM platforms such as Nuxeo and Alfresco, and CMS platforms such as Drupal or Joomla, deep and federated search is expected to open previously sequestered collections to discovery, inquiry, and collaboration. In each of a selected set of widely-adopted collection management and dissemination platforms, Bamboo will leverage standards-based metadata interfaces, and/or contribute to the development of plug-in or extension technology that expresses restrictions specified by a content-owner and permits aggregation into search indices. Search and discovery services offered by Bamboo will be adoptable by library portals and platforms, by tools and environments such as Zotero or Sakai, and by disciplinary sites and platforms.

Benefits

  • Enables richer and more accurate search by including a broader range of materials in the search-space
  • Empowers scholars to share collections maintained on a variety of platforms, while retaining control over access and audience
  • Faculty are often motivated to put time and painstaking effort into research and analysis on newly-accessible content; this deliverable will increase quality and range of access to content
  • Semantic analysis and search applications will enable new means of discovery as they evolve and are applied to a range of corpora

Costs/Risks

  • This is very challenging work from the sociological point of view - that is, in terms of securing agreements across multiple communities.
  • This kind of work (although not this kind of approach) has been the effort of a number of other well-funded consortial efforts -- why have they not succeeded?
  • Too much work may be required of a scholar to represent her holdings in a form she is willing to share
  • Significant quantities of inaccurate metadata contributed to an aggregate index may weaken the value of the index as a whole (similar to criticisms that have been made by scholars of bibliographic metadata associated with GoogleBooks indices)
  • This deliverable implies assumption of responsibility to preserve aggregate indices, and may imply a need to preserve cached copies of metadata and/or collections in the form published by scholars, over the long term; cost and sustainability issues apply

2.3 Platform Deliverables

2.3.1 How Bamboo Builds Services

2.3.1.1 Building Blocks for Scholarly Tools

Services that deliver broadly-required software functionality across tools and disciplinary domains, providing a foundation on which value-adding technology can be developed without reinvention of core capabilities.

Primary Audience: Tool Developers; Faculty who are Tool Developers; IT Organizations; Digital Library and Other Content Management Projects

Year One Development: In its first year of implementation work, Bamboo will design and implement foundational services to support its Product Deliverables, with the expectation that these will be subjected to 'field tests' and, subsequent to the first year, be refined for release as "building blocks" that deliver core or common functionality that can be easily incorporated in real-time - independent of a technologist's development platform of choice - rather than be reinvented for each particularly-focused tool or project. Stateless "building block" services will include authentication services that integrate with extant institutional or multi-institutional identity management systems; group and role management across application contexts; "passthrough" storage services that maintain information in databases and on file systems beyond the Bamboo Services Platform; and analytical services (e.g., textual analyses, prosopographic analysis, multimedia comparison, etc.) that return a result set in response to input of content to be analyzed. Stateful functionality in a services model, that delivers a high level of value in proportion to a bounded volume of data maintained on the host platform, could be offered similarly without incurring unsustainable storage costs. Services such as community-curated ontologies that can be applied across applications and content repositories might fit this model. It is noteworthy that Bamboo proposes to deliver core functionality in the services paradigm rather than as code libraries, or, more generally, as abstract frameworks that are implemented as code libraries in a number of languages.

Further Development: The development, release, and deployment of "building block" services will enable technologists to more efficiently provide functionality of direct, often specialized value to scholars. Developers' confidence in a model of consumed services to replace code developed and deployed in local-application context depends on confidence in the service provider's reliability and longevity. Recognizing this, Bamboo acknowledges that adoption of such "building block" services is not an achievable first-year goal. Release of "building block" services for broad uptake (v1.0 and forward) is therefore not anticipated until after earlier candidates are subjected to some degree of 'field testing' and consequent evolution. Shibboleth, SEASR, ARTFL, Perseus, Concur, and Berkeley Prosopography Services are projects and partners whose technology might be incorporated into "building blocks" packaged for low-barrier consumption by software developers focused on humanities scholarship. Bamboo intends to partner with these and/or other leaders in provision of digital technology for scholarship to first provide services in support of Bamboo's first year Product Deliverables; and then to evolve "building blocks" for more general uptake in future Bamboo Implementation cycles.

Benefits

  • Reusability: Enables tool makers to focus on the unique and not recreate the generic
  • Reusability: Enables the same service to be used by multiple parties; not paying for the same thing twice
  • Reusability: Allows standards of interoperability to be developed and quantity of standards to be reduced by reducing the number of different/incompatible parts
  • Evolution: Allows services to be easily added or replaced because SLAs are documented and consumers are registered
  • Relieves technology partners of the burden of maintaining refactored services, while the service functionality is made more robust

Costs/Risks

  • Developing a critical mass of 'building blocks' will take time. Managing expectations of delivered, measurable value will be a challenge
  • Central points of service delivery risk failures that have broad effect -- from simple down time to serious breaches of security (where AuthN/AuthZ are centralized)
  • Central points of service delivery risk attracting Denial of Service and other attacks
  • Where service offerings are stateful, Bamboo risks sustainability challenges vis-a-vis storage and backup
  • Assuming maintenance and availability responsibility for services on which a range of tools/platforms depend may present sustainability challenges

2.3.1.2 Software Developer Kit & Shared Services Lifecycle

A packaged, easy-to-install set of open-source tools to align development of services for the Bamboo Services Platform; and a set of best practices, principles, and metrics to guide engineering of robust, scalable, maintainable service designs and implementations.

Primary Audience: Developers, Information Technology Architects

Secondary Audience: Information Technology Managers

Year One Development: Bamboo will articulate an initial Shared Services Lifecycle, applicable to services in-scope for its first year deliverables, to define qualities and best engineering practice through which software evolves as it is transformed to enterprise-class services architectural standards and rendered capable of running on the Bamboo Services Platform. A development infrastructure - an SDK (software development kit) oriented to building services for deployment on the Bamboo Services Platform - will evolve through the first months of the project; and will include the Bamboo Services Platform itself, deployed in a development-oriented configuration (e.g., running on a developer's desktop or in a locally hosted virtual machine). Time and care taken to regularize and simplify a standard set of development tools with which Bamboo Partners will realize services will facilitate onboarding of contributors, who will thereby not need to bear the full burden of locating, downloading, and configuring a scatter of development tools in addition to a local deployment environment; but instead can ramp up development activity in short order.

Future Development: As the Bamboo SDK reaches a usefully stable state, it will be configured and packaged for simplified download and well-documented, appropriately automated set up by a broader community of developers who are getting started, upgrading, or 'normalizing' their environments for Bamboo-compatible service development. Bamboo's Shared Services Lifecycle will similarly broaden, deepen, and stabilize as experience and a diverse set of service development efforts inform its evolution.

Benefits

  • Allows more time to be spent on productive work by reducing time it takes a technologist to set up and maintain her development environment
  • Adoption inherently promotes standards, policies, principles, and practices that underlie Bamboo's Shared Services Lifecycle and encourage well-engineered, maintainable code
  • Provides and encourages the use of testing frameworks for greater test-coverage of service code

Costs/Risks

  • Developers may see this deliverable as an unwelcome attempt to constrain tool and platform choice
  • It will be difficult to generate significant developer interest in adopting a platform and development environment until Bamboo demonstrates traction in reliably delivering services of value to scholarship that are broadly consumed.
  • It takes significant time to define and stabilize a development and deployment environment, risking the possibility that Bamboo will fail to deliver a stable, packaged SDK in its first year

2.3.2 The Bamboo Services Platform

2.3.2.1 Platform Architecture

A conceptual map of the technology through which Bamboo's support for Arts & Humanities scholarship will be realized, in layers that correspond to (bottom to top): technology infrastructure; core and utility services; applications and resources; and user-interfaces and integrations.

Primary Audience: Information Technology Architects, Developers

Secondary Audience: CIOs, Agencies, Information Technology Managers

[Figure: BambooArchitecture_TechProposal]

Service Consumers (UI and Integration Clients) - Human access to Bamboo applications and services may be made available via browser-based interfaces built (or configured) and maintained by Bamboo; through institutional or discipline maintained system/platform interfaces; and/or through social networking platforms via plug-ins (a.k.a. "widgets" or "gadgets"). Tools, services, virtual research environments, and other platforms whose sponsors/developers do not have an established Services Partnership relationship with Bamboo may also integrate through publicly accessible services at the Application / Task / Resource layer.

UI / Integration Interface Layer - Services that transform or decorate information delivered by Bamboo application/task/resource services, to meet presentation or integration requirements. E.g., user profile characteristics might be analyzed to determine and set a default UI configuration; or, a group of systems/platforms that integrate with Bamboo application services may require transformation of information (e.g., alignment of disciplinary references to a vocabulary standardized for the U.K.), where that transformation could be more efficiently performed on the Bamboo Services Platform rather than require each consumer to (redundantly) effect the same transformation locally.

Application Services - Functionality added to compositions of task, resource, core, and utility services specific to an application or presentation. E.g., Bamboo Federated Content Catalog services might generate, maintain, and expose indices of materials "crawled" by services that implement or proxy generic ingest capabilities, and be constrained by generic access-control services.

Task Services - Functionality added to compositions of resource, core, and utility services to bundle a complex set of capabilities of sufficient interest/uptake to be delivered as a 'finished' composition rather than be pieced together by a higher-level/external service consumer from lower level services. E.g., an image analysis service might extract and/or derive metadata from an input file using multiple core or utility level image analysis services, and return a union of the results in any of a variety of specified formats.

Resource Services - Services that expose digital content. E.g., collections of texts, images, audio, and/or video, federated search result sets, results of analysis run over an input or referenced data set, etc.

Content, Application, and Task Proxy Services - Passthrough (wrapper) services to functionality that is deployed & maintained by others. E.g., content access & textual analysis services that are offered on a platform operated by the HathiTrust, and to which Bamboo Members and Partners are permitted access (governed by subscription agreements) via passthrough services operated on the Bamboo Services Platform.

Resource/Task/Application/Proxy to Bamboo Core - Interface Layer - Services that analyze & queue requests, potentially forking resource-intensive requests into parallel processes over multiple BSP instances.

Core Services - E.g., document management, citation, annotation, text|multimedia analysis, deep and surface content & metadata harvest, social interaction

Core Proxy Services - Passthrough (wrapper) services to core functionality deployed & maintained by others

Utility Services - E.g., authentication, storage, transformation, OCR, usage tracking, permanent-URI management

Utility Proxy Services - Passthrough (wrapper) services to utility function deployed & maintained by others

Bamboo Services Deployment Stack - Language, Framework, Libraries. E.g., service container, service implementation libraries, authentication framework, logging framework, message mediation, orchestration, rules engine. The Progress FUSE stack (composed from Apache-centric SOA infrastructure distributions) is a candidate set of technologies that is expected to play a prominent role at this level of the Bamboo Services Platform.

Infrastructure - Hardware/VM, Operating System, Storage Platforms (ECM, relational database, RDF store), SAN, network
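The relationship between the Task layer and the lower layers described above can be sketched in a few lines. This is an illustration under stated assumptions, not Bamboo's implementation: the two analyzer functions are hypothetical stand-ins for core or utility image-analysis services, and a local thread pool stands in for the interface layer that might fork requests across multiple Bamboo Services Platform instances.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_dimensions(image):
    """Hypothetical utility service: report pixel dimensions."""
    return {"width": image["width"], "height": image["height"]}


def extract_format(image):
    """Hypothetical utility service: derive the file format from the file name."""
    return {"format": image["name"].rsplit(".", 1)[-1].lower()}


def image_analysis_task(image, analyzers):
    """Task service: run each lower-level analyzer in parallel and
    return the union of their results as a single metadata dict."""
    merged = {}
    with ThreadPoolExecutor(max_workers=len(analyzers)) as pool:
        # pool.map preserves analyzer order, so later analyzers win on key clashes.
        for partial in pool.map(lambda analyze: analyze(image), analyzers):
            merged.update(partial)
    return merged


if __name__ == "__main__":
    scan = {"name": "folio_12r.TIFF", "width": 800, "height": 600}
    print(image_analysis_task(scan, [extract_dimensions, extract_format]))
```

The consumer calls one 'finished' composition and receives one merged result, rather than calling each lower-level service and reconciling the outputs itself.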

2.3.2.2 Platform Technology

The stack of infrastructural technologies on which Bamboo services will run is composed of Java-based service-oriented infrastructure technology that supports RESTful and WS-* based services; is developed and maintained by an active, open-source community; and is packaged and tested for release by a well-respected vendor of (optional) enterprise infrastructure support.

Primary Audience: Information Technology Architects, Developers

Secondary Audience: CIOs, Agencies, Information Technology Managers

Bamboo intends to adopt the following technology selections as core elements of its Services Platform stack:

Bamboo anticipates that Eclipse will be the integrated development environment (IDE) of choice, due to its deep support and its breadth of pluggable tooling, including language support beyond Java (e.g., PHP, Rails). Eclipse is built on the OSGi framework specification. While the FUSE Integration Designer currently requires a support subscription to use, one advantage of utilizing Eclipse is the opportunity to leverage Progress FUSE tooling for the technology stack described above.

Benefits

  • The Apache community (on which the FUSE stack is founded) is an active, vibrant open-source community on which many have successfully relied.
  • The FUSE packaging of Apache SOA projects is "productized" as stable releases to which extensive quality assurance tests have been applied to better guarantee proper interoperability of service delivery infrastructure under conditions and deployment scenarios relevant to Bamboo's needs.
  • The FUSE team "employs many of the Apache committers, including [...] the project chairs." Support contracts available from the FUSE team through Progress Software offer a layer of enterprise support to deployers of the Bamboo Services Platform who require it, melding the vendor-independence advantages of open-source solutions with levels of assurance required by Information Technology organizations charged with maintaining high-availability services for the institutions they serve.
  • The FUSE stack supports both RESTful and WS-* compliant web services.
  • The Eclipse IDE is characterized by deep support, broad uptake, and breadth of pluggable tooling, including language support beyond Java (e.g., PHP, Rails).

Costs/Risks

  • As of the date of this document's publication, there has been little discussion among prospective Bamboo Partners and Members of technology stack options and choices.
  • While RESTful and WS-* style web services are supported by the FUSE platform, some who are inclined toward a pure-REST model of web service delivery strongly prefer a LAMP stack (Linux + Apache httpd + MySQL + PHP).
  • ServiceMix 4.x and FUSE ESB 4.x are still in the "incubation" stage of their product lifecycle as of the date of this document's publication. Their selection as OSGi-supporting containers is predicated on the expectation that they will be production ready by the time Bamboo is prepared to deploy in a production mode.

2.3.3 Delivering Bamboo Services

2.3.3.1 Platform Deployment

The Bamboo Services Platform and the services available on it will initially be made available to service consumers through centrally-hosted service providers, and be packaged for distributed developer-oriented deployments in Year One of Bamboo Implementation; then in subsequent implementation cycles will be packaged for simplified, automated production deployments at local institutions.

Primary Audience: Information Technology Architects, System Administrators, Developers, Information Technology Managers

Secondary Audience: CIOs, Agencies

To prioritize service delivery over deployment complexity in Year One, Bamboo proposes to initially deploy Integration, Quality Assurance (QA), and Production instances of its Services Platform centrally, in one or a small number of data centers. This will permit focus on stabilizing the platform itself and enriching the set of services delivered on it before turning to questions of supporting platform instances hosted in a widely-distributed model, including synchronization of data across hosting organizations that may wish to restrict certain data to no or limited synchronization. In subsequent implementation cycles, Bamboo will package its Services Platform for distributed (local) hosting; and will include synchronization services that permit identification of target synchronization hosts and appropriately granular selection of data to be synchronized.
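As a rough illustration of the "appropriately granular selection of data to be synchronized" mentioned above, the sketch below filters records by a per-record policy before they are sent to a target host. The three policy levels ("all", "limited", "none"), the record shape, and the host names are assumptions made for this example; the proposal does not define them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexRecord:
    record_id: str
    collection: str
    sync_policy: str  # hypothetical levels: "all", "limited", or "none"


def select_for_sync(records, target_host, trusted_hosts):
    """Return the records that may be synchronized to target_host.

    "all"     -> synchronized to any host
    "limited" -> synchronized only to hosts the owner trusts
    "none"    -> never synchronized (also the fallback for unknown policies)
    """
    selected = []
    for record in records:
        if record.sync_policy == "all":
            selected.append(record)
        elif record.sync_policy == "limited" and target_host in trusted_hosts:
            selected.append(record)
    return selected


if __name__ == "__main__":
    records = [
        IndexRecord("r1", "letters", "all"),
        IndexRecord("r2", "contracts", "limited"),
        IndexRecord("r3", "drafts", "none"),
    ]
    trusted = {"host-a.example.edu"}
    for host in ("host-a.example.edu", "host-b.example.edu"):
        print(host, [r.record_id for r in select_for_sync(records, host, trusted)])
```

Treating unrecognized policies as "none" is a deliberate choice in this sketch: a hosting organization's restricted data stays local unless synchronization is explicitly permitted.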

Initial investigation of benefits, issues, problems, and risks associated with distributed deployment and synchronization across Bamboo Services Platform instances will be undertaken as a natural corollary to the "Software Development Kit..." Platform Deliverable described above. The development ecosystem for technologists implementing Bamboo services includes Developer-layer Bamboo Services Platform instances on distributed developer desktops and/or developer-owned virtual machines. These instances will serve as initial venues for investigating value and cost of, as well as developing strategies for, distributed hosting at the Integration, QA, and Production layers.

Because Bamboo's effort and products will be developed as community-source software and packaging of open-source infrastructure, those institutions who wish to pioneer distributed deployments of the Platform in a (largely) self-supported mode at the Integration, QA, and Production layers will be able to do so.

3. Adoption

Central to the vision of Bamboo is enabling scholarly innovation through the use of shared technologies and technology services. To achieve this, Bamboo must provide multiple avenues to create, use, and share services, resources, tools, content, and data all within a framework that preserves and encourages academic diversity balanced against technical interoperability and sustainability. Therefore, the adoption strategy that underpins the "Bamboo Program: Phase I" and ultimately this initial Bamboo Technology Proposal is one that is multi-faceted with elements focused on institutional adoption, individual projects, and communities of practice.

3.1 Technology Adoption

Although we envision a number of ways to incorporate the services and resources of Bamboo in arts and humanities research and pedagogy, initial adoption of Bamboo will be aligned around the three classes of deliverables described in section 2.

In each of the three categories, there will be metrics associated with adoption; minimal targets will be determined through discussions with the Bamboo Partners and Members, and will be reflected in the final draft of the Bamboo Technology Proposal.

3.1.1 Product Deliverables

The adoption of Enabling Technologies to Support Communities of Practice will be measured by:

  • The number of institutions, individuals, and disciplinary societies that have (minimally) initiated the use of Bamboo's enabling technologies to build an on-line community of practice; and/or who have contributed to the Bamboo Commons.
  • The number of institutions and/or disciplinary societies that have (minimally) initiated the use of Bamboo's enabling technologies to integrate functionality into a local tool, environment, platform, or website.
  • Momentum, determined by trackback mechanisms, in the growth of citations from formal publications, drafts, discussions, on-line collaborations, and other web site content to content published by communities of practice and/or the Bamboo Commons.

The adoption of Humanities Corpora and Curation Workspace will be measured by:

  • The number of small collections that have (minimally) initiated migration to management using a Humanities Corpora and Curation Workspace.
  • The number of institutions that have committed to providing some form of preferential support to collection-owners for corpora management and curation using a Humanities Corpora and Curation Workspace.
  • The number of endorsements of Bamboo's Humanities Corpora and Curation Workspace from disciplinary and other scholarly organizations to their membership.
  • Momentum, determined by trackback mechanisms, in the growth of citations from formal publications, drafts, discussions, on-line collaborations, and other web site content to content managed using Bamboo's Humanities Corpora and Curation Workspace.

3.1.2 Core Services Deliverables

The adoption of Core Service Deliverables will be measured by:

  • The number of individuals who have established a Scholarly Profile using Bamboo's Scholarly Profile services.
  • The reach of integration services deployed to harvest profile data from previously existing sources, such as faculty web pages, on-line CVs, Zotero bibliographies, and institutional faculty biographical tracking systems (i.e., the number or percentage of faculty whose previously published profile information can be harvested by Bamboo integration services).
  • The reach of integration services deployed to harvest metadata from collections managed on platforms that conform to metadata harvesting standards supported by Bamboo's integration services (e.g., the number, size, and significance of collections and collection objects whose metadata can be harvested by Bamboo integration services).
  • The extent of collections whose metadata and/or digitized content is made available to Bamboo services that aggregate index data to enable deep search.

3.1.3 Platform Deliverables

The adoption of Platform Deliverables will be measured by:

  • The number of projects and/or organizations that have established an intention or plan to integrate Bamboo's building block services into future development projects.
  • The number of citations or endorsements of the Bamboo Services Lifecycle by institutions and organizations establishing local guidelines for service development.
  • The number of citations or endorsements of Bamboo's software development kit (SDK) by institutions and organizations establishing development tools and environments for service development.
  • The number of institutions and organizations that have begun to or have established plans to locally deploy the Bamboo Services Platform.

3.2 Program Adoption

To sustain the efforts of Bamboo, it is essential that the community of adoption extends beyond those institutions and organizations that participate in the Bamboo Technology Proposal. More importantly, the community that forms around Bamboo cannot merely focus on technology; adoption must be linked to research, scholarship, and pedagogy. Therefore, the adoption of the Bamboo Program will be measured by:

  • Growth in the number of institutions and organizations actively seeking to become, or having become, Bamboo Partners or Members and contribute to the evolution of Bamboo.
  • Appearance by the end of the first three-year phase of development of cited Bamboo resources, tools, services, and/or capabilities in scholarly publications and activities.
  • The number of endorsements of Bamboo's goals, strategies, and/or delivered technology by individual scholars, projects, institutions, and organizations.

1 Comment

  1. Unknown User (jim.muehlenberg@doit.wisc.edu)

    Hello - in general, I must admit I found this document a bit difficult to parse and digest.  The style seems rather dense and carefully-worded, and I found I needed to reread sections several times to get the drift of the material.  While I'm not a geek, I am an information technology manager, so if I'm having some trouble digesting this, I am concerned that others (faculty, for example) will find this daunting to take on.  But perhaps the audience is mostly geek types in which case that's OK.

    More of a concern than the style or readability is what I consider to be the somewhat abrupt departure from what has come before.  I perceive that our participative, community-based planning process has generally led in fairly clear steps to where we arrived, together, in the latest versions of the Bamboo Program Document and the Bamboo Implementation Proposal.  New drafts built on prior work in a sequence that the community could follow.  However, it seems this document is rather discontinuous, in parts, from what has come before, and it was not clear to me how we got here from there.  It appears to me that some elements of this technology proposal were inserted here without apparent grounding in the community planning process which came before, and that is a concern.

    My other general comment is that, taken in isolation, this document only highlights parts of the larger near-term (say 1-3 year) Bamboo program, obviously the technology parts.  While that is the purpose of this document, it seems the Bamboo community would still require and benefit from the larger context of the Bamboo Program to judge this document; until that work resumes, it seems at best we can give tentative assessment and commitment to the present document.

    A few specific comments follow.

    1. Introduction, 3rd paragraph - it is not clear from the wording here whether a Bamboo Commons would actually be a deliverable.

    2.1.1 Enabling Technologies to Support Communities of Practice - same question, is there to be a Bamboo Commons deliverable, or just the underlying technologies that could lead to a Commons?

    Benefits box (here and other places) - does this mean benefits of Year One only?  Or benefits of Year One plus Further Development?  (Same questions apply to Costs / Risks boxes.)

    2.1.2 Humanities Corpora and Curation Workspace - where did this come from in our prior planning process?  I don't question the idea per se, but don't see the prior work that led to this in the proposal.  Also, what is the relationship of this work (and CollectionSpace) to something like Omeka?

    2.2.1 Scholarly Profiles - what is BioSphere?

    2.2.2 Tools and Content Connectors - is this a different direction than past thinking about Tools and Content Partnerships?  Maybe it is just a more technically focused description, but it seemed a bit of a different direction than past planning.

    2.2.3 Deep Search and Collections Interoperability - similar question to 2.2.2 above.