Notes regarding conversations with members of the BNHM community following the release of the initial results.
Essig: Gordon Nishida
October 26, 2009
- Gordon has looked at the scorecard and materials and notes: "The scorecard seems biased towards the data operations and does not appear to represent the data applications or functions as well."
- He did not see any mention of some of the data management tasks that he is concerned with: managing controlled vocabularies, cleaning up data, extracting data, and so on. (Clarify definitions in several places to include these tasks.)
- Understands why CollectionSpace was scored highly but wants to see how the new system will work for Natural History collections. He sees CollectionSpace as presented now as primarily a cultural heritage solution.
- Seeing real biological collections data in CollectionSpace will help.
- Understands the goal of a single, customizable platform, but is concerned that each museum will need significant and continuing customization and maintenance.
- Gordon wonders why the CollectionSpace developers decided to focus on a single collection rather than selecting a function that is common to all collections, such as loans or accessions.
UCMP: Pat Holroyd, Diane Erwin, Ken Finger
October 26, 2009
- Pat, Diane, and Ken have looked at the scorecard and materials. Pat in particular went through the scorecard in some detail, summarizing the data, looking at calculations and so on!
- While they understand that the business criteria and technology criteria are important, they believe that they have no control over these aspects and not much to contribute. Because the functional criteria are worth only 40% of the score, they feel their concerns carry less weight. Ultimately they think this will be more of an administrative decision (What can we afford? What will we get for what we can afford?), but they appreciate the opportunity to provide feedback. We talked about some of the overarching principles: that core functionality has to be in place and that migrations to a new platform will take time (and will start with museums that are in more pain and are ready to get to work sooner).
- Funding and staff time are a major concern, and they expect this to get significantly worse.
- We talked about CollectionSpace and the webinars. They would like to see real data in CollectionSpace. They would be willing to watch a webinar, though their machines do not support the applet. We could hold a popcorn webinar (either at the time of the 3rd webinar or after the recording is available on the web site).
- UCMP, like other museums, has a wide range of other materials that could (and should) be pulled together in a CollectionSpace-like environment: not only field notes, but films, photographs, and other ephemera from the history of UCMP. These relate back to specimens and research projects.
- Across the University of California system, there are other paleontology collections. Have we talked to UCLA or Riverside?
- Across campus, there are many other collections besides the official museum ones. There are teaching collections, and many faculty have their own research collections. There is a paleobotanical teaching collection that is used at multiple UC campuses. One of the scientists working on the Moorea project has brought back many, many fish specimens stored in liquid and keeps those in his office because there is no other place to store them. Another professor has a large and important collection of casts. These collections should be a significant opportunity both to make the system more attractive and to protect more collections. A new collection management system should be used by anyone with collections -- individual faculty, research groups, and whole museums. Great idea!
Herbaria: Dick Moe
October 27, 2009
- Dick has looked at the scorecard and materials.
- We talked through the process and implications. He understands that the business criteria and technology criteria need to be weighted very high. You can deploy a great system, but if it will not be supported in a year, then you have a big problem.
- Specify is a known solution right now and is deployable.
- Talked about the assumptions behind the CollectionSpace scoring (see Initial Scoring Notes).
- Data management tasks are very important. Dick says: "I spend time extracting data from SMaSCH for several online applications. SMaSCH, after all, predates the Web. All of the things that we have on the Web that use SMaSCH data require an extract. The CCH (Consortium of California Herbaria) extract isn't special. Very likely, however, the CCH data will still require some sort of an extract in the future, since the CCH is an aggregation of disparate databases forced into a common data format. I spend about 5 minutes per week running the CCH extraction script and transferring the tarred result and probably less time cumulatively on other extracts. I worry less about expense of time than about long-term sustainability, flexibility, and improvement."
- We talked a lot about the deployments of a new system. How much museum staff time would be needed for a deployment? How can you ensure that data migration does not introduce errors into a new system?
- The Herbaria have not tried to use their current system, SMaSCH, in unintended ways, so hopefully the data are pretty clean. However, Dick has done some bulk-loading and knows that, as a consequence, some of the XDB-based processes did not fire. As a result, there are some unexpected values in some fields. This is the drawback to bulk-loading vs. passing data record-by-record through an API or service layer.
- Dick asked if CollectionSpace is primarily focused on data entry, data management, or data extraction and reporting. Chris responded that right now the main emphasis is on data entry (of transactions) though there is increasing effort right now around some data management tasks (controlled vocabularies, user/group/permissions management). Regarding data extraction and reporting, the goal is to integrate an open source tool (like BIRT), but we do need to start pushing that agenda with the CollectionSpace project team.
- In general, research system integration is an important topic and also needs to be emphasized to the CollectionSpace project team.
IST: Joyce Gross
October 28, 2009
- Joyce has looked at the scorecard and materials and has talked with Chris at several standing meetings.
- Joyce attended the first CollectionSpace webinar. She would have liked to see a demo of the system and was disappointed that it was all PowerPoint slides.
- Joyce asked about sustainability. How many developers are working on CollectionSpace currently, and how many have funding beyond June 2010? What is the likelihood that future funding will be obtained?
- Joyce would like to see what the customization looks like and what the configuration files look like.
- Joyce wants to see CollectionSpace with natural science data.
- For the PAHMA experiments, she asked if migrated data can be seen yet through a user interface.
- For the PAHMA data that have been loaded but are not visible through a user interface, can anyone make a REST-based service call to see the data?
- Joyce: My question about future funding is not only about sustainability but about funding to deploy CollectionSpace in all the natural history museums, since it doesn't look as if the UCB natural history museums will be moved to CollectionSpace by June 2010 (except maybe Hearst, which is quite different from the other natural history museums). Deployment is more time-consuming than maintaining, especially if it's a completely new system -- it's a time when bugs and basic overlooked feature requirements will be discovered -- i.e., bugs that can make a difference on whether the system is usable or unusable. This is why it would be really useful if museums could look at examples of working databases right now -- i.e., if museum scientists could look at real data in CollectionSpace, and if technical people could look at real configuration files, modify some configuration files, look at how configuration file changes affect the UI, etc. This would be really helpful for the evaluation process.
Botanical Garden: Holly Forbes
October 28, 2009
- Holly and Paul Licht have looked at the scorecard and materials.
- We discussed the scorecard and process, noting that none of the three systems being evaluated is currently in use for a botanical garden, i.e., for living collections.
- Arctos and Specify were given a 0.25 for the "propagation" functional criterion because both have claimed they could create a partial solution for botanical gardens without major reengineering.
- Propagated plants as versions of existing specimens: There are some similarities to creation of derivatives or parts from existing objects. Lam is starting up the CollectionSpace/SAGE design analysis soon so we will have an opportunity to look at this more closely.
- Terminology: "Conservation" in botany refers to efforts to save or conserve species, as well as conserve living specimens.
- Data sharing is important for the Botanical Garden.
- Reporting is an important concern for the Botanical Garden. Currently they use the IST-managed Business Objects service. This needs to be more clear in the CollectionSpace project plan.
- Holly also asked about trigger actions that are important in their current system. Can these be customized per institution? Example: One accessioned plant might become multiple planted specimens (via propagation) over time. Though each individual planted specimen has a unique ID in the system, Bot Garden staff rely on the combination of Accession ID and Garden Location to create a logical unique key. Currently, when individuals (Accession ID/Garden Location) die out, they are marked dead. A trigger in the database then looks to see if there are still living specimens of that accession elsewhere in the Garden. If all other specimens derived from that accession are dead, then the accession record itself is also marked dead. The propagation module has many triggers too. (We need to create some use cases on this and talk with the CollectionSpace project team.)
- SAGE has an accession history table which acts like an audit table. They use it to see where a planted accession used to be located and to investigate data problems.
- How do Garden staff check data while they are working in the garden? On a very occasional basis, Holly prints out a large (800-page) report of all specimens and locations. There have been discussions in the past about some sort of "field computer" that hort staff could use, with curatorial staff then uploading, quality checking, and entering the data into the record system.
- The Botanical Garden has thought about education-based systems that would run on PDAs for tours and such. Wireless and cell phone coverage can be very limited, however.
- Making a decision about the platform for BNHM-IST collections will be a very good thing!
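The accession trigger Holly describes above could be captured as a use case along the following lines. This is only a minimal sketch in Python, not SAGE's actual trigger code: the class names, field names, accession number, and garden locations are all hypothetical, and the real logic lives in database triggers rather than application code.

```python
from dataclasses import dataclass, field


@dataclass
class PlantedSpecimen:
    # Logical key used by Garden staff: Accession ID + Garden Location.
    accession_id: str
    garden_location: str
    alive: bool = True


@dataclass
class Accession:
    accession_id: str
    alive: bool = True
    # Planted specimens keyed by garden location (hypothetical structure).
    specimens: dict = field(default_factory=dict)

    def plant(self, garden_location):
        """Record a new planted specimen derived from this accession."""
        specimen = PlantedSpecimen(self.accession_id, garden_location)
        self.specimens[garden_location] = specimen
        return specimen

    def mark_dead(self, garden_location):
        """Mark one planted specimen dead, then apply the trigger rule:
        if no specimen of this accession is still alive anywhere,
        the accession record itself is marked dead."""
        self.specimens[garden_location].alive = False
        if not any(s.alive for s in self.specimens.values()):
            self.alive = False


# Hypothetical accession with two planted specimens.
acc = Accession("2009.0123")
acc.plant("Bed 12")
acc.plant("Bed 40")

acc.mark_dead("Bed 12")
alive_after_first = acc.alive   # one specimen still living

acc.mark_dead("Bed 40")
alive_after_second = acc.alive  # no living specimens remain
```

A per-institution rule like this is the kind of behavior that would need to be configurable if it is to move out of database triggers and into a shared platform.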
IST: John Deck
November 2, 2009
- John sees the collections evaluation as being weighted towards CollectionSpace. It's hard to evaluate a system that is not in use yet. It would be good to see a demo with real data.
- BNHM systems rely on ongoing tweaking and research integration. Future systems need to support this.
- Platform selection also needs to take into account the skill set of the existing team of research programmers and the legacy codebase so that they can more easily extend the system. Scripting languages (e.g., Perl and PHP) are still powerful tools for adding new capabilities.
- We do need to coalesce systems within BNHM. (We talked about the mandate from campus to provide solutions beyond the BNHM consortium.)
UC and Jepson Herbaria: Andrew Doran
November 3, 2009
- Discussed the process and scorecard.
- Andrew is following Specify announcements and development. Version 6.1 was just released and has useful enhancements (e.g., IPT for web access to collections data).
- Andrew is exploring Herbaria@Home for the Charterhouse School Herbarium collection. It looks like a useful and user-friendly community cataloguing site with a simple data model. A similar system could be used to remotely catalog remaining California Herbaria from images, with the system maintained by the Herbaria and data entry done by volunteers (http://herbariaunited.org/atHome/).
- UCJeps needs to replace current SMaSCH system as soon as possible. It is creating risks for them especially related to supporting research projects. Technology is very limited (e.g., not web-based).
- Andrew is concerned about funding and support interruption at Specify. However, he is also concerned that when applying for NSF funds, NSF reviewers will favor Specify deployments.
- Replacement system should really be web-based. Specify's IPT module is only a partial solution.
- Integration and interoperability are also very important for the Herbaria. Right now they are working on an archives system and have submitted a grant proposal to implement an off-the-shelf library system. Ideally their collections, archives, and library would be integrated, but if they have to stand on their own for a while, that is acceptable.
- CollectionSpace looks promising from an integration perspective, but when will this be available?
- In addition to the archives and library projects, UCJeps is actively pursuing grants and other projects that push the limits of collections management: a seaweed collection (e.g., depth of collection in different units with automatic conversion); locality data sharing with other herbaria; historical collections with old locations in the UK; the ability to maintain separate data sets so as not to contaminate new projects with misleading legacy data, along with simple batch processing tools to clean up legacy data; inferred collectors (based on archived correspondence); a type specimen project with other herbaria; and so on. A mistletoe collection coming in soon is very dependent on permits for transport.
- Business rules and triggers. E.g., freezing of mistletoe collection needs tracking and could rely on batch update of condition for multiple barcoded specimens. Will CollectionSpace facilitate this kind of operation? Right now, Dick Moe figures out the SQL to do the update.
- Loans: Would be useful to be able to facilitate contacting loanees who have type specimens beyond the end date of the loan.
- Sheets and books and flexible fields: Herbaria collections stored on sheets in bound books, sometimes multiple specimens per page. Also, one special set of specimens might exist on a subset of pages in one book. Andrew would like to be able to add a field as needed to track these things. It is too hard to go back after the fact to pull out these kinds of data.
- Timing is an issue. UCJeps will hear back about the seaweed grant in February or March and wants to start entering metadata as soon as possible.
- Andrew is happy to help with data issues, data migration, testing and so on. He wants to make sure that migration to a new platform is managed well to help ensure a smooth transition in the Herbaria.
- Andrew is not an opponent or proponent of any one system. Functionality is important, but so is timeframe.
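To make the batch update Andrew asks about more concrete (recording a freezing event against many barcoded specimens at once), here is a minimal sketch using Python's built-in sqlite3 module. It is not the SQL Dick Moe writes against SMaSCH: the `specimen` table, its columns, the barcodes, and the `batch_update_condition` helper are all assumptions made up for illustration.

```python
import sqlite3


def batch_update_condition(conn, barcodes, condition, event_date):
    """Set the same condition and date on every specimen in `barcodes`.

    Runs in a single transaction, so either all rows are updated or none.
    Returns the number of rows changed.
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.executemany(
            "UPDATE specimen SET condition = ?, condition_date = ? "
            "WHERE barcode = ?",
            [(condition, event_date, barcode) for barcode in barcodes],
        )
    return cur.rowcount


# Hypothetical in-memory table standing in for a specimen database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE specimen ("
    "barcode TEXT PRIMARY KEY, condition TEXT, condition_date TEXT)"
)
conn.executemany(
    "INSERT INTO specimen (barcode) VALUES (?)",
    [("UC0001",), ("UC0002",), ("UC0003",)],
)

# Mark two scanned barcodes as frozen in one operation.
changed = batch_update_condition(
    conn, ["UC0001", "UC0003"], "frozen", "2009-11-03"
)
```

In practice the barcode list would come from a scanner or a saved query rather than being typed in, and the open question for CollectionSpace is whether such an operation can be exposed to museum staff without hand-written SQL.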
MVZ: Carla Cicero
November 3, 2009
- Carla and Michelle Koo looked closely at the scorecard and sent back some comments and questions. Chris and Carla talked through most of those, and Chris will update the scorecard accordingly. Some of the requests were for clarifications on scoring logic. Also, Carla provided helpful information on Arctos.
- Disruption of funding for Specify is a major concern.
- Generally, Carla believes we should distinguish between functionality that exists now, functionality that is under development and has a timeline, and functionality that is under consideration. For CollectionSpace, we should make clear what is done now and what is not. She also wants to see CollectionSpace with real data.
- Points raised in the scorecard discussions will be addressed in the next version and will focus on things like barcode/RFID, event-based cataloging, risk management, geographic coordinates and mapping, observations, genetic data, LIMS integration, bibliographies, controlled vocabularies, taxonomies and taxonomic identification, data export, field notebooks, temporal and spatial data management, date formats, and community cataloging. Scoring logic needs to be clearer.
- For systems like these, one really needs to be able to test with real data.
UCMP: Mark Goodwin
November 4, 2009
- Mark wants to know what the museum scientists think about the collections evaluation. We talked about the discussion with Pat, Diane, and Ken from UCMP.
- Loans are an important issue for UCMP collections. We talked about the current plan for loans.
- Regarding databases and technology, Mark knows that these systems are always a work in progress. Despite the many moving targets, you need to be able to protect and share the data in the systems.
- Functionality is most important to Mark, but he understands that the technology and business criteria are important too. The interruption of funding for Specify is a major strike against that system, taking it off the table from Mark's perspective.
- We talked about what happens once a decision is made about a platform, that initial deployments will focus on museums most in need of a replacement system who are also able to work with technical partners on deployments. Museums such as UCMP that are enjoying the benefits of a relatively modern system will probably not be the focus of early deployments. Similarly, MVZ is invested in Arctos and we don't expect them to migrate off of that platform any time soon.
- We also talked about the fact that we have a mandate from campus funders to work on solutions that provide benefit beyond the BNHM Consortium.
- Accessions: UCMP's current system does not really track accessions, which Mark defines as the initial intake of (often or always?) a group of specimens. For example, when he returns from a field season with a collection of specimens, they are collectively given an accession number and name (e.g., "Mark Goodwin's 2008 field season in Montana"). That information is currently entered and managed in a separate system (maybe Excel, maybe dBase). This digital file is somewhere on UCMP servers but is not easy to find. The accession number does get entered into the specimen database when the specimen information is entered.
- CollectionSpace: What is the funding picture after June 2010?
- UCMP is getting a new director in January. We will have to see what his priorities are.