Joyce, Michael, Susan and Chris met to discuss Essig's requirements and CollectionSpace.
Chris stepped through the notes he wrote describing functionality in the current Essig system.
Customization of schema: Susan and Michael will be working on this for the version 0.3 release, using a NAGPRA code field supported by a controlled vocabulary. They aren't sure yet what this will look like. This is work at the service layer and the database layer; work on the application layer and UI layer around schema customization has not started.
Validation: It is very important that the Essig system have field-level validation. One example is data entry of latitude and longitude values. The current Essig system checks that values are in a required range and in one of several accepted formats (e.g., degrees-minutes-seconds). It will automatically generate the decimal values if a correct format is entered. Will CollectionSpace have this kind of validation and conversion? When? Essig needs a web system that enforces such rules so the person doing data entry cannot enter an unacceptable value.
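For discussion, a minimal sketch of the kind of validation and conversion described above. The input format shown (space-separated degrees, minutes, seconds, and a hemisphere letter) and the function name are illustrative assumptions; the actual formats the current Essig system accepts may differ.

```python
import re

# Illustrative pattern for degrees-minutes-seconds input, e.g. "122 30 15.5 W".
# The real Essig system accepts several formats; this shows only one.
_DMS = re.compile(
    r"^\s*(\d{1,3})\s+(\d{1,2})\s+(\d{1,2}(?:\.\d+)?)\s*([NSEW])\s*$",
    re.IGNORECASE,
)

def dms_to_decimal(text):
    """Validate a DMS string and convert it to signed decimal degrees.

    Raises ValueError if the format or range is unacceptable, so a bad
    value never reaches the database.
    """
    m = _DMS.match(text)
    if m is None:
        raise ValueError("not in degrees-minutes-seconds format: %r" % text)
    deg, minutes, seconds = int(m.group(1)), int(m.group(2)), float(m.group(3))
    hemi = m.group(4).upper()
    if minutes >= 60 or seconds >= 60:
        raise ValueError("minutes and seconds must be below 60")
    # Latitude is limited to 90 degrees, longitude to 180.
    limit = 90 if hemi in "NS" else 180
    decimal = deg + minutes / 60.0 + seconds / 3600.0
    if decimal > limit:
        raise ValueError("value out of range for hemisphere %s" % hemi)
    # South and West are negative by convention.
    return -decimal if hemi in "SW" else decimal
```

The same check-then-convert pattern would apply to any field with both a format rule and a range rule.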
System management: Who will manage and support the various pieces of the application? Will one group be able to manage the system?
Will the system perform well and be reliable? When will we have some evidence of this?
Data migration: You can view data in MySQL tools, but it isn't very pretty. Nuxeo is making tables from schema definitions. You can make REST calls directly to the system from a browser. Controlled vocabulary tables might be handled differently from other Nuxeo-managed tables. We cannot see the migrated PAHMA data right now because the project team is in the middle of changing the data load to match the new schema. Next week, Richard and Aron will be doing some work to prepare for the new PAHMA data (including the NAGPRA code custom field). Then Susan will be able to load data. We are not yet sure when that will be visible to others.
The first data load was very limited, consisting of 5 fields for 10 object records.
The second data load consisted of 50 accessions and 100 objects tied to those accessions.
The next data load will include object records and accessions (number of fields and number of records to be determined) including multi-value attributes (for attributes that were treated as single values in the earlier loads) and controlled vocabularies (simple, flat list in this load).
We do need to test at scale. Data loads are fast now, but how will they perform when we have 100K records?
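One way to get early evidence on the scale question is a load-timing harness with synthetic records. The sketch below is an assumption-laden stand-in: it uses an in-memory sqlite3 table rather than the real Nuxeo/MySQL backend, and the table and field names are invented for illustration; only the measurement approach carries over.

```python
import sqlite3
import time

def timed_load(n_records):
    """Insert n synthetic object records into a throwaway table and
    return the elapsed seconds.

    sqlite3 in memory stands in for the real backend here; the point
    is the harness (generate records, time the bulk load), not the
    database. Table and column names are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE objects (object_number TEXT, name TEXT)")
    # Generate records lazily so record creation cost stays small.
    rows = (("OBJ-%06d" % i, "specimen %d" % i) for i in range(n_records))
    start = time.time()
    conn.executemany("INSERT INTO objects VALUES (?, ?)", rows)
    conn.commit()
    elapsed = time.time() - start
    conn.close()
    return elapsed

# Compare a small load against the 100K target to see how time scales.
small = timed_load(1000)
large = timed_load(100000)
```

Running the same harness against the actual load path at 1K, 10K, and 100K records would show whether load time grows linearly or worse.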