How APIs fit into landscape of research computing on campus
GeoLunch meets every Thursday at 1 PM; features many web-based mapping applications

Global change biology

Michelle Koo and Kevin Koy present the Berkeley Initiative for Global Change Biology (BIGCB):

Natural history museums + field stations associated with Berkeley campus
Bringing together biodiversity resources
Species, specimen occurrences, other biodiversity information
Lots of orphaned collections, plus well-curated active research collections
Varying amounts of digitization-- part of funding goes towards digitization, data cleaning, etc.
Calbug -- Essig Museum Collections
Less than 1% initially digitized
Turned to citizen science/crowdsourcing -- digitized images of all specimens; labels were pinned in a stack beneath each insect (each pin and label had to be carefully removed for imaging)
Associated with Zooniverse project
Some info is controlled vocabulary, some is free-form (read the labels)
Each label is transcribed multiple times; reconciling those transcriptions will be an interesting post-processing task
Field station records-- measurements + climate data (temp, precip, snow, CO2)
Early century soil samples -- orphaned collections
Data entry by paid and undergrad apprentices
Linked w/ GIS layers (vector and raster) to enable comprehensive ways of tackling questions about global change
Berkeley Initiative for Global Change Biology
Ancient climate modeling; downscaling to relevant biological scales and useful GIS layers
Reunite parts of Wieslander's Vegetation Survey (hand-annotated vegetation maps)
Digitization, turn into GIS layers
Photos w/ associated maps-- spent last year having students add the metadata
All this data goes into Event Database, Baselayer Database; these connect to data engine, which connects to data API
Internally and (in the future) externally developed apps
Also expert analysis tools + R statistics (rOpenSci); API built on the Django REST framework (GET requests return GeoJSON, XML, or a map view)
Species checklists for parks, sensor database w/ temperature and precipitation records, 3 million geolocated specimen records
Taxonomy tree
Before/after shots (re-shooting the same old pictures)
Running on Heroku's free tier (can be slow)
Want to show off external applications
Want to make code of examples directly accessible (JavaScript libraries, other scripts, ArcGIS applications, etc.)
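The notes above describe a Django REST framework API whose GET endpoints can return specimen records as GeoJSON. As a minimal sketch of what a client-side script might do with such a response (the payload and field names here are illustrative assumptions, not the actual BIGCB schema):

```python
import json

# Hypothetical example of the kind of GeoJSON FeatureCollection a
# specimen-record endpoint might return. Field names are illustrative,
# not the actual BIGCB API schema.
response_body = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [-122.26, 37.87]},
      "properties": {"scientific_name": "Apodemia mormo", "year": 1934}
    }
  ]
}
"""

def extract_occurrences(geojson_text):
    """Return (name, lon, lat, year) tuples from a GeoJSON FeatureCollection."""
    collection = json.loads(geojson_text)
    records = []
    for feature in collection["features"]:
        lon, lat = feature["geometry"]["coordinates"]
        props = feature["properties"]
        records.append((props["scientific_name"], lon, lat, props["year"]))
    return records

records = extract_occurrences(response_body)
print(records)  # [('Apodemia mormo', -122.26, 37.87, 1934)]
```

In a real client, `response_body` would come from an HTTP GET against the data API rather than a literal string; GeoJSON's fixed structure (coordinates in [lon, lat] order) is what makes such records directly consumable by JavaScript mapping libraries and GIS tools.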

Cool Climate Network

Chris Jones - energy research group
Target reduction for greenhouse gases
Early action items for meeting targets
Small businesses, households -- get info to individuals to change behavior
Challenging to find other labs working on similar things
APIs provide an opportunity to build on others' work
Calculators for households, businesses, local governments, schools
Put in minimal info, start exploring data
Indirect emissions are larger than the direct emissions people usually think about (electricity, cars, etc.)
Berkeley's carbon footprint is lower than average; electricity is ~5%, much less than a typical household (no cooling needed)
13 tons/student just for education (campus emissions)
Footprint maps track population density: green (lower-footprint) city centers, red (higher-footprint) suburbs
Separate maps for every individual layer (e.g. electricity)
340 input/output variables; can change defaults
800 separate utilities -- how much coal, natural gas, etc.
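The "change defaults" idea above can be sketched simply: the calculator starts from default values for its input variables and overlays whatever the user supplies. This is a generic illustration; the variable names and default values are made up, not CoolClimate's actual 340 variables:

```python
# Sketch of starting from model defaults and overlaying user-supplied
# values. Variable names and defaults are hypothetical placeholders.
DEFAULTS = {
    "household_size": 2.5,
    "vehicle_miles_per_year": 12000,
    "electricity_kwh_per_month": 600,
}

def build_inputs(user_overrides):
    """Merge user-supplied values over the model defaults."""
    inputs = dict(DEFAULTS)        # copy so the defaults stay untouched
    inputs.update(user_overrides)  # user-supplied values win
    return inputs

print(build_inputs({"household_size": 4}))
```

The appeal for users is that every variable has a sensible default, so "put in minimal info, start exploring data" works: supplying even one value yields a complete, personalized input set.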
Extensive "take action" page
Combined w/ Moves API to calculate emissions based on different forms of transit
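The Moves combination above amounts to multiplying per-mode travel distances (the kind of activity summary Moves reported) by per-mode emission factors. A hedged sketch of that calculation; the factor values are illustrative placeholders, not CoolClimate's actual coefficients:

```python
# Estimate transit emissions from a {mode: distance_km} activity summary,
# the shape of data a movement-tracking API like Moves could supply.
# Factor values below are illustrative placeholders only.
EMISSION_FACTORS_KG_PER_KM = {
    "car": 0.25,   # placeholder value
    "bus": 0.10,   # placeholder value
    "cycling": 0.0,
    "walking": 0.0,
}

def transit_emissions(distances_km):
    """Estimate kg CO2e for one day of travel, by transport mode."""
    total = 0.0
    for mode, km in distances_km.items():
        factor = EMISSION_FACTORS_KG_PER_KM.get(mode, 0.0)
        total += factor * km
    return total

day = {"car": 12.0, "bus": 5.0, "walking": 2.0}
print(transit_emissions(day))  # 3.5
```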
API management platform with a layered approach to access
Anyone with a CalNet ID can use it, or other users can create a login
OroEco is using the API; compares personal values against everyday lifestyle choices (starting with carbon footprint)
LandCarbon (USGS); integrating across different scenarios
For exposure, makes sense to put API somewhere people are already looking
IP issues around both data and infrastructure -- campus support in sorting this out would be valuable
Funders concerned about this
Relying on primary data providers (e.g. different natural history museums, which may have different concerns)
Is "UC Regents" the right stamp for things that are being shared globally? Maybe adopt a CC license? Roll our own?
Risk of people creating something "shinier" than you with your own data
Currently aggregating a lot of data, vs. real-time mashups
Real-time vs persistence (smaller providers -- persistence is a risk)
Only when collections come into contact w/ big aggregator do they realize how much work they have to do