Progress and Demonstrators
- Demonstrators were seen, among other things, as a path to discover threads of interest within the PB (A&H) communities
- Context finder - Ray Larsen
- Irish lit (Open Calais) - prosopography - finding place names and people (by name references) - dynamic links to (whatever - library links, images, bibliographic entries)
- Emma Goldman - lecture tour w/ georeferencing - again w/ context links to related information from the EG Papers Project at Berkeley, including scanned materials from the archives
- Thought Ark - Sorin Matei
- tracking scholarly activity and harvesting metadata from activity happening in real time, storing it in a repository that is shared
- citation tracking, w/ public value increasing because the fact of the citation is recorded, shared, compared/combined with other activity
- ARTFL - Mark Olsen
- matching passages. Shakespeare's "Venus & Adonis" against 11K documents
- sequence alignment
- matches can indicate cited reuse, plagiarism, etc.
- share "aligners" (algorithm parameters that determine what matches, which matches are considered significant, etc.)
- BLAST - computational biology - blast.ncbi.nlm.nih.gov - HPC
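To make the passage-matching idea concrete, here is a minimal sketch of local sequence alignment over word tokens (Smith-Waterman style). It is not the ARTFL implementation. The `Aligner` class, its parameter names, and the default scores are hypothetical, meant only to show what a shareable "aligner" (a bundle of parameters deciding what matches and what counts as significant) might look like.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Aligner:
    """Hypothetical bundle of shareable alignment parameters."""
    match: int = 2       # reward for two identical tokens
    mismatch: int = -1   # penalty for two different tokens
    gap: int = -1        # penalty for skipping a token on either side
    min_score: int = 6   # alignments scoring below this are not reported


def tokenize(text):
    # Lowercase and strip surrounding punctuation so minor variants still match.
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return [w for w in words if w]


def local_align(a, b, p):
    """Return (score, tokens of `a` in the best local alignment with `b`).

    Returns (0, []) when the best score is below p.min_score.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            step = p.match if a[i - 1] == b[j - 1] else p.mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + step,  # align the two tokens
                score[i - 1][j] + p.gap,     # skip a token of a
                score[i][j - 1] + p.gap,     # skip a token of b
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    if best < p.min_score:
        return 0, []
    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    end = i
    while i > 0 and j > 0 and score[i][j] > 0:
        step = p.match if a[i - 1] == b[j - 1] else p.mismatch
        if score[i][j] == score[i - 1][j - 1] + step:
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + p.gap:
            i -= 1
        else:
            j -= 1
    return best, a[i:end]
```

Calling `local_align(tokenize(source), tokenize(candidate), Aligner())` for each candidate document would give one score per document. Two scholars who exchange the same `Aligner` values should get the same matches from the same texts. A run against thousands of documents would need indexing or HPC resources, which this sketch does not attempt.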
- Kaylea Champion
- Works in progress; proposes that much of the work be done among the community
- NYX - Nicen your XML.
- Migrating bibliographic information (created in Word) into XML, for easier repurposing and migration of publication media and platform
- demonstrator includes infrastructure elements (XSLT to transform to HTML, PDF, etc.) and clients (Flash, SEASR, et al.) to consume the result of the transformation
- Prof Kent Hooper: this solves real problems re: transforming TEI marked-up biblio artifacts that aren't solved yet, and that lots of scholars have & would like to transform
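The migration-and-transformation pipeline described for NYX can be illustrated with a small sketch: a bibliographic record becomes XML, and the XML is then rendered for one target medium. This is not the NYX code. The demonstrator uses XSLT for the transformation step, whereas this sketch substitutes plain Python from the standard library. The element names (`bibl`, `author`, `title`, `date`) and both function names are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

FIELDS = ("author", "title", "date")  # assumed field set, for illustration only


def record_to_xml(record):
    """Build a <bibl> element from a dict of bibliographic fields."""
    bibl = ET.Element("bibl")
    for field in FIELDS:
        if record.get(field):
            ET.SubElement(bibl, field).text = record[field]
    return bibl


def bibl_to_html(bibl):
    """Render a <bibl> element as one HTML paragraph (stand-in for an XSLT step)."""
    author = escape(bibl.findtext("author", default=""))
    title = escape(bibl.findtext("title", default=""))
    date = escape(bibl.findtext("date", default=""))
    return f'<p class="bibl">{author}. <i>{title}</i>. {date}.</p>'
```

Because the XML is the stored form, a second renderer (for PDF, or for a client such as SEASR) could be added without touching the records themselves, which is the repurposing benefit the demonstrator is after.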
SEASR Demonstration, Loretta Auvil
- What is SEASR
- R&D environment for scholars
- Meandre: data flow analysis
- RDF to describe components and flows
- SEASR as a way of doing mashups
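As a rough illustration of the component-and-flow idea, the sketch below treats each component as a function and describes a flow declaratively as data. It does not use the Meandre API. The plain list `FLOW` stands in for the RDF description mentioned above, and all names here are hypothetical.

```python
from collections import Counter


def lowercase(text):
    return text.lower()


def split_words(text):
    return text.split()


def count_tokens(tokens):
    return Counter(tokens)


# Registry of available components, keyed by name.
COMPONENTS = {
    "lowercase": lowercase,
    "split_words": split_words,
    "count_tokens": count_tokens,
}

# Declarative flow description: the output of each component feeds the next.
FLOW = ["lowercase", "split_words", "count_tokens"]


def run_flow(flow, components, data):
    """Push data through the named components in order."""
    for name in flow:
        data = components[name](data)
    return data
```

Because the flow is data rather than code, the same description could in principle be handed to a different runner (a laptop or a cluster) without rewriting the components, which is the portability point made later in these notes.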
- Highlights of SEASR functionality
- Zotero as front-end for managing content
- MONK analysis
- A number of other tools "wrapped" into SEASR – e.g., NLP, text transformation, Google Maps, SIMILE timeline
- Data-intensive flows, e.g., a million books. Ported an analytical task from laptop to supercomputer, and were able to port w/o changing any code because the task was implemented in SEASR/Meandre
- SEASR PB demonstrators
- Export Zotero collection, performed citation analysis (network importance algorithms from JUNG)
- Upload exported RDF into Fedora (PoC re: Fedora connection)
- Entity extraction from OpenNLP - mash identified locations with Google Maps (recalling the earlier entity-extraction / Google Maps mashup demonstration)
- Fedora integration allows long-running processes that get wedged to pick up from an interim state rather than re-run from the beginning
- SEASR interface can combine multiple UI components in a single view
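The citation-analysis demonstrator relied on network importance algorithms from JUNG, a Java library. The notes do not say which algorithm was used, so the sketch below assumes PageRank as one common example and implements it directly in Python by power iteration. The function name, the input format, and the parameter defaults are illustrative assumptions.

```python
def pagerank(citations, damping=0.85, iterations=50):
    """Rank items in a citation graph.

    `citations` maps each citing item to the list of items it cites.
    Returns a dict of item -> score; scores sum to 1.
    """
    nodes = set(citations)
    for targets in citations.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = citations.get(node, [])
            if targets:
                # Split this item's weight among the items it cites.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # An item that cites nothing spreads its weight evenly.
                share = damping * rank[node] / n
                for other in nodes:
                    new[other] += share
        rank = new
    return rank
```

With an exported collection reduced to "item cites item" pairs, the highest-scoring entries are those cited by many items, or by items that are themselves heavily cited.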
- JN: How would folks write their own components? LA: writing a component is "about as easy as writing a Java class" – release this week will make this easier. Also support for Python components. An Eclipse plugin exists. Workbench runs in the browser (GWT) for connecting up components to make flows. Tools exist to help with component development.
- Bob Mason: What's not automatable in MONK processes? Unsworth: completing (or making rules re: ignoring) missing metadata ... basically, getting a data set in order ... computationally intensive stages are toward the end of an analytical process.
Report: Proposal and Moving Forward
Discuss charter and scope in reference to proposal. How does the proposal outline impact the scope of this direction? Do the priorities of this direction change given the proposal outline? Is anything in this direction getting lost or overlooked given the proposal outline?
3-year plan: Where is the low hanging fruit? What are the most important priorities for each of the three years?
Workplan through Proposal Submission: What are the primary elements of this direction that belong in the proposal? What is the path toward fleshing out the elements?
- Scope and priorities - we know that tools and content partners cut across a number of elements of the outline
- There's no separate call-out in the outline as there is for services/institutional support
- Still think tools/content partners makes sense as a working group focus
- Want to broaden/adjust the charge
- Focus on identification/discovery of tools, content, models
- Really, have to worry about aspects of specialized tools that allow access controls, policy to evolve in more complex ways with access control, cross-institutional access control, tool use
- Not just identifying what's out there, but doing gap analysis
- Has to be tied back into stories and scholarly narratives of real scholarly practice
- Identify tools/content that might apply
- Tools = named tools
- Revised charge would say we're an interface between concrete stories of scholarly practice and the shared services PB will enable
- Translate stories into recipes involving specific tools and content, utensils and ingredients
- May call for identifying/describing tools, outlines of APIs that have to be supported
- Advocating and encouraging development of new tools, gap analysis, and content
- Referral and review of tools and content
- Idea that this has to be an ongoing, iterative process: not just proposal development, but also the project phase
- Going back to scholars: "we understood your narrative this way, is that right?"
See also Joint Working Group Meeting