This wiki space contains archival documentation of Project Bamboo, April 2008 - March 2013.
Uncommonalities: Producing [collecting] Non-Traditional Materials (and Non-Traditional Audiences)
• Problem of evaluation
• Aesthetic of "good enough"
• Non-traditional scholars
• Hyper-linked texts
• Web archives
• Dynamic texts (only snapshots possible)
• Virtual environments
• Data sets
• Query results
• You go do
• We have to do too much of it now (because of lack of standards)
o Different ways of seeing
• Different taxonomies/typologies
• (Problem of) searching
o Recreation of data
• Nothing need be thrown away
• What's reliable
o Disciplinary thesaurus/map (API)
• Added to Dublin Core
o Intellectual networking
• Traditional networking
o Conferences, seminars
o Institutes, advisors
• New forms
o Mailing lists > new behaviors
o Social networking practices
o Start young > carry these over
o Larger groups
• Contrast production networks w/ academic interests > "temporary organizations"
• Self-interest, look, path up
• Uncommon Practices
o Primary data, not all goes into disc.
• What happened to "unpublished data"?
• Sciences "waste data"
• Sciences - collaborative, complementary data collection + sharing
o Funding fieldwork is difficult < not yet cited/published -> tension
• Fieldwork done "in the world" w/ soft methods
• Collective creation of primary documents
• Requires ethical responsibility + reciprocity
• Requires archives, processing, access
o Requires academic and public constituencies
o Results in varied productions: print, exhibits, media, events
o Humanist doing fieldwork both creator + critic interpreter
o AND a public humanist (folklorist, historian, linguist, ethnomusicologist)
T = task
W = what
• Google + similar (T)
• Googling (T)
• Read (T)
• Search (T)
• Follow cites (T)
• Selective search (1 + A) (Bounded collection) (T)
• Texts - Resource (W)
• Non-textual resource (W)
• People (W)
• Archives-resource (W)
• Tools (W)
• Facilitate, discoverability, exposure
• Finding aids - physical (T)
• Rooting around (T)
• Writing as discovery (T)
• Speaking as discovery (T)
• Conversation (T)
• Play (productive) (T)
• Experimenting (T)
• Mining (T)
• Visualization (T)
• Clustering (T)
• Pattern extraction (T)
• Browse (T)
• Exposure: hypertext links
• Fieldwork (e.g. digs, interviews, oral histories, videos, etc.) (T)
• Discovery of outliers (T)
• Exposure: semantic web
• Natural language processing (T)
• Topic modeling (T)
• Known item search vs. discovery (T)
• Reappraisal (T)
• Direct content search (T)
• Identification (T)
• Association (T)
• Restriction: Access control
• Restriction: Privacy
• Exposure: Translation
• Exposure: OCR
• Exposure: OMR
• Exposure: Digitization
Archiving + data "hygiene"
Hygiene = "keeping"
• The perfect is the enemy of the good
• But "good" varies according to the collection
• User determines what is good
o ...user-driven datakeeping (wikis)
• Culturally biased
o ... distributed proofreading
o ... correct errors (needs expertise)
o ... add metadata levels (needs expertise)
o ... add value - e.g. name authorities
o ... data interchange - Bamboo TBI?
o ... ease of contribution for scholars
• Archived texts can be corrected
• Versions retained
• Relationship between Bamboo archive and Internet Archive
• Will material to be archived be selected? What gets archived?
• Redefine "archive"
• Non-text archiving requires refreshment often
• Need to support certain specifications over the long term
• Should extremely large collections be digitized - more valuable for search than reading
• Disciplinary + institutional repositories
• Building different interfaces into digital resources based on policy + audience
• Repackaging based on user/community needs
• Creating a good foundation to promote more effective delivery of info
o Architecture that allows indexing, search, discovery
o Flexible structure
• (Make use of existing structures)
• Storytelling/oral tradition (story core, history makers)
• Make it relevant
• Articulate it
• Information visualization
What can we smash?
o Academe & public
o Teaching & learning
o Faculty & staff
• Genres of publication/dissemination
• Promotion & tenure models
• Understanding of faculty/student relationships
• Perceptions of undergrads
• Perceptions of libraries & IT
• Paradigm of print
Recreating past methods (historical, recent)
• What audience - original, scholarly, virtual
• What substances needed
• What cultures can be reclaimed, reanimated?
To what end?
• To understand the past
• To study the past in different disciplines
• Amped up old methods?
• Archive current methods for future studies of the past?
• Capture past practices to reconsider concs and to understand differences in present?
• To create meta or hyper text with historical distinction
Replicate with difference, repurposing models
Authority & validation
• Peer review
• Performance review
• Community judgment
• (Related to a product)
Non-Individual, vetting process
• Literal - breaking & remaking
• Figurative - questioning established forms
• "Recognizing what becomes invisible"
• Ownership of ideas - changing the culture
• Change is slow, so we need a better balance
Confessing stupidity > feedback loops to improve > accountability
• Learning from mistakes
• "Digital disaster"
• Exploring dead ends
• Final report
• Peer review
• Public response
• Inviting participation by acknowledging boundaries of knowledge
* Understand messy context
* Choose approach
* "Knock head against material"
* Group material
* ID regularities
* ID departures from each norm
* Explain departures
* Reformulate norm
* Publish reformulation
* Engage in debate
* (Start over)
* ID + challenge categories, hypotheses, meaning
* Uncommon practice
+ Capture, preserve, re-use
+ Provide resources
o "The one thing I need is time"
o Individuals need resources + places +...
o Not everyone needs the same
o Nurturing is necessarily uncommon
* -> Get funders to understand this diversity of type + scale
* Scheduling / Funding
* Write books
o Locating materials
+ Research in situ
+ Obtain access
+ Use search tools (catalogs, Google)
+ Assemble fragments
+ Collecting materials
+ Synthesizing materials
+ Organizing materials
o Visiting places
o Talking to people (colleagues, students, et al.)
o Teach the subject
* Audit/view visual/audio materials
* Transform formats for access or conservation
* Curate materials
* Preserve materials
* Store/retrieve publications
* Provide tools for access & use of materials to A&H scholars
* Copying in the course of an evolving understanding of materials
o Clear IP/rights issues
Modeling & visualization/sonification
* Making the invisible visible
* Show & tell (other than through text)
* Integrate vs. pre-determined
* Changeable modalities of representation
* Virtual unification of physically disparate materials/objects
* Exploring under surfaces
* Visually model a narrative
* Hu-tube (Humanities YouTube)
* Archaeological sites
* Theatre stage sets
* Virtual sculpture
* Tele-immersive dance/other performers
* Narrative/historical events - library
* Geographic locations
* Buildings (including unrealized designs)
* Soundscapes (to represent spatial structures)
* Computer Art
* "Puzzling" - making virtual whole from parts
(pb1c-010 - 011, pb1c-072-074)
Commonality: finding & comparing
* Narrowing in
* Intuition (rumor; follow a track)
* Fuzzy finding
* Precise finding
* Incidents of evidence vs. corpus
* Build a personal [bricolage] corpus
o Iterative exclusion & inclusion
o Consult with specialist
o Consult with peer
o Changing nature of arguments
o Limitations of tools
o Evaluate discoveries
* What is the recipe? Is there a recipe?
* Does methodology documentation 'count' as research?
* What is uncommon?
* What does this mean?
o Unsystematic systems
o Might look more like art
o More to life than a corpus?
o Can you mark up for color?
* See pb1c-072 for diagram graphing easy/hard and needed/unneeded
* Black swans and outliers
* Unexpected use - paleography, genealogy
* Assembling shards
* Finding unpublished materials
* Asking individuals to map their experiences
* (finding new sources of material)
* Non-indexed materials
* Discovering your archive
* Is the whole of the humanities curious?
* Relationship between art + research [artists don't have to care about the facts]
* Are there other ways to write history?
* Are we all just here because of a (powerful) metaphor?
* Or is it all about resources?
* Is it just a matter of being discipline-specific?
* Common corpus varies
* "The uncommon is often the case"
o Need for spatiality
o Personal use project
o Limiting cost for the personal
o Is there a reward to digitize?
* Burden of digitization / use of hybrid materials
* Types of hybrid materials
o Disparity in metadata
o Live interpretation
o Recorded interpretation
* Works easier for images because I might reuse them
* If usable for teaching, I will digitize
* Cost to access restricted images
* Timeline - list (document)
* We go to lowest common denominator - not always paper
* Build shadow system
* Social sharing of smaller online collections > personal information collections
* + the ability to annotate (categories + tags)
* "tagging + bagging" + ordering
(pb1d4-01 to -04)
How does faculty - library - IT collaboration work?
Infrastructure and collection building, e.g., ARTSTOR and JSTOR
What exists? | Where is a maintained IN/OUT library?
Where does Bamboo fit in/serve?
How does faculty - library - IT collaboration figure in individual teaching and research?
Light Reading (vs. Closer)
Copy/Text citation |
Browsing "the stacks" |-------Quickly, with fluidity
Scan for main points |
See context |
Taking notes |
Libraries and Faculty and IT working together:
-how do we decide what goes into a digital/institutional repository?
-should be addressed at the start of the project
-how do we establish structures to set control over such structures
-stability of citations is also a social function -> authority
-analogues to chaos/confusion following print revolution?
-role for Bamboo
-reviews of digital tools?
How does the individual teaching/research interact with cycle?
(pb1d4-05 to -07)
1. Problem Def
Define The Problem
-lit. review |
-discussion | <- problems at edge of discipline
-exposure of/to ideas |
-Problem solved? Y/N
-No - assemble data and resources
Scoping/Assemble Team and Data
-individual or partnership
Complicate the problem...
How to build a partnership and community
Broaden and Narrow Scope
-How many people?
-How much interdisciplinarity?
-Edge? What do you discard?
-"Team" many people
-Solo - All in 1 person, 1 place
-Discovering what is not recorded
-What is not being done
-Who is thinking about the same
-Incentives to cross boundaries?
-Modes (1 or more)
-Multiple (vary by discipline):
-Editing - peer review + response
-Intellectual Property Reuse
-Durability of form - preserve?
-Resistance to "new" -publishers
-Accessibility - findable?
-Impact? - How to evaluate
-"Ancillary" - Bibliographies, datasets, unedited notes, trace paths
-Portability of product
-Defining responsibility (Beyond individuals)
-Limitations of any format
(pb1d4-08 to -12)
D. Editing (Google Apps)
B. Call person to collaborate with;
Find people to collaborate with
C. Agree on how to frame the question
-Define languages and terms
-What is the output of the collaboration?
-There may be a size component. How many people?
-Ambiguous methodology "valued" by Humanities.
-How do you maintain the ambiguity when appropriate?
Social Networking Tools
How to determine who to collaborate with?
How to involve students? (undergrad and grad)
Are there challenges in sharing documents?
-file space across institutions
-checking in and out documents
-change management/version controls
Collaboration with people outside
-community of scholarship
-Rights and permissions of data/content
"The problems should be finding us"
Bamboo as aggregation site for the tools that are out there
(pb1d4-13 to -16)
Using Tools to Make Associations
-Discover categories of contacts suggested by a scholarly object
-Review secondary literature
-make new "relevant" associations in secondary literature
-Find patterns in text
-Set aside/Weed out unreliable, or undesired material (for a particular scholarly purpose)
-distinguish peer-reviewed material vs. non-peer-reviewed
-filter on trusted sources
-Find antecedents of a work
-"genetic" sense of scholarship
-Order antecedents of a work
-Identify objects influenced by a work
(Make Associations cont'd)
-Recognize duplicates (or duplicate digitizations of a single thing)
-Identify "is a part of" or "is an instance of" relationships
-Identify cross-language relationships
-Recognize equivalencies (people, places, terms, etc.)
-Identify stylistic relationships
-Create schemata and databases
-e.g., schema to teg
-Create controlled vocabularies
-Form, test, reform hypotheses
-Build predictive and descriptive models
-Journal Articles Graph ]
-Monographs ] Make
-Physical Model ]
-Enrich a "base" model w/additional layers of associations
-Enrich by looking at other models
-Transform/relate between models
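One way to picture "recognize equivalencies (people, places, terms, etc.)" is a crude name-matching pass. The sketch below is illustrative only: the `normalize` and `group_equivalents` helpers and the matching rule (strip accents, ignore case, punctuation and word order) are assumptions, not anything the notes prescribe, and the rule only handles Latin-script names.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(name: str) -> str:
    """Reduce a name to a crude comparison key (assumed rule, not a standard)."""
    # Strip accents: decompose characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Case-fold, keep only ASCII letters/digits as tokens, ignore word order.
    tokens = re.findall(r"[a-z0-9]+", plain.casefold())
    return " ".join(sorted(tokens))


def group_equivalents(names):
    """Group names that share a key; return only groups with 2+ members."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize(name)].append(name)
    return [members for members in groups.values() if len(members) > 1]
```

A real name authority would need expert review of each proposed match; a key like this only suggests candidates for a scholar to confirm or reject.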
1. Identifying stakeholders, needs, authority...antipathy, future stakeholders, needs
2. Identify metadata schemata, workflows, tools to suit those audiences and fixing them when they break or change ("Shakespeare is never solved")
3. SVN/version control as a model for dealing with these challenges that is well suited to the humanities - maintains ambiguity - managing arts and humanities collaboration
(pb1d4-18 to -30)
-Ambiguity - how to manage fuzziness.
-currently - put in arbitrary place - may write article about it
-attach "level of certainty" with a date
-forensics with helping to sift
-date as outcome of research
-involve community scholars
-Assume infrastructure that understands data plots, etc.
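The suggestion to attach a "level of certainty" to a date could be modeled as a small record. This is a minimal sketch under stated assumptions: the `UncertainDate` class, its field names, and the 0-to-1 certainty scale are hypothetical and do not come from the notes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class UncertainDate:
    """A plausible date range plus how sure we are of it (hypothetical structure)."""
    earliest: date      # earliest plausible date
    latest: date        # latest plausible date
    certainty: float    # assumed scale: 0.0 (guess) to 1.0 (documented)
    asserted_on: date   # when this judgment was recorded
    source: str = ""    # who or what supports the claim

    def __post_init__(self):
        if self.earliest > self.latest:
            raise ValueError("earliest must not be after latest")
        if not 0.0 <= self.certainty <= 1.0:
            raise ValueError("certainty must be between 0 and 1")

    def overlaps(self, other: "UncertainDate") -> bool:
        """True when the two plausible ranges share at least one day."""
        return self.earliest <= other.latest and other.earliest <= self.latest


def best_estimate(claims):
    """Pick the most certain claim; ties go to the most recent assertion."""
    return max(claims, key=lambda c: (c.certainty, c.asserted_on))
```

Keeping every claim, rather than overwriting the date, preserves the ambiguity: competing datings from different scholars can sit side by side until research settles the question.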
Address antic. problems
potentially gen./non-profit audience
(as distinct from sciences)
Acknowledged - horror of prof. human. toward non-professional
Sharing and harvesting
Gen. search engines
from Human. scholar -
Aud. - bks, articles, (???) art.
limits to this.
Ex. Stanford getting out of bus. of Human. publishing
Therefore, Imp. of finding different vehicles/venues of publish. wk dif in disciplines.
Peer review implications
Ex. in Classics learned society pubs.
Extend to human. teaching and research
Ac. publishing may die
Therefore on-line is vehicle
Therefore can be read with different layering for different audience
Can already do
Cost an issue
Therefore publish - find audience and ask them for $
Therefore publication as conversation
Current system failing
Ex. OED's subscription base.
Wanting to change business model and wanting to allow citation digitally
2006 - Aust. gift to nation of $2M - Dictionary of Biography
Ex. McPherson article on Lincoln
scholar concludes - McPherson writes better
1 Bil hrs. Wikipedia
Audience = author
not strict consumer
Creative Commons license
not under your control
?Bamboo assumption scholars will share authority?
Ex. use case - is this open/closed?
?Can you make Wikipedia for my friends
Many ex. of locked down datasets.
-build in levels of authority/access
-how customizable is it
-credentialing evaluate who's good/not good scalably
-therefore palette of choices
-from wiki to less open and different levels at different places
-shift into publishing system
-therefore strategy from wiki
-How to get out of wiki
-Modify level of access
-Define communities of people
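The "palette of choices" between an open wiki and a locked-down dataset could be expressed as a handful of access levels attached to each resource. The sketch below is only an illustration: the `Access` levels, the `Resource` class, and the read/edit rules are assumptions made for the example.

```python
from enum import IntEnum


class Access(IntEnum):
    """Hypothetical openness levels, from most closed to most open."""
    PRIVATE = 0      # owner only
    COMMUNITY = 1    # a defined community of people can read and edit
    PUBLIC_READ = 2  # anyone can read, the community can edit
    OPEN_WIKI = 3    # anyone can read and edit


class Resource:
    def __init__(self, title, owner, level=Access.PRIVATE, community=()):
        self.title = title
        self.owner = owner
        self.level = level                # can be modified later
        self.community = set(community)   # the defined community of people

    def _is_member(self, user):
        return user == self.owner or user in self.community

    def can_read(self, user):
        if self.level >= Access.PUBLIC_READ:
            return True
        if self.level == Access.COMMUNITY:
            return self._is_member(user)
        return user == self.owner

    def can_edit(self, user):
        if self.level == Access.OPEN_WIKI:
            return True
        if self.level in (Access.COMMUNITY, Access.PUBLIC_READ):
            return self._is_member(user)
        return user == self.owner
```

Because the level is a property of the resource rather than of the whole site, the same material can start as a closed working draft and be opened up, or closed down again, as it moves toward publication.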
? Are we capturing product and process
shift concepts and migrate to new tech
-Wiki on top of repository contrib. to wiki - through my inst. repository
-Ex. video in IR
-streaming server and get comments
-IRs that are aware of other services we don't want to re-build
-IR - 1. Turn over to institution
2. Community repository
Policy ? of data collection about use - crossing after migration
ex. - OBD - could pt. to published bibliography in IR
? Exit for stuff we don't want to maintain?
Deposit agr. like Flickr selection
Naming of parts
Ex. 10K named attributes
-migrate impossible to translate into different system
-therefore abstract IDs only
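The "abstract IDs only" conclusion can be illustrated with a registry that stores records against opaque identifiers and keeps the human-readable attribute names in a separate table. This is a sketch of the general idea, not the system discussed: `AttributeRegistry`, the `attr-` ID format, and the sample attribute are all hypothetical.

```python
import itertools


class AttributeRegistry:
    """Maps opaque IDs to human-readable labels (hypothetical design)."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._labels = {}

    def register(self, label):
        """Create a new opaque ID for an attribute name."""
        attr_id = f"attr-{next(self._counter):06d}"
        self._labels[attr_id] = label
        return attr_id

    def rename(self, attr_id, new_label):
        # Only the label table changes; stored records are untouched.
        if attr_id not in self._labels:
            raise KeyError(attr_id)
        self._labels[attr_id] = new_label

    def display(self, record):
        """Render a record keyed by IDs using the current labels."""
        return {self._labels[key]: value for key, value in record.items()}
```

With this separation, moving to a system that names things differently means translating one label table instead of rewriting every record.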
? realized complex analysis we'd envisioned
Ex. Text mining - sell grad students
while clock ticks -
depending on the data -
therefore Bamboo ethical to suggest possibility of research output
Exit strategy as reincarnation
Can't preserve data - let's preserve scholar
But what does Tibetan scholar come back as?
If we had IRB in humanities - never get anything
Ex. Video project - first metadata
Cleaned - made it offensive to videographers
Ex. "entertainment" for religious subject
Ex. National Museum of Australia
Azaria Chamberlain's dress
Catalogued - w/context to story
(pb1d4-31 to -35)
-How research questions are found, formulated, structured
-Now vs. then
-Practice #1: Guide grad student to initial topic
-Mine text enables
-Old: registry of dissertation topics
-Subject matter expert in library
-Advised grad student
-Vetted/trained? by faculty
-Practice #1 (cont'd) guide grad student to initial topic
-old: +w/ MLA, go back year by year
-now: exhaustive search infeasible
-digitization of past works
-hard to keep up, i.e., search
-how to filter, make sense of, search results
-ARTFL project example of search evolving with body of material
-Collab. to modify them, another topic
-Wiki-professional (tool for distributed metadata creation and analysis)
-name authority; loc. vacating
-facets, semantic web
-e.g., contracts, cross-national, cross-cultural
-incomparability across fields
-tool (human, algorithm)
-reflection of author's interest?
-subject tagging varies between disciplines
-descriptive metadata with distributed authorship
-fashions in funding
-movers and shakers decide funding priorities and acculturate grad students
-issue: how does a hit get near the top of the list?
search for methodologies
-issue: rewards don't incent proper attention on Qs and objects, e.g., improving search/discipline, digital objects
-digital skills valued outside Academia, only
-sociology of the professions
-it does change over time
(pb1d4-36 to -38)
Creating A Text to Data Store (As Secondary Sources)
-Identifying primary material (rapid development)
-Making material 'digital'
-Defining data structure
-Developing use cases
-Exhibit development (diagram, tests, etc...)
-Identifying documentation tool
-Defining work product -secondary source reference work
-Identifying features of interest
-Leave 'hooks' in design to allow for future innovation/re-structuring/rapid development cycle for concepting
-Prototyping data structures (e.g., intermediate environments: diagram, spreadsheet)
-Building data model (based on exhibit and prototype data processes)
-Requirements gathering for structure and user interface
-developing phased project plans
-Compiling sample data set
-Defining query model
-What do you want to ask?
-Flexible interface for searching
-Populating structure with "tagged" data
-Commingle material from multiple data stores
-Multi-faceted display interface for query results
-Alternate discovery/query mechanisms (non-surrogated inquiry)
-e.g., pattern recognition
-Commenting and review (hypothesis testing)
-Human analysis of content
-Definitions of : -problem
-division of labor
-e.g., Community of Science profiles
Wish list - outstanding issues
-Export utility for 'snapshot'
-Generic flexible data storing w/minimal structure
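Several of the steps above (populating a structure with "tagged" data, a flexible query model, a multi-faceted display, and the wished-for 'snapshot' export) can be tried out in a few lines before committing to a data model. The following is a prototype-level sketch with minimal structure: the `TaggedStore` class and its method names are assumptions made for illustration.

```python
import json
from collections import Counter


class TaggedStore:
    """Minimal in-memory store: each item is text plus free-form tags."""

    def __init__(self):
        self.items = []

    def add(self, text, **tags):
        """Populate the store with one piece of tagged data."""
        self.items.append({"text": text, "tags": dict(tags)})

    def query(self, **criteria):
        """Return items whose tags match every given key/value pair."""
        return [
            item for item in self.items
            if all(item["tags"].get(k) == v for k, v in criteria.items())
        ]

    def facets(self, key):
        """Count the values of one tag across all items, for a faceted display."""
        return Counter(
            item["tags"][key] for item in self.items if key in item["tags"]
        )

    def snapshot(self):
        """Export the whole store as a JSON string."""
        return json.dumps(self.items, sort_keys=True)
```

For example, after `store.add("Letter A", genre="letter", year=1850)`, calling `store.query(genre="letter")` returns the matching items and `store.facets("year")` gives the counts a faceted interface would show. Because tags are free-form, the structure can be reshaped as features of interest emerge.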