This wiki space contains archival documentation of Project Bamboo, April 2008 - March 2013.
- [ ] unpack a common -- sensemaking
- [ ] metaphors are like perceiving, walking down the street, cartography, journey
- [ ] *this is something that scholars want non-scholars to understand*
- [ ] useful distinction with meaning? probably not for this discussion
- [ ] statements of kinds of primitives
- [ ] promote reflexive awareness of sense-making as a process
- [ ] is it a procedure? depends on discipline, scholar
- [ ] what are you making sense OF?
- [ ] do you have to start with a dataset? Isn't that an assumption in and of itself? what about starting with a phenomenon? a text corpus?
- [ ] what about starting with an observation or an engagement?
- [ ] "data" is methodologically marked -- but materials isn't good either.
- [ ] what counts as "data", and what should the "data" count as?
- [ ] recognize that the operation is recursive
- [ ] a kind of accountability, somewhat dialectical -- accountable to what you're looking at (letting it speak), and accountable to understand one's own biases
- [ ] need to take responsibility for the fact that you're using a certain category
- [ ] might scholarly works need to be accompanied by statements of biases, assumptions, contexts as far as one understands them
- [ ] somehow, a question occurs to you -- something to understand. or, something needs to be understood
- [ ] using knowledge of others lacks context, assumptions
- [ ] involves violence, like smashing
- [ ] abstracting, annotating, assumption-linking, concept-splitting, emitting, extracting, goal-shifting, linking, matching, negotiating, segmenting, source-linking, stemming, zoning, ordering, articulation, a concordant discordance --> plotted into a narrative
- [ ] abstraction involves loss, filtering, bias -- contrast with concretization. details get filled back in when things are made concrete
- [ ] might look like thinking, or typing, or hand on the forehead, or an action, or like art-making
- [ ] must somehow be understandable and shareable
- [ ] don't rein in methods of sensemaking
- [ ] sometimes you have to be a bit more efficient and come to some agreements
- [ ] how might you describe the mona lisa to someone who is blind?
- [ ] curator, critic, historian's perspective
- [ ] multiplicity can be very helpful
- [ ] boundary conditions on interdisciplinary work
- [ ] analytical results are missing context, feeding them an example can be helpful -- digital tools need to consider how a scholar might verify and analyze the assumptions of the tool itself
- [ ] what can really be "done" in this realm?
- [ ] make assumptions clear
- [ ] think about how sense-making can be supported.
- [ ] allow verification and testing and tinkering with what is presented
- [ ] measurability as a form of force
- [ ] one person's process is another person's ritual....translation is difficult!
A8: scholar; moved into producing infrastructural material, dealt with issue of building new infrastructures where people are still using their old-style structures
D20: foraging combines discovery and serendipity; that kind of investigation can inform both linear and untraditional forms of understanding
D3: libraries as places to forage
B1: Chinese art historian: she runs into challenges with foraging, seeking stuff and also working in nonverbal script, non-Western language
she's interested in producing visual content
C2, archeologist: deals with complex scripts that need decipherment
D7: he likes the verb "foraging"; it applies on different levels to the cognitive process
he produces CDs, self-sustaining projects; producing stuff with public television, etc.
MS: connection between libraries and museums
A10: foraging is artistic process and also idea-finding and -incubation
A11, librarian at large state school, runs institutional repository; foraging provides a way for her to find value in the weird stuff she has in her repository
her angle on producing nontraditional materials is sustainability; she's involved in the ebook movement
foraging as a practice guided by a specific theoretical frame? opposed to a more structured search process...?
it's searching, discovering more broadly construed
there's the way you're "supposed" to do it and then there's the other stuff
what D3 wants to test is, are these urban neighborhoods rural-agrarian societies plopped down into the city, or city-style neighborhoods? This is a politically informed question
bringing it back to technology: archeology has this problem of managing a huge amount of data
and how do you find parallels to your own work in other people's excavations?
this is v hard to do digitally because every archeologist has a different framework
so they have a huge information integration problem
so the question is how can tech help our foraging integrate data from other areas?
A12 has a vague image of foraging digitally and being able to see other people's taxonomic structures transparently associated with a given item
global schema to which local heterogeneous schemas can be mapped
a very conventional old-fashioned database problem
is there an appropriately abstract and universal global schema that would let you map local taxonomies to a more general taxonomy?
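One way to picture the global-schema idea above is a lookup that maps each project's local terms onto a shared category while keeping the local term visible. This is a minimal sketch, not a proposal from the workshop: the project names, terms, and global categories are all invented for illustration.

```python
# Hypothetical local taxonomies from two excavation projects (illustrative only).
LOCAL_TO_GLOBAL = {
    "site_a": {"amphora": "ceramic_vessel", "sherd": "ceramic_fragment"},
    "site_b": {"jar": "ceramic_vessel", "potsherd": "ceramic_fragment"},
}


def to_global(project, local_term):
    """Return the global category for a local term, or None if it is unmapped."""
    return LOCAL_TO_GLOBAL.get(project, {}).get(local_term)


def find_parallels(records, global_term):
    """Return records from any project whose local term maps to global_term.

    Each record keeps its own local term, so the original taxonomy stays visible.
    """
    return [r for r in records if to_global(r["project"], r["term"]) == global_term]


# Hypothetical finds recorded under each project's own vocabulary.
records = [
    {"project": "site_a", "id": "A-17", "term": "amphora"},
    {"project": "site_b", "id": "B-03", "term": "jar"},
    {"project": "site_b", "id": "B-09", "term": "potsherd"},
    {"project": "site_b", "id": "B-11", "term": "loom weight"},  # no mapping yet
]

for match in find_parallels(records, "ceramic_vessel"):
    print(match["id"], match["project"], match["term"])
```

Unmapped terms such as "loom weight" simply fall outside every global query, which is where the researcher's mapping work would come in.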
A8: they have this problem too in ling; there is one ontology becoming more and more accepted
the task of the researcher isn't to reinvent ideology but to provide a mapping
and another problem is search: approaching a situation where no one has to throw anything away
so how do you find the valuable pieces of information when you archive everything?
foraging produces research that doesn't fall into already-researched paradigms, so that can be the object of the foraging
the library world has plenty of thesauri
even a small number of baseline metadata that everyone agrees on, v. broad, is v useful
could we please parse language with computers, please
new tech is coming around recognizing images
even these MD standards and tagging schemes themselves are products of particular communities of scholars that don't cover the waterfront and are "historically contingent products of scholarship that are ultimately ephemeral"
so we shouldn't build structure on ultimately ephemeral things
ethos: part of what we're suggesting is foraging as a typically nonvalidated research method needs to be promoted as worthwhile, and that foraging tools should be available
hard time imagining a system that doesn't have as its apex or subbasement a standard set of terms
• Engage users in folksonomic tagging, giving meaning to a scholarly object, identifying the value or significance of a scholarly object
• Engaging people in disambiguation or correction of non-automatable data
o Might look to a member of the public like "playing a game"
o Might not be so interesting to people ... must distinguish between scholarly and general-public communities
• Engaging people in transcription of practice
• Performance in public
• Service-teaching or service-research: combining or orienting scholarly activity with community service
• Engaging with students in a teaching-university in and of itself "transgresses the boundary" of pure research
• The boundary of the academy is made permeable by the border-dissolving nature of the internet; and also by prevalence of life-long learning ... the geographical and "must have ID" barriers are dissolved
• Establish two way processes that allow engagement with a "public" outside the university (e.g., w/o university-specific "credentials" / authentication/authorization)
• Intellectual property issues limit openness of engagement / permeability-possibilities ... so approaches to public engagement must enable different degrees and kinds of access to courses, data sets, archival materials, etc.
• "Discursive forms" ... different types of events that invite engagement might be a way to think about what scholarly activities address or approach public/community participation.
• Failure to allow for ingress of information/review/response results in stagnation within the academy (relation here to "smashing" theme ... disruption of status quo)
• "Unconference" at George Mason University ... "That Conference" ... where the topic
• Related to how recognition of digital scholarship is not allowed into tenure review process.
• HUGE % of network traffic from university is traffic to/through Facebook
o Create connections between students in the academy and events of interest to faculty or otherwise "connected" to faculty, e.g. "art opportunities in the city of Chicago" (Eli)
o If you pretend Facebook isn't huge you're missing a great deal. Academy must adjust to the culture around us.
• Who should be researching the issues inherent in lack of participation of women in certain fields, e.g., Computer Science? (B7)
• Service learning, service trips, internships (putting students into spaces outside the academy)
• Public being the subject of research ... some conversation here about whether this is a good or a problematic thing ... perhaps better to say that public is the source of information of interest
• Explaining to the public why and how work of humanists matters ... to make sure there is public support for funding agencies like the NEH ... but does the drive toward relevance dilute the intellectual value of a scholar's work? Oddity of humanities scholarship is that it operates in a way that values scarcity: a thing that is little-studied is of more value than something that has been thoroughly chewed over.
• Q: is public / community participation a "common" theme in humanities scholarship?
o There seems to be a general agreement in the public that universities are valuable ... people want their kids to attend
o In visual arts, certainly ... anything that requires an audience ... but perhaps not all disciplines are engaged in making public / community participation happen
• Any field can benefit by delivering what one knows to a broader (including a public) audience. Might Bamboo facilitate delivery to that broader audience? What about delivery platforms for non-traditional artifacts of scholarship? A filtering mechanism that allows people to find/identify the scholarly artifacts that might interest them.
• Bamboo could track public use of a scholar's work in order to attach value to her/him vis-à-vis tenure review
Unpacking a commonality: Authority and Validation
Again, this group was good about writing on the flip chart.
As the discussion unfolded the group seemed to be concerned about the dangers of imposing or creating a digital publishing standard via Bamboo. They realized that at the basic level it was desirable to have some seal/certification that would give digital humanities research and production greater valence in the academic world. However, too strict a set of rules or too regulated a world would mean the loss of true creativity, or political anarchy.
All participants also expressed horror at the prevailing prejudice in the academy against digital production. Tenure rules need to be changed, professors need to take digital publishing and production seriously. This group saw Bamboo as a project that will REVOLUTIONIZE and ALTER the humanities academy today. This is the promise of Bamboo. The academy needs to move (pushed by Bb) into a model closer to the sciences (in relation to technology).
While Bb is doing this it should not, however, become too rigid a structure. There should be a structural flexibility that would allow it to keep growing and changing and also allow it to include different models of scholarship, fringe scholars, etc., with features like a sandbox area for pure play.
So Bb needs to create transparency but also retain dynamism.
Voice of the Shuttle
They want Bamboo to create a community of scholars (a critical mass) to tackle the issue of authority and validation in the academy (in relation to digital publishing).
Originating New Works/Intellectual Property
D15: JDs need to be hired in Libraries to provide information for faculty.
Faculty can be educated on how to protect their IP and re-negotiate the contracts with publishers.
D16: When does what you do become original? In terms of being called a creator.
D9: Libraries are also creating new works. Indexing, curating, metadata, etc. are all intellectual work going into that process.
Libraries should push the fair use envelope.
Some publishers allow self-publishing on faculty's own sites.
Bamboo must start from base-line of CC metadata. And keep the content rights for a later discussion. http://www.dlfaquifer.org/ which is a pioneering technology.
DLF: Venture Philanthropy - participants do pay to play.
A Bill of Rights for media/image usage.
D13: Libraries as partnerships.
D10: Putting archives online is new work.
D6: Intellectual Property often stymies original work - in digital formats. What's going to happen when university scholars start putting up edgy stuff and get cease and desist letters? Will the university stand behind them?
Bamboo will have 3 problems: Using copyrighted work, creating new work from copyrighted works, policing copyright itself.
Faculty want to know how much risk they can take.
A License which would be used for making scholarship.
D17: Scholarly publication in the Libraries was there to protect the scholar. There is a tension between the University and the Scholar and their rights. Digital Object Identifier.
New model at my university - Mellon's vision
D18: We need a more comprehensive policy in IP. Protecting copyrights is destroying. Who owns the online courses?
▼ 4. Exercise 4
▼ 4.1. Pre-discussion...
• 4.1.1. Ethnographic practice not just about "these people do certain things" ...
▼ 4.1.2. Useful vs interesting.
• 4.1.2.1. In a teaching setting, interesting is per se useful
• 4.1.3. A good idea should dominate a bad idea
• 4.1.4. IT people come in and say we have all these good tools
• 4.1.5. So what, you haven't heard the problem yet
▼ 4.1.6. Where do you learn new tricks?
• 4.1.6.1. from your colleagues
• 4.1.7. G1 never presents the good ideas, he gets an agent.
▼ 4.2. Topic: challenge assumptions, hypotheses, and meanings.
▼ 4.2.1. G6: one thing I do: listen to music that's not written down, and write it down. Group it into phrases. Find regularities, departures from the norm. Explain the variances.
• 4.2.1.1. Part of work is to develop new categories according to individual discoveries.
▼ 4.2.2. G4: project on housing
• 4.2.2.1. Observe the unwritten and undocumented
• 4.2.2.2. Some kind of recording
• 4.2.2.3. Listen / observe / look at / whatever is appropriate
• 4.2.2.4. Until patterns are observed.
• 4.2.2.5. G6's workflow sounds very familiar, despite being a different genre.
• 4.2.3. G6: the more you engage people, the more they tell you how to behave and interact to be a decent human being.
▼ 4.2.4. G3 is in W. Jakarta, looking at martial arts.
▼ 4.2.4.1. Is it totally different?
• 4.2.4.1.1. Depends on whom he's asking, when he's asking, what he's asking about.
• 4.2.5. G1: so there's a prefunction, must understand the context.
• 4.2.6. G3: not necessarily discrete groups.
▼ 4.2.7. G2: at a broad level it's similar
• 4.2.7.1. Don't do field research, empirical research
• 4.2.7.2. Look at questions, do searches
• 4.2.7.3. We should do more fieldwork, talk to people more; there is criticism of our approaches.
• 4.2.8. G6: if I were a practitioner, composing, improvising, I would be part of a broader field of cultural production. Would have an idea of what might be missing within the subgenre (say, rock and roll). Challenge people's ideas about what a good R&R song is or what a good melody.
▼ 4.2.9. G4: sister is a choreographer. Sometimes it's just soul, inspiration. But on the other hand, she does a lot of research, similar to us.
• 4.2.9.1. What distinguishes us from humanities scholarship is that they go in to a text, and it doesn't talk back.
• 4.2.10. G8: what do you mean, it doesn't talk back?
• 4.2.11. G4: elaborates on interactive, time-consuming nature of building a trust relationship with a subject. Books are not engaging in the same fashion.
• 4.2.12. G8: spend time with book, knock head against the material. It's messy.
• 4.2.13. G9: Choosing what to listen to is one of the most important stages. (Or what text to read, etc.)
▼ 4.2.14. G3: It's not just a choice, it's a triangulation.
• 4.2.14.1. Anecdote: Was talking to someone about something, the interviewee said "well it's not exactly like that"
• 4.2.14.2. "But that's what you've been telling me for years"
• 4.2.14.3. "Because that's how you've been asking the question."
• 4.2.15. G1: case law is about judges making decisions. Divergent pieces of case law, figure where judges are coming from.
• 4.2.16. G8: different types of outcome. Different looks & feels for observer of outcome.
• 4.2.17. G6: if you're writing a book, you're trying to make a piece of cultural production that meets expectations of your audience. You may have to take another try, or you may have contributed to the field. Perceiving regularities and acting as much as you can according to what you perceive as customary; you succeed or fail.
• 4.2.18. G1: can be emphasis on regularities or irregularities. Latter can actually lead to shifts in common expectation / norms.
• 4.2.19. G6: legal precedents don't create answers, they create more questions.
• 4.2.20. All: that's the research cycle.
▼ 4.2.21. G9: haven't yet mentioned technology. Where does it fit in?
• 4.2.21.1. Making the corpus
• 4.2.21.2. Digitizing the material
• 4.2.21.3. Listen and record
• 4.2.21.4. Identifying regularities and departures
▼ 4.2.21.5. Is it more in the gathering or in the knocking of heads against things?
• 4.2.21.5.1. One of the things that technology brings a benefit in is: when you have a huge pile of material, the unassisted person can only scope one thing at a time, can only see things in the order they come. Technological assistance can help see patterns across materials.
• 4.2.21.5.2. So you need to have the raw text in digital form, but also you need digital reference sources.
• 4.2.21.5.3. Alternately you may need special sorts of queries
• 4.2.21.5.4. Take advantage of new kinds of structure: GIS, for example. Invented for physics, but no reason you can't use it for humanities.
• 4.2.21.5.5. Visualizations
• 4.2.21.5.6. Frontiers
• 4.2.21.6. G9: conference experience of nobody presenting on what are their research tools, but presenting new tools that are never followed up upon. Couple years later, it's all washed away.
• Unpacking common theme - Acquisition
o Two levels on broader plane:
• Acquisition at an institutional level - acquiring materials
• Acquisition at the scholarly level - acquiring sources
• Difficult to understand this without having a sense of how scholars are going to use this.
• Creation works more because it touches on other common themes when discussing discrete scholarly practices
• Touches too closely on modeling and visualization? Perhaps creation isn't the best to unpack at this point.
Unpacking creation, take two
• Category of practice - compound scholarly practice, what are the discrete scholarly practices that comprise it?
o Not necessarily something that a computer might do, or that technology might aid.
o Must maintain a clear scholarly objective.
• What do scholars do when they create?
o Write books - F9
• Specificity - Collecting, synthesizing what you've done, in terms of papers, other publications. Selection, prefacing, editing and gathering
• The process of writing a book is not something that scholars really engage in all that often.
• But these are different from authoring a book
• Instead, perhaps, writing a scholarly monograph.
• Spending years researching a subject inside out
• The research cycle might be one's entire career.
• The process of writing a book doesn't involve very much writing.
• Visiting places
• Talking to people
• There is a wealth of reviews, informal and formal, of interaction that happens in the process of writing.
• Teaching a subject for years is part of the process of writing.
• Ideas come from interactions with colleagues and one's own students.
• To teach a course in something is, at times, a research methodology in and of itself.
• Step before collecting when writing - locating
• Annotating? Where might it fit?
• Essential to organizing new material, and may feed into synthesizing.
• Happens while collecting the stuff.
• Plenty of steps which make it easier to organize and find information/material
• Scheduling, in order to meet the funding requirement - F10
• Perhaps not such a restraint.
• RAE - Research Assessment Exercise
o Finish prior to RAE in Britain.
• Reviews are of the institutions, not research teams, nor solitary researchers
• Similar process at CNRS in France.
• May be for a grant period.
• Perhaps finishing in time to present at a conference.
• Scheduling can also be important at the level of the individual scholar
• Generally necessary to travel, to research at other institutes, etc. - F8
• Data set held elsewhere.
• Getting access to relevant archives, places, where "stuff" is held
o Particularly important in art history and architecture.
• But this is the "old way" of doing things
• At the same time, the current way for many researchers.
o What are the current practices that might be facilitated by new technology?
• Provide alternate means of access to rare collections through technology and visualization
o Possibility of virtualizing some data sets and collections can play a key role in humanities scholarship
o Much art scholarship can be done with prints, as one day one might be able to do architecture using visual models.
• There still may be value in visiting the original
o Using library, using surrogates (like Google).
• Cultural restitution and reunification of artifacts -F8
o Visiting physical artifacts is a crucial part of research.
o Assemblages of sensitive artifacts can happen virtually.
• Codex Sinaiticus project, which digitizes sensitive materials so that they can be reconstituted virtually.
• Many of these tasks are part of research cycles that are not related to the creation of books
o Specifically in the output of audio/video.
o One may have to convert formats - standardize, via having things digitized
• Some early audio/video can simply not be played and need to be digitally redone/recreated.
• A/V formats which go out of date need to be digitized and updated more often than print sources.
• Highly involved with presentation and curation
o Do scholars preserve? Or is it done for them because of the scale?
• Where agencies and libraries come in.
• Resources may sit on shelves if in the wrong form
• Requires a funding bid to be put in the right form
• Scholar is required to demonstrate a scholarly need in order to transfer/change formats.
• Clearance of intellectual property rights.
o Usually an issue with images of any kind.
o Some sort of model licenses, and ways of facilitating this process.
o Obtaining access to an archive is a specific and older flavor of the same thing.
o Can gain access to archive, but what are the rights with the resources?
• Rare that scholarship is stopped by rights issues.
• Roles of who does what?
o Not quite what we're looking for at this point
o The idea of these workshops is to get multiple roles to the table.
• What are the IT activities as part of creation?
o Publishing in the broader sense - much of it is digital. Issues of repositories for storage and subsequent retrieval.
o Speaking specifically of the publication/product of scholarly activity.
• If one wants tools to help with scholarly activity, there needs to be expertise provided as a reliable and valuable service.
o Libraries do this, but do they do it in the right way?
• National libraries are there to preserve materials "forever", thus expertise is required to move from platform to platform.
• Often formats are those that aren't needed or desired by scholars. Continuing to deal with hybrid materials.
• Digital dump of raw materials that scholars can then sort.
• Often there are tools that are asked for afterward.
• What are the priorities in digitizing?
• Technology can create extra tasks which are gratuitous from the point of research
o Type-setting as an example.
o Part of the research process, copying everything.
o Making notes on something is so often copying bits of the things that you're reading.
o Transforming formats for conservation involves copying things.
o Prepare it for students, another copy
o A talk, another copy or version
o Perhaps a subset of copying.
• Incremental changes of versions in copying.
• Part of evolution
- [ ] ex4 -- the commonality
- [ ] serendipity
- [ ] narrowing in
- [ ] intuition
- [ ] obsession
- [ ] fuzzy finding, not just database finding
- [ ] precise finding
- [ ] incidents of evidence vs corpus
- [ ] personal collection -- a personal corpus bricolage. not supposed to be exhaustive/comprehensive -- materials that resonate with the scholar
- [ ] iteratively including and excluding materials
- [ ] consult with specialist
- [ ] consult with peer
- [ ] several tools for evaluating arguments -- disproving can be far easier
- [ ] nature of arguments is changing -- not just a matter of force or logic
- [ ] tools are limited -- can't replace a librarian, can't ask you what you mean
- [ ] evaluate each discovered item
- [ ] engage in disputation
- [ ] what is the recipe? Is there a recipe?
- [ ] Does methodology documentation count as research?
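The contrast between "fuzzy finding" and "precise finding" noted in this list can be illustrated with Python's standard `difflib` module. This is a rough sketch only: the catalog titles and the 0.6 cutoff are assumptions chosen for the example, and real foraging tools would need far richer matching.

```python
import difflib

# Hypothetical catalog of titles (illustrative only).
catalog = ["Gilgamesh", "Enuma Elish", "Atrahasis", "Codex Sinaiticus"]


def precise_find(query, titles):
    """Database-style finding: only exact string matches are returned."""
    return [t for t in titles if t == query]


def fuzzy_find(query, titles, cutoff=0.6):
    """Fuzzy finding: return up to three near matches, best match first."""
    return difflib.get_close_matches(query, titles, n=3, cutoff=cutoff)


query = "Codex Sinaticus"  # a misspelled query
print(precise_find(query, catalog))
print(fuzzy_find(query, catalog))
```

The exact lookup gives the scholar nothing for a misspelled query, while the fuzzy lookup offers candidates to evaluate. Neither can ask what the scholar meant, which matches the note above about what tools cannot replace.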
- Deciding on instinct and judgment, and associated with cost/benefit
-"Instinct" literally doesn't get us far; experience-based decision making
- What do you get by doing something that's hard to explain in a certain instance, but built into long pattern of experience
-Give students little model experiences > this becomes a basis to see what's going on
-Cylinder seals - as graduate student, before digitization, having to go through box in museum to describe these; if they'd had the luxury of digitization, they never would have had the most valuable insights
- Manuscript immediately tells you something about how hastily it was written; printed version is not going to convey in the same way
- You can tell the other story too, to see palimpsests that no human eye can see
- Making some heuristic judgments about avenues to pursue
- Think carefully where you are at every stage re: the question you're asking
- Kind of evidence presented to you, order it's presented, shapes how you shape a question or frame a problem
- Working from 19th cen. biography, hanging quotations suggests they left something out
- Working from high-quality scholarly edition from 5 years ago - not going to consider that as a possibility
- Some key piece of information, something's missing from evidence you have
-99% of the cases, surrogate is good enough
- If I'm using something like microfilm, and am looking for articles, how fast do you turn the reader?
- One of the things for scholarship: control in some sense for interactive technology support
- You don't get context on-line > Need to see page in relation to ads, juxtaposed with other articles
- Allows new opportunities to juxtapose things that physical version can't.
- Huge commercial sector that provides content for us
- Less/no control over, input into
- We're asked for input from commercial providers; trying to get that feedback from faculty/researchers/etc.
- IP rights management > Altering interface to stop you from misappropriating it
- Tendency of technologically mediated sources > more bite-sized pieces
-When talking about issues of research process, this kind of issue becomes important re: workflow
-Important to be able to share representation you're using
-Not implicit in the same way
-Citing electronic resources differently; move around differently
- Academic work should be explicit about judgment processes
- Footnotes tell a certain story, introduction makes some of assumptions clear
-At some level, you're supposed to make the analysis work without having all that laid bare
- One of my colleagues was asked why she wasn't explicit about theoretical assumptions: "I don't wear my underwear on the outside."
- What other clues do you use to assess whether things are worth pursuing?
- Write title and reviews, but often only Amazon
- Journals on JSTOR, can usually make a list of 5-6 decent reviews
- Interesting idea for Bamboo - hotels.com for reviews
- If it's a field I don't know 100%, I go see if people I trust are in favor of it
- Don't need to find it; unless you're doing an article on how something has been reviewed
- Cross-check validation: you have an instinct, but you can't go with your gut all the time
-You cross-check, try within reason to validate independently your judgment
- If my judgment is that I need to go to California for an archive, I need to convince someone through a peer review that I'm worthy of money for that trip
- Part of software development is to step back, find out how to explicate these steps and requirements needed to perform these steps
- Term "instincts" has raised some issues; "tacit knowledge" is better?
Exercise 4 Scribe Notes
Goal: Interdisciplinary Collaboration (S1)
♣ Define research questions
♣ Find collaborators
o How - social networks, publication, web presence
o Make it easier - a Facebook, LinkedIn, or Bebo-like tool could be useful (Proneto, Plaxo)
o Decision - Rely on who knows who (get a recommendation)
o Challenge - How to find experts outside of academia - "We need an extensible network that reaches inside and outside academia." (A1)
♣ Define common questions/research focus
♣ Establish common language and terms
o How - brainstorm, repeated explanations
♣ Develop shared methodology
♣ Decide what the output is - the onion (communication to different communities)
♣ Communicate with one another
o How - video conference, Skype (big impact on collaboration)
♣ Sharing of materials and documents
o How - Sakai, others (sometimes a problem to share across institutions)
♣ Collaborative editing
o How - Google Docs (good for versioning control which is critical)
"My major critique of Bamboo is that it doesn't sufficiently emphasize the diverse and multiple uses of tools and material, and the ability to pull different sets of data captured for different purposes into single sites for new insight." (A1)
"Best practices from exemplar projects - the September 11 archives for example - would have been a good place to start this conference." (S1)
"Often the problem finds us. You have a problem and people are looking for expertise and collaboration. Climate change impact on fishing economies, for example." (A1)
Archives and communities are not separable. We should be building tools to help use these materials. (A1)
♣ Trying to get people to use a common tool set - possible solution is to enable discrete efforts and enable collaboration
♣ How do we involve students? (S1) This is always an expectation/assumption in the sciences. How do we do this in the humanities? Very strong tradition at small liberal arts colleges.
Developing Scholarly Practices
Creating a Data Store as a Secondary Source for Research
NB This is an iterative process, well actually it's a Rapid Development Model
define question, inquiry
locating primary material
determine media forms - digitize as necessary, co-mingle from as many sources as necessary
store data in a generic form
define goals/features for the data store
define strata of metadata that allow for reuse and new tools
define metadata structure/ schema
building data model/ tag data
populate data structure
Comment/Review/Refinement Process - Hypothesis Testing - ITERATIVE
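The steps above (generic storage, strata of metadata, populate, then query and refine) could be prototyped with Python's built-in `sqlite3` module. This is a sketch under stated assumptions, not the data model the group designed: the table names, stratum labels, and sample records are all hypothetical.

```python
import sqlite3

# In-memory database: nothing is written to disk in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (
    id INTEGER PRIMARY KEY,
    source TEXT,    -- where the primary material came from
    media TEXT,     -- media form, e.g. 'image' or 'text'
    content TEXT    -- the material itself, stored generically
);
CREATE TABLE metadata (
    item_id INTEGER REFERENCES item(id),
    stratum TEXT,   -- e.g. 'baseline' (shared) or 'project' (local)
    field TEXT,
    value TEXT
);
""")


def add_item(source, media, content, tags):
    """Store one item plus its (stratum, field, value) metadata tags."""
    cur = conn.execute(
        "INSERT INTO item (source, media, content) VALUES (?, ?, ?)",
        (source, media, content),
    )
    item_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO metadata (item_id, stratum, field, value) VALUES (?, ?, ?, ?)",
        [(item_id, stratum, field, value) for stratum, field, value in tags],
    )
    return item_id


def find(field, value, stratum=None):
    """Return (id, content) for items tagged field=value, optionally within one stratum."""
    sql = (
        "SELECT DISTINCT item.id, item.content FROM item "
        "JOIN metadata ON metadata.item_id = item.id "
        "WHERE metadata.field = ? AND metadata.value = ?"
    )
    params = [field, value]
    if stratum is not None:
        sql += " AND metadata.stratum = ?"
        params.append(stratum)
    sql += " ORDER BY item.id"
    return conn.execute(sql, params).fetchall()


# Populate with two hypothetical records.
add_item("museum box 12", "image", "cylinder seal photo",
         [("baseline", "type", "seal"), ("project", "motif", "contest scene")])
add_item("excavation report", "text", "seal impression description",
         [("baseline", "type", "seal")])

print(find("type", "seal"))
print(find("motif", "contest scene", stratum="project"))
```

Keeping the baseline and project strata in one table lets later tools query the shared layer without knowing a project's local vocabulary, and each refinement pass only adds or revises metadata rows rather than changing the stored material.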
Developers can also build a view of the database - visualization instead of open interface
Black box and phased development
Interface to the data that allows us to "go fishing" (M2)
Interfaces that teach people how to search (P1) - pop up menus on the search bars to give users additional information (ex. did you want biblical or modern hebrew?)
How do you extract structure from the data stream?
What is the ordering principle you imply - relational database, human reading
Can span projects, institutions, cultures . . .
What we need:
Prototyping Tools - workframe, workflow, versioning
Data Mining Tools
Generic Data Holding Database
Scholarly Collaboration Practices
passing digital objects requires
same issues arise when researchers approach IT with a research need
how do you provide access to tools, develop tools, share time/$
People to People Squishy
2 - additional discussion on presentation
3 - part of survival in the ecosystem includes presentations and publications
4 - there is an energy flow within an ecosystem
Defining the problem
5 - how do we know that there is a problem?
1 - agreeing on a common text, artifact to study
11 - Valuing opportunities for reflexivity---the ability to stand back and assess aspects of one's own behavior, society, culture etc in relation to such factors as their motivations, origins, meanings, etc.
6 - in some disciplines, the first step is determining what others have done in a given area of work.
4 - within the classroom, the learning community is exposed to a rich soup of ideas
9 - going to the MLA for review of what others have done
6 - going to one versus many different sources
9 - the more interdisciplinary, the more likely the need to go to multiple sources to review the work of others and to determine if the problem/question has already been solved
5 - determining the kinds of data that are needed to answer the research questions
1 - the problems at the edge of the disciplinary boundaries
8 - the greater the interdisciplinarity, the more barriers and challenges that are faced
10 - from the scope of the work comes refined questions and an understanding of the kinds of resources (people, technology, etc) needed to take on the work. Complications that come out of working with other people who are either inside or outside of the same field
1 - articulation of the problem
11 - in a recent review of the concerns and opportunities facing the interdisciplinary research among the education and neuroscience communities, for example, Varma, McCandliss, and Schwartz (2008) conclude that "even if the four scientific concerns-about the commensurability of the methods, data, theories, and philosophies of the two disciplines-can be surmounted in principle, the four pragmatic concerns (i.e., costs, timing, control/esteem, and payoffs) suggest that doing so will be difficult in practice."
5 - something must be completed before it is indexed in, for example, MLA
1 - a lack of awareness of what is happening on a given campus. People are not aware of the work of their peers.
10 - heard people from one discipline state that they know more about the work of their peers from other universities than they do people from their same university who work in a different discipline
Discrete practices in articulation and presentation of work
8 - taking advantage of the complete resources of a technology and/or tool; multiple forms of modes and/or modalities
6 - conference paper, working paper
9 - increasing examples and work in/or visualization
6 - new forms and modes of argument made in
5 - resistance and/or capacity limitation to new media types 6 images for a 400 page document
3 - coins found at different places in the world
10 - new forms of scholarly output
4 - it has to be robust and reliable, cross platform
6 - how we publish is based on existing systems, but there is also the work that exists before the journal article is written
5 - the intellectual property that informs research and thinking, but cannot be cited directly within the publication
4 - the development of shared annotated bibliography
8 - the durability of the form, medium
10 - resistance to new media types
2 - the reuse of materials and resources
1 - the ancillary information and data that are made on the way toward publication
6 - it depends on what has been made. Is it a database, notes from a conversation. Killing the messenger when you excavate. Is preservation part of the process?
10 - questions and issues related to ownership
8 - digital resources are harder to define as property than physical artifacts
Two practices to analyze
2. Gradual release of information
S1 when looking at discovery it's useful to consider how you start a project now vs how you might have done it 50 years ago. Even for how you frame the research question - not just for how you approach the discovery task
R1 gives example of a question that can only be addressed through text mining
T1 As relates to the practices, gives "guides grad students to initial topic or research question"
D1 when doing literature searches may be biased toward electronic resources, away from print
T1 back to the tasks
D1 * Find subject matter expert, including reference librarian
W1 * help identify keywords and search terms, procedures for doing literature search
H1 * new modes of discovery speed up procedures. The literature has expanded, but the discovery has speeded up
R1 20 years ago, I was able to review the whole body of literature in MLA subject index. Couldn't do that now - can't say now that you've done an exhaustive search given that so much is available
S1 it's all getting better but it's also getting more difficult
R1 now with full-text searching, everything is a keyword
H1 does the super-abundance of information make things better?
D2 adds more noise to the system
W1 some applications have very sensitive context-sensitive searching including proximity searches. Had a research task that couldn't be addressed by existing tools; had to modify how the tools (in this case the search) worked, and the results of that modification made it possible to do new forms of research.
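As an illustration of the proximity searching W1 mentions, here is a minimal sketch in Python. The function name and the word-gap definition of "proximity" are assumptions for illustration; the application W1 used was not named and may work differently.

```python
import re

def within(text, a, b, max_gap):
    """True if words a and b occur within max_gap word positions of each other."""
    words = re.findall(r"\w+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == b.lower()]
    # Compare every occurrence of a against every occurrence of b.
    return any(abs(i - j) <= max_gap for i in pos_a for j in pos_b)
```

Changing max_gap is the kind of small modification to how a search works that can change which questions a researcher is able to ask.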
R1 new tool called "wiki professional" helps individual researchers find projects that use similar tags; http://www.wikiprofessional.org/portal/ "concept web" linking
H1 need taxonomies
D1 traditionally would look to Library of Congress as the "name authority," but they're not doing that any more. (identity management). Have seen the rise of folksonomies at the same time
R1 Library of Congress has recently started inviting the public to identify pictures
T1 do taxonomies arise in other walks of life now?
Replies... "The semantic web"; categories used different practices - e.g. the term "contract" in Law;
H1 taxonomies can help clarify things
W1 Have started to abandon taxonomic based search in favor of intelligent searching. Contextual filtering - not taxonomy based
H1 used to look at an index to find things; now use search
W1 an index tells you how the author thinks about their own work
D1 how you do tags will bias how search works
A2 would rather use a full text search than an index to find a subject
T1 building taxonomies may be valuable but has a lot of overhead; would want that to be done close to the author. The value of the index varies according to what field you're in
H1 the index or tags or annotation shouldn't be static. Should be able to be expanded over time - extensible.
H1 what tail is wagging the dog - scholarship or funding?
W1 in some areas there is no funding
M1 availability & interest of funding will vary over time; there can be "fashions"
D1 NEH money can drive the agenda into certain areas over time
R1 funding will drive what gets digitized, and that will affect what's available.
D2 the issue of making indexing extensible over time is very important. Some of the rules that bias search results are not very apparent
R1 it's important to understand search engine optimization in a google world
T1 SEO can be looked at as "gaming the system," but it can be very useful
T1 rather than presenting the "unpacking" as a list of tasks, will present it as the conversation thread that unwound.
T1 the tools and issues of research are legitimate subjects of research
S1 we're now having students producing very valuable technology and research that "doesn't count" for anything
D1 why is there such resistance?
S1 the resistance comes from senior faculty who want to produce people who are like themselves
H1 the work that students do makes them more valuable in the market place, but not necessarily in the academy
W1 some graduate students get hired based on their experience and capability with digital projects, but ultimately not promoted because they worked on digital projects and the "old farts in charge of promotion" don't value that.
R1: Did we make a mistake? It's not using tools, it's not associations, we're doing that to discover something.
T1: What's an item? Take it out of context, and you lose a lot.
-You can create context through these associations
-There are use implications beyond discovery
-Context makes all the difference for understanding it.
R1: But what problem are we trying to solve? What are we trying to accomplish?
-Discovering things that are relevant to answering particular questions
T1: To me, it meant discovery and building the connections that make those associations > either would be valid as a scholarly practice
R1: The associations topic is the most complex and interesting
S1: How about warming up on controlled vocabulary/authority files
F1: Other groups discussed what scholars in the Humanities actually do, what they're researching
-We've moved quickly to how to solve the problems, and not talked about the concrete issues that we get
-I see one group focusing on the interpretations of text > for them, it's important to get to the best texts, compare them, the technology is more secondary
-For historians of literature, the first group relies on them
-Important to establish context (what was going on at the time, same time, in that area), what themes does this artwork pick up, what impact does it have, what is the background?
-Here, technology has a lot of potential
T1: As you get more fine-grained, is this text the same as that one? Two versions created sequentially? Etc
-In library community, we've tried to classify intellectual works > expressions > instances, etc.
-Naming those relationships is another practice within that
F1: In Germany, they do a lot of editing
-Using technology to show various levels in the textual structure
-First version, all the way up, can click back and forth > this is going on for earlier periods
SM: How about starting with tools that create association? Discovery and context.
-What things do scholars do that fall under this theme?
D2: Let's drop use of "tools", and just say "making associations"
S1: When you're interpreting a text, you feel you need to immerse yourself in the context?
F1: That's a specific interest of mine, discovering whatever the text suggests
-Each text has a different demand; geographic might be more important than political context
-A famous work of art has antecedents, trace back its influences
SM: Discovering categories of context as a way of beginning
S1: Don't humanists begin by reading everything everyone else has ever said? Literature review is basic.
R1: We have a lot of bibliographic tools, and they don't accomplish what I want: finding things not as structured by pre-existing description
-If I want to find something relevant to my subject, if I go to a bibliography, only a certain # of possibilities for finding the object
-I think of the word, and the word is in a title/abstract; or bibliography has a structure, and I figure out how it fits in
-These aren't satisfactory; multilingual issues, vocab may not be the words used in the title (synonyms)
-Bibliography structure reveals mindset of bibliographers maybe 100 years ago
T1: But you still go through the process of looking at those
R1: In a full text world, you should be able to do better than that
-Large area of association making by software routines allows people to look in different terms
SM: What you want is to make associations in your search of secondary literature that haven't been made before
S1: You want to start by reading the "important stuff"
-How do you define important?
-Is the "cited a lot" algorithm enough?
R1: It might be the reverse - an inverse citation analysis?
-Ideally, it'd allow me to tell it what it should look for
-I might be saying "ignore citation, look for frequency where things are mentioned"
-User-customizable and trainable so you develop your own profile
-Suddenly tell it "But today, I want citation heavy!"
R1: Not new connections, necessarily, but relevant to what you want to do
-Just find them, and I'll decide what's important
T1: Citation works for some things, he's got different needs on different days
-Need something that's morphable over time, software that works with you to define different metrics
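The idea of a ranking the user can re-tune from day to day can be sketched as a weighted score. This is a minimal sketch in Python; the Work fields, the weights, and the sample numbers are hypothetical, and a real tool would draw these counts from citation and full-text indexes.

```python
from dataclasses import dataclass

@dataclass
class Work:
    title: str
    citations: int   # how often the work is cited
    mentions: int    # how often it is mentioned in full text

def rank(works, citation_weight=1.0, mention_weight=1.0):
    """Order works by a weighted sum the user can change at any time."""
    def score(w):
        return citation_weight * w.citations + mention_weight * w.mentions
    return sorted(works, key=score, reverse=True)

# Made-up sample data, for illustration only.
works = [
    Work("A", citations=50, mentions=5),
    Work("B", citations=2, mentions=40),
    Work("C", citations=10, mentions=10),
]

# "But today, I want citation heavy!"
citation_heavy = [w.title for w in rank(works, citation_weight=1.0, mention_weight=0.0)]
# "Ignore citation, look for frequency where things are mentioned."
mention_only = [w.title for w in rank(works, citation_weight=0.0, mention_weight=1.0)]
```

Storing the weights per user would give the "profile" R1 describes; the scholar still decides what is important among the results.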
F1: Internet is no longer usable after a certain point for my research
-Can get some kind of bibliography, but then you have to read articles/footnotes, constantly go back to originals, which aren't available on-line
-These are temporary measures at this time, a first step
R1: But what if everything has been digitized? What tools will you need for that?
-At that point, have to refine what you're doing, ask more interesting questions
T1: Interesting if you develop techniques where you can sense that there's a gap of stuff that needs to be brought on-line sooner rather than later
-Start down these roads learning new ways to do it; leads to machine intuition, etc.
S1: Citation tree, "Great Eskimo Hoax" (100 words for snow) traced down to a single statement
R1: Who cites this scholar in relation to this set of vocabulary?
F1: Is there something we can contribute to focus on reliable information?
T1: As a scholar, you do that. Weeding out is a practice. You can get false associations.
R1: "Was this review useful to you?" - "No, he got this totally wrong."
-Electronic posting; next user can find that and comment too
C1: Being able to distinguish peer-reviewed content
T1: Idea of finding and identifying the antecedents of a work
R1: Genetic concept of scholarship
-Should put a bibliography for any particular inscription (figuring out which goes back to the stone, which goes back to secondary sources, etc.)
T1: Part one, finding the things. Part two, ordering them.
R1: Within those antecedents, it's a recursive operation.
S1: Your tool over time would teach you to rely on certain scholars; also be able to find their comments
-Folksonomies where anyone can comment > how useful is that?
R1: You might want to block comments in Chinese, by certain scholars, before this date, etc.
T1: Do you look at the versions that came after, as well as before, what you're looking at?
R1: Depends on the questions you're looking at.
C1: Collecting comparanda
T1: Finding what a work influenced
-Issue of similar-to; tricky computer task of figuring out when you have the same thing
R1: Especially if they've used an OCR engine; digital identicality is zero
T1: "Recognizing duplicates"
R1: The same article published in two different places
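Since OCR output is rarely byte-for-byte identical, recognizing duplicates usually means comparing by similarity rather than equality. A minimal sketch in Python using the standard-library difflib module follows; the 0.85 threshold is an arbitrary assumption, and the normalization is an ASCII-only simplification that would need rework for non-Latin scripts.

```python
import re
from difflib import SequenceMatcher

def normalize(text):
    """Lowercase and collapse punctuation/whitespace so layout differences do not count.
    ASCII-only simplification: non-ASCII letters are treated as separators."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

def similarity(a, b):
    """Similarity ratio between 0.0 and 1.0 on the normalized texts."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def probably_same(a, b, threshold=0.85):
    """Flag two passages as likely duplicates; the threshold is a tunable assumption."""
    return similarity(a, b) >= threshold
```

Character-level comparison like this is slow on long documents; it is meant only to show the "similar-to" idea, not a scalable method.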
F1: Undesired material - sounds like censorship
D2: Maybe "set aside", not "weed out" - you might want it later
R1: It's a question of your information management tools
S1: Spam filters - you tell it what's spam, and over time it tries to learn
-Is there any possibility like that?
SM: You can imagine the same type of tool > "this is interesting, this isn't."
R1: Spam filters are just the negative version of recommender systems
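The "this is interesting, this isn't" tool can be sketched the same way a basic spam filter is built. This is a minimal sketch in Python of a naive Bayes classifier with add-one smoothing; the class name, the two labels, and the whitespace tokenization are assumptions for illustration, not a description of any existing tool.

```python
import math
from collections import Counter

class InterestFilter:
    """Tiny naive Bayes text classifier with two labels: 'interesting' and 'not'."""

    def __init__(self):
        self.word_counts = {"interesting": Counter(), "not": Counter()}
        self.doc_counts = Counter()

    def train(self, text, label):
        """Record one example the scholar has marked with a label."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def classify(self, text):
        """Return the label with the higher smoothed log-probability."""
        vocab = set(self.word_counts["interesting"]) | set(self.word_counts["not"])
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Smoothed log prior, so a label with no examples does not give log(0).
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(counts.values())
            for word in text.lower().split():
                # Add-one smoothing with one extra slot for unseen words.
                score += math.log((counts[word] + 1) / (total_words + len(vocab) + 1))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Each call to train is the scholar telling the tool what is or is not relevant, so the filter drifts toward that scholar's profile over time.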
T1: Recognizing that what you have is part of something else
R1: Related to multilingual and controlled vocabulary > ability to recognize something in X language is related to something in Y language
C1: Reviewing secondary literature, could be in whatever language
F1: We have the text in English, but scholar recognizes that name is totally different
D1: Cross-cultural or cross-stylistic relationships
-Identifying Appalachian music comes out of British tradition
-Same thing with other forms of art
S1: Recreating/modeling environments?
T1: This has a lot to do with pedagogy, too.
F1: Cuts across disciplines
R1: We talked only a little about human factors, not so much scholarly practices but a "how do people behave" form
-Is that against the interests here?
S1: We talked about applying for grants over lunch
T1: I don't know how to express representation as a scholarly practice
C1: How scholars manage their data sets?
R1: If data sets are part of what scholars do in the future, this is relevant.
-How creating/tagging data to allow the research we want to happen is a scholarly practice
Representing/Recreating/Modeling - scholarly practices and what outstanding issues are there?
C1: Create databases, or at least schemas
T1: Could create schema for tagging set of documents
SM: Most important aspect is that it's plural; it's not One Ontology
D1: Create controlled vocabularies?
T1: Is there a sense that you've found the right model for something (natural model), and things fit naturally vs. having to fit?
C1: You generate hypotheses that stand until you come up with counterevidence
T1: Do you iterate your models, go through testing?
R1: Yes, there's an iterative process
-Analyzing things, figuring out what elements you need > this is intellectual activity
-Only scholars can do it, almost always in dialog with people who think more systematically
C1: Iteration = "tweaking"
R1: Formulation, testing, reformulation of hypotheses
-Not sure if it leads anywhere when it comes to digital tools
T1: If you're an IT person building a system to support a model, and you're told a model is fixed, you might build it one way vs. a flexible model, build it differently
SM: This is fundamental for technologists to understand about scholarly practice
R1: Fields change their interests
-25 years ago, fewer questions about physical object, where it came from, handwriting, etc.
-There's been a fundamental shift in how people think about these things.
S1: Linguistics model that tries to form the past tense in English; but not those kind of models
R1: Might or might not use digital technology; unlikely technology would be formulating the hypothesis
S1: Descriptive vs. predictive models?
R1: I've never been predictive.
C1: I've done predictive; did a search for the predictive feature, then went out searching for what it predicted
-Model and hypotheses
T1: Both descriptive and predictive models
-"All Victorian novels can be marked up this way" - try to find one that doesn't work with it
C1: We haven't talked about products.
SM: What are some of those products?
C1: Journal articles, monographs.
S1: Physical model, descriptive model.
-Had to go out and get additional information beyond what they were interested in in order to finish their product
SM: A process of enrichment of a base model with layers of associations
F1: Fascinating to reach out to an unknown area; interested in what other scholars are doing in these areas
T1: Relatedly, what scholars have to do to transform between models
-Enriching by looking at what else is out there
C1: People are creating GIS in my field for their archeological project
-As we get more projects, we get patchwork quilts of databases that don't talk to each other
T1: Pre-digital: one group had one model, another had another model, needed a collaborative way to transform to understand both sets of information
-In the sciences, you see this a lot
M1: It gets to be too much for the library to manage. The schools take a lot of responsibility for faculty projects. Projects come to the library after their work is done and they see the library as a vector for exposure, which creates a problem of stewardship for the library.
P1: Library issue: question of control. Strategic decisions about collections may not be made by librarians.
S1: faculty make the decisions about what to buy at our university.
When libraries buy books, they buy the container, not the information. In the digital world, construction is different but the practice is not that different, we still buy the container.
M1: but we could make some commitment to serve longevity of that container. At what point does the digital stuff get handed off to the library?
P1: we need to make sure that IT is added to the faculty+library list
M1: when a publication is a series of editions, at what point is there a variant?
A1: paper is patient in the way that digital data is not. a monograph is roughly "done"; a digital monograph is not done because new containers will make people want to do new things with it.
S1: there's a cost issue. with print, cost is about lighting, air conditioning, shelf space, circulation. in digital, costs have moved. equipment, staff skills.
A1: is digital cost comparable, higher, lower?
S1: cost is similar but types are different. costs of maintaining access to the electronic; things break. print stays put unless stolen.
N1: stability has value (print). digital can break on no notice. social stability of citation to print vs. references to digital resources which change or move. social effects: print has a certain authority. author, press, costs to produce don't have clear digital analogs. digital dramatically different from print in important ways. not sure how stability translates into digital. comparison of print revolution to digital revolution: less technological than social. scholarly apparatus, not a technical apparatus.
T1: shared agreements for what one will do with digital works once completed.
project "born digital" which incorporates the release (edition) date into the URL, allowing for future releases not to displace the past ones
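A release-dated URL scheme of this kind might look like the following minimal sketch in Python. The base address, the project name, and the path layout are hypothetical; the actual scheme used by the project mentioned was not described.

```python
from datetime import date

def release_url(base, project, released):
    """Build a URL that embeds the release date, so each edition keeps its own address."""
    return f"{base.rstrip('/')}/{project}/{released.isoformat()}/"

# Two releases of the same (hypothetical) project get two stable addresses.
first = release_url("https://example.org/editions", "inscriptions", date(2008, 4, 1))
second = release_url("https://example.org/editions/", "inscriptions", date(2009, 1, 15))
```

Because the later release gets a new address, citations to the earlier one continue to resolve to the text that was actually cited.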
N1: decision of a librarian to buy a book is a filter. tools to decide which digital resources to use (reviews) don't exist the way they do in print. bamboo could serve as a sort of gatekeeper; out of fashion but useful.
T1: example of bryn mawr classical review inviting submissions of classical work for e-mail review
A1: what do i know my colleagues care about that i need to pass along to the bamboo people?
what does "light reading" or "being a light reader" mean?
P1: 1 how does F/L/IT collaboration work for infrastructure, collection building?
2 how does F/L/IT collab. work in individual research in terms of life cycle of consult/create/preserve/disseminate?
3 how does individual researcher interface or interact with 1 and 2 - with the effort to create infrastruc/collections, with the life cycle?
A1: are you asking how it works or how you'd like it to work?
P1: taking it as given that one of the things bamboo ought to be doing is saying that we can tie three different groups together. "faculty" so diverse... "chinese" isn't the problem, non-roman script is. unicode.
M1: do faculty care about infrastructure building?
P1: some do, they're here. for them, we agree having faculty involved in infra. building is useful.
not many people are interested in name authority files (all names, all place names in history)
interaction is the process by which an individual does work, they have to consult with tools and infra. available; may need to learn to use; may create something worth keeping. may not be more infra., may be articles.
faculty may want to do something but not know how; common problem today
there are processes in a digital context for bringing work through the whole cycle. "this is my project, can you help me with design?" response: "what are the data, processes you need? can we teach you to do them or do you need us to do them for you?"
W1: there are thresholds (size, popularity) where some things rate curation and some things don't rise to that. some things won't have a URL in ten years.
P1: 2 is a central process. at this moment in history, we are still trying to work out the means to work in a digital environment. justifies level of collaboration in 1. individual effort in 3 is largely ignorant of 1 but most work within 2.
A1: where do we want bamboo to apply on the 1 2 3 continuum?
M1: is this a place where distinction b/w teaching and research matters? teaching won't get the kind of attention of research.
T1: state campuses, teaching will get more attention, resources.
P1: teaching draws on these infrastructures (image collections). should get more attention than it does.
crosstalk re: one off/not one off projects - web site building for teaching
P1: reinventing the wheel is not a good metaphor. novel, government constantly reinvented; let us help you build your own wheel.
if we give you the tools to build your one-off web site, we want the ability to harvest the metadata to make your project easier to preserve.
A1: funder has said service oriented arch. is useful way to go about things. "how would SOA fit in with your discipline?" there is tool-building involved, but tools come under force multiplier effect.
S1: with shared tech available to support a science, what are the gaps for supporting humanities?
T1: we assume the sciences have this sorted out, but for example geospatial computing is having a format war, still. people ask of bamboo, 'how many vendors are there and what are they trying to push on people?' you can understand where the funder is coming from because they see projects reinventing same infrastruc.
A1: possible role for bamboo: for a project to be included, might have to pass a certain test for metadata and structure, so that work could be gotten back out later after being admitted to the canon.
MERLOT.org: repository of learning objects created and reviewed by peers.
P1: if i'm interested in a project, i would like to know what exists. ex. biographical databases.
N1: the way these projects get done there's a whole series of stakeholders. not just faculty, librarians; IT are also stakeholders, not servants. their job needs to be interesting, and building something new is important part of that. how can we satisfy all the stakeholders?
P1: number of faculty, IT people interested will be small.
N1: IT may be the majority, they have a stake.
P1: grant that; will librarians cede online repositories to bamboo? text data is in library, online stuff not in library by and large. lots of it is outside. should there be a national consortium involved in maintaining repos? ought bamboo to be helping librarians figure out how to maintain these locally?
A1: at UofC, library is involved in maintaining these (ARTFL)
W1: what is the promise of NINES? who supports?
P1: funded by mellon,
M1: NINES was to answer the peer review problem for 19th c. literature projects; was to become a publishing platform, then it constructed tools
P1: would regard NINES as example of collection building, infrastruc. not necessarily research product though its tools were.
maybe library would acquire those tools
P1: who will be responsible for all ways all these media will be joined and made accessible? library might have different baskets for jpegs, tiffs, etc. library can preserve the object, who indexes for access?
A1: in SOA that would be done on a case by case basis, library not responsible
N1: isn't that the problem? if goal is for re-use, maintaining at faculty level...
A1: objects maintained in library or such, mashups would be handled by consortia.
P2: can db access be done in a standardized way? valley of the shadow: 1990s presentation, not interactive particularly, outdated, could be done better now. images etc. independent of indexes. would be nice if humanists had a space where all that was preserved, could be revisited, rebuilt, recreated. amazon S3 model: content-free infrastructure that humanists could fill with content.
N1: there needs to be a social aspect. in order to justify building a project around data, need social mechanisms of assigning credit, need to ensure that data will be there, that interfaces will be stable
P2: collectively that's the value bamboo can bring.
N1: sure but it's a social function not tech, but that's easy
P2: it's not that easy. example of data served from under desks.
P1: under the desk, legal rights are clear. financial responsibility: 2500 dollars/year to house a body of data?
P2: harvard has made computing a core responsibility in the sciences not necessarily in hum. can bamboo help there?
A1: we decided not to apply for NEH grant for HPC because we can do the work on an 8-core Mac. what about lots more documents? well we don't have that many and we can't share due to licensing. couldn't find application for HPC.
N1: another stakeholder is grant institutions. bamboo might be able to provide some long-term stability for funding. why did govt fund HPC? hum. doesn't design nukes. govt. won't fund hum.
P2: need for HPC is rare in hum., can do a lot with disk space and apache.
N1: bamboo could be a clearinghouse for funding opportunities.
P1: practically, what could we ask of bamboo at this moment? figure out how social infrastructure could work?
W1: big question but small part of it: somebody (libraries) should provide a service that allows lowering of threshold bar for making repositories(huh?)
T1: example of access to nyu lib. architecture built in abstract ways. built for afghan libraries, now looking at using for collections of images.
A1: successful consortial efforts: JSTOR, ArtSTOR, interface not what i'd want but saves our library a lot of pain. can these be seen as analogs at a different scale for collaboration facilitation?
P1: jstor, artstor... what other stors?
A1: there's not a lot of humanistic methodology embedded in jstor. a title is a title.
mining a theme: what does it mean to be a "light reader"?
P1: distinction b/w physical and digital: access, search, filtering all change. light reading in digital context introduces a whole set of new possibilities.
A1: example of google books
N1: marginal notes, highlighting, post-it notes; you can't browse a database. you can walk around a library, spot new things easily. db doesn't lend itself well to that.
P2: i browse amazon
N1: i surf amazon. you have to search.
P2: aren't there instances where the library classification scheme surprises you?
T1: taking a set of articles and looking only at the first and last paragraphs. abstracts.
scanning for main points
N1: mining footnotes, you can flip through a book looking at the bottom inch
A1: i can skim a film, can't do that with the first couple pages on amazon
A1: OCR loses the structure of the book. especially important in journals.
N1: clickable annotation, 'pentags' is slow
P1: taking notes
A1: speed is an issue in light reading. waiting is bad.
P1: this is all stuff we do with text in a physical env., how change in digital?
Description Resistant Material - what is involved? What are the problems and issues?
J1: Technology not there to look at a film and tell you what it's about. Only scholars can turn nuance into meaning. I'm a technologist - I look at the meaning online and implement it. Coming to terms w/ metadata and people's interpretation of what it means.
T1: Extended metadata - what is that? What do people need to know about these materials? Key decision. Hard to go back and change that decision. Key moment to define your ontologies.
S1: Creating analogies is a first step.
G1: Describe unfamiliar in terms of the familiar. Must know field you're speaking to.
A1: Process of writing is unconscious, but you are imagining who the audience is. Framing it for that.
T1: Digital does a bad job of doing that; think of themselves as the audience.
A1: At some stage you are the audience; has to be an oscillation between the states of self referential and public state.
C1: A dialogic model.
A1: Look at a painting - look at the strokes then move back to look at the whole.
T1: Shifts in balance toward other people.
G1: To what extent do we choose an audience? Done workshops for graduate students and faculty on how to get published. But History projects - catalogues of baseball cards, blueprints, and posters.
T1: Doesn't really serve scholars' aims all that well.
G1: Sifting has to be with users and use behind it. Good at some cataloging and not others.
R1: Can you walk us thru something you've done?
K1: Rock art - 40,000 sq km of unexplored rugged terrain, many rock paintings everywhere, most of which never seen for at least half a century. Never enough funding to systematically record. 90% of opportunity is by happenstance. Need to quickly and serendipitously come up w/ a way to record this. Need consistency in recording, with a system that is flexible, but to give sufficient consistency so that records are valuable. Used a chief man in this area, a scholar of rock art. Had never worked w/ this in a digital form. Either selecting or devising schema for metadata. Always problematic, never possible to come up w/ something - always at the exclusion of something else. More development needed. Schema came from written records from chief, but his was a singular interpretation and has proven incorrect. Also needed land forms, vegetation, and rock art conservation. Had to embrace the perceptions and conceptions of the local Aborigines in the area, frequently entirely different from what we were seeing. Concepts as an English-speaking person were entirely open to misinterpretation once you tried to translate into these dialects. A notion like watershed - translating that as a concept was very difficult.
T1: Seeing analogies - public historians use the term "stakeholders", and maybe it's about defining stakeholders' needs. Aborigines probably not users, but are stakeholders in this project.
K1: We have to imagine them as potential stakeholders.
T1: Different than writing a journal article for subscribers to that journal and their students.
C1: Creating the discourse, coming up w/ concept map, the set of terms, the mapping of the way concepts get explained. Reason they are resistant is because we don't know how to describe. Have to start imperfectly and build on that.
G1: Ambiguity came up a lot. Sciences - goals are discovery, description, solution, there is a resolution. In English, it's ambiguity - don't want to "solve" Shakespeare. Closest thing to description of a project is the methodology involved. Not about solution.
K1: Take into process the potential that you might have to change things. Don't get it right the first time.
T1: Machines don't like ambiguity but we revel in it. Key problem.
C1: The way we conceive what the computer is capable of. Much as I like grammars and treebanks, they founder on ambiguity, whereas stochastic and statistical models do not.
A1: Two directions here - 1st is can we get machines to do the work for us? 2nd is can we use machines to help us do this? Given the existing and near technologies, how can we use them to help us now?
R1: My inclination is to come back to: how do you choose an audience? Can we find something to help us choose an audience?
S1: Can we get an audience to come to us?
T1: One thing uncommon is tool building. Here we are talking about data driven projects. There we do have an open source model - audience contributes code. Maybe there is an analogy from open source.
R1: What is significance of audience we are trying to catch?
G1: Intentional abstraction. When it gets repurposed, a lot can be repurposed, when they move outside of that what happens? We have the tools now to do annotation after the data is out there so people can choose the data. Don't know what that requires of the data. Machines are literal-minded, but now you can search for shapes on Blobworld. "Dogs must be carried on the escalator" can mean two things.
C1: Point at which it's describable we know what to do. These clumps of material come before the audience. The audience is that researcher trying to come up w/ the concepts. At certain point it may be ready for exposure. Getting a handle on what those tools look like.
K1: Photography, in a practical sense. GPS. An assessment of what's there in terms of its imagery. The item itself - you might find a rock shelter w/ 5 or 50 paintings, maybe layered. See much better with the eye. Working out what's there. A lot of guesswork at that point. Slipperiness of writing down descriptions.
R1: Where does validation come in?
K1: If it's important, if we want to come back. There might be a sense of ritual reasons. A value judgment is made here.
T1: In English and History, more cataloging is done.
S1: In terms of audience, I worked w/ engineers who instrumented (made jet engines) because you couldn't anticipate what was going to go on. It was brainstorming, anything that could go wrong. Blackboxes - you record what pilots say to know what's wrong w/ the plane, but not that people would want to hear those conversations. You collect more data than you could justify - not sure why it doesn't apply here.
C1: Hermeneutic circle - susceptible w/ technology. We have certain tools that give us what we want to do but there's stuff that doesn't fit into those categories. Easy to get seduced into gathering data, instrumenting things because we can.
R1: Sifting, elaborating, annealing. How do you sift a piece of art?
T1: Done twice, on the level of the item and then on a larger scale. What kind of data are we interested in? Sifting happens over and over again.
C1: This painting is an excellent example of "blah" but it's different. We come up w/ an imperfect description, this is interpretation. Something is different, unique in here. Always reassess. Grinding it down, refining it, elaborating. That's where the audience comes in.
T1: That's an ideal process, but not sure that always occurs. Start w/ Dublin core, add some fields, and use it. Find out later it doesn't work, but you don't fix it, that's what we've got. Sifting happens too quickly.
C1: Groping toward tools, the sifter. Shouldn't be so much trouble to go back.
K1: Major shift in last 5 yrs in these paintings. Whole body of contact rock art - depictions of guns, dresses, hats, pipes. Huge interest now in this, so there's a resifting that's going on.
J1: Tools and services that can grow and resift. Planned growth.
R1: A built-in assumption that it's an imperfect description.
S1: Need checks to make sure you haven't left something out.
C1: Automatic classification of the '70s. How do you let a machine help you define something?
S1: Ambiguity could be developmental---evolution.
R1: Begs the question: do we need to annotate what our assumptions are?
T1: Also something to help anticipate future reuses of the product.
C2: Is it part of a refinement process? Reclassifying and moving forward. What boundaries drawn around these objects? Who has access to refining?
A1: Idea or need for a particular tool, but I am illiterate. Someone who can hear or understand my needs.
T1: We have a developers' marketplace to connect w/ nonprogramming users. Db projects tend to look to conventional scholarly and library practices. Would be useful for a lot of them to look toward computer science practices for these projects. Problems you run into are ones programmers have been dealing with, version control, or iterative programming. A way to connect ourselves to those kinds of practices.
C1: How much are those categories from that environment? (Software engineering or project mgmt practices)
T1: Bamboo could provide manuals to do this.
C1: Is there a pedagogy, a defined set of classes, to create humanities computer majors?
G1: Georgia Tech is addressing this w/ tracks - one is close to this.
R1: Seems like the tools are there but the money isn't.
C1: Identifying the stakeholders, or the ability to have a network of those.
G1: In The Wealth of Networks, there's a chapter about people who live to write a chapter for Amazon, or debug Wiki, for nothing. They have some stake in it.
G1: To define communities evolving, you can define gates into communities. Not just the sifting, but a crumb trail to track process of research. That's also called monitoring. May have to happen. Gating: Slashdot, Salon, you develop credentials by saying cool things about cool things. Ways of allowing technology to evaluate participation in the process.
C2: Don't you think that will naturally happen across the board in research?
G1: Surprised by how little happens there.
C2: Research platforms exist and are being put to use in other domains.
T1: Gating - open source software development. Mozilla has developed elaborate gating mechanisms for who gets to contribute to code. You submit a patch, you get "street cred." There are hierarchies.
C2: Like game theory, you collect badges.
T1: Code will fork if people get unhappy. That's how Firefox was created.
C2: SVN - Subversion is a code repository - teams come in to contribute. Protects the project.
C1: Wiki created version control w/ text.
T1: People are using Subversion to write their manuscripts. Compile the code at the end to get the manuscript. Those practices are widespread and highly developed.
C1: Absent tenure, it is peer review.
T1: Not linear, happening concurrently.
C2: Collective sense making.
General: all humanities scholarly outputs are potentially readable by someone in a general audience, at some point
How will other systems access data, objects, metadata? OAI, Google, ORE
Dissemination: conference presentations, journal articles, books
Can be differences by disciplines
But scholarly publication is giving way to online work, so presentation tools matter (Dobin)
Tools collapse, data and presentation (Dobin)
Informal dissemination (blogs, podcasts, etc)
Case of very expensive book published online, but inaccessible
Audience as participation, co-author
IP issues: loss of control, CC
Expanded youth audience
Major worry about control: Vinopol
So, a Bamboo platform needs levels
Importance of exit forms for wikis and blogs
So roles, too, for people in this platform
Unsworth: build IRs that point copies, publications towards other platforms
Discussion about defining a repository
Users decide about levels of
-lifecycle, timeline differences between technology timeline and researchers' chronologies
-metadata story about people adding terms offensively (not deliberately)