This week we'll discuss Electronic Lab Notebooks (ELNs). We'll explore trends in ELNs and the issues associated with them, and discuss how ELNs and the researchers who use them relate to (or require) IT infrastructure for research computing, data management, and other resources or services that Research IT and/or other campus IT providers support.
Facilitator: Patrick Schmitz, Associate Director, Research IT
Background material for review prior to our 8/27 meeting:
1. Research tools: Jump off the page is a good discussion of why ELNs are interesting and some recent developments. References UCB's Carl Boettiger, as well as Carly Strasser (formerly of CDL, now with the Gordon and Betty Moore Foundation), et al.
2. U. of Utah ELN guide has some useful intro material, comparison of commercial solutions, etc.
3. Evernote for scientists article about use in Astronomy takes a more open and ad-hoc approach.
4. An argument FOR electronic lab notebooks includes an interesting twist and discussion on risk and management of ELN data. Worth reading at least the first few paragraphs of the linked “How My Mom got Hacked” news item for context.
Additional resources (optional reading):
5. Discipline-specific Tools: Science: Electronic Lab Notebooks (ELNs) links to some good resources.
6. Scientists Embrace Openness addresses the virtues of openness.
7. Project Jupyter: Computational Narratives as the Engine of Collaborative Data Science grant proposal (written by LBNL's Fernando Perez, a BIDS Senior Fellow, and Brian E. Granger of Cal Poly) includes a discussion of the benefits of open notebooks and Computational Narratives.
8. eCAT: Online electronic lab notebook for scientific research review of one commercial solution.
David Baxter, Jepson Herbarium
Steve Carrier, School of Education
Jason Christopher, Research IT
Aaron Culich, Research IT
Quinn Dombrowski, Research IT
Rick Jaffe, Research IT
John Kratz, CDL
Susan Lin, Linguistics
John Lowe, Research IT
Steve Masover, Research IT
James McCarthy, SSL
Becky Miller, Library
Scott Peterson, Library
Aron Roberts, Research IT
Patrick Schmitz, Research IT
Stephanie Simms, CDL
Elliot Smith, Library
Ronald Sprouse, Linguistics
Camille Villa, Research IT
Perry Willett, CDL
Jamie Wittenberg, Research IT & Library
Patrick: Carly Strasser’s comments on what it meant to capture all these different things, archive them, and manage them as data. Other articles discussed capturing almost everything, but not integrating it well with the other data. You notice things that sound encouraging in terms of people using tools, but I didn’t see the extent to which there are best practices emerging. Could you tell a student: this is a good way to use these tools? I’m interested in workflows and best practices. The Utah resources article struck me for its focus on documenting IP for the purpose of patenting. This can be antithetical to other tools driven by the lens of open science. The fourth article opens up the possibility of your information being held hostage by hackers. Another issue: will I be able to get my data out of the app in 20 years? Will I need to work with my data very carefully and proactively to migrate it forward?
Stephanie: I didn’t see a lot of specific use cases in the article. I’m coming from archaeology, where there's a lot of individual field note taking that you want to draw back to an international project. Some of the best practices are going to be very specific to the domain. These paper notebooks are also part of the larger data management problem. Metadata is a term unfamiliar to researchers; I didn’t think of that until I joined CDL.
Ron: We’re not using lab notebooks right now.
Susan: Within the field, there are a lot of sub-disciplines. Everyone’s a special snowflake. I’m a phonetician; we do a lot of lab work and field work. There are also sub-disciplines that don’t really need it. A lot of the work they do is theoretical. Maybe a paper notebook, a series of Word documents, some electronic notebooks.
Ron: We do have some folks with old Word docs that can’t be opened after 20 years.
Patrick: is there anyone who’s playing with this environment where you combine notes and computation?
Ron: so far, I’m the only one in the department working on it. At best we have some lab documentation on a wiki that describes some methods.
John: Raymond Yee and I used IPython notebooks for documenting the Solr APIs for the FSM and Hack the Hearst hackathons. It was really helpful.
Patrick: when you were doing it, did you just think of it as API documentation, or as the basis for some notebook workflow? Like something that would help them capture their process as they built their app?
John: you’d really only use the notebook for illustrating bits of code that you would add to your app later.
Patrick: True, the hackathon process doesn’t involve a lot of notetaking
John: if you’re trying to get outputs from an API, a notebook is a great way to capture, share, and maintain your research. But there are so many other applications you’ll have to write code for, where that won’t help. If you’re trying to build a web app, or a product...it’s not built for that. You’d work in an IDE or something.
Jamie: I think there’s a role for lab notebooks as ancillary tools that work alongside executable workflows.
JISC-Oxford collab, myexperiment.org — several thousand users. It was very difficult to archive those projects. Combinations of pdfs, experimental workflows, trying to package those up and map the data was a big challenge to the archives.
Patrick: why aren’t people adopting these tools? do they not want to be regimented?
Jason: what’s the value proposition of these tools? That changes from experimental to observational sciences, perhaps. Is that strong enough to pull me out of my comfort realm?
Patrick: the value proposition is around things that are hard to measure, or things that are uncertain. Your lab notebooks “might” get lost, etc.
Aaron: there aren’t a lot of people talking about the downsides of being the first person to adopt. If you get into a jam, who’s going to help you? BIDS has the Hacker Within, a community around the tools people are trying to adopt. So you can help yourself, but there’s a barrier between you and them if they’re not willing to adopt.
Stephen: What about an export function?
Patrick: Being able to export is like insurance. It makes you feel good and safe. But when you actually try to get the data out, it’s another issue entirely. We recently tried moving things out of Google Code to JIRA…
Aaron: and then once you export data, you have to export the culture into the new tool too.
Patrick: for some, I think there’s a reasoned resistance. Maybe the argument is really unconvincing for folks who aren’t into tools.
John: the drumbeat of “everyone must learn to code” is in this. Everyone has to learn how to use these wonderful tools to do their research. You can see how folks might feel resistant.
Rick: What is the intellectual property status of these notebooks (print vs. electronic) — before they become end-of-career archives?
Stephanie: some labs require you to leave your lab notebook at the site
Elliot: I haven’t been asked about these tools by practitioners. I haven’t made any recommendations. For incoming molecular cell biology students this year, I gave them a list of electronic lab notebooks that are either free or free for a trial period. I didn’t recommend or discourage the use of any of them. The departments I support haven’t asked about them.
Patrick: it seems to vary by domain. Searchability, discoverability, and indexing are things we’ve grown used to in email and desktop search. Why are they not a necessary condition for all research environments? Maybe it’s the work environment...historians using Zotero as a “lab notebook”.
Quinn: in some disciplines, people don’t necessarily think about workflows and methods...
Patrick: is email just adopted because it’s provided centrally?
Steve: I wonder if some people see a proliferation of features as a problem, and not a convenience. Email has very little structure. A notebook has very little structure. Evernote is really simple. IPython, or something that deals with regulations or protects human subjects, starts to have a great deal of structure around it, which some people might feel as a constraint.
Susan: tools that are universally adopted...email is something that is used every single day. Perhaps the reasons some disciplines are adopting this...they’re in the lab every day, and they need those notes. If I’m in the lab, if I’m lucky, I’m there 2 days a week tops. Sometimes I just need to remember what happened and record afterwards.
Patrick: we do a lot of collaborative editing in Google Docs in our group. Not the best formatting, but the simultaneous collaboration makes the deal.
Jamie: I don’t think these electronic lab notebooks are just providing a digital analog.
LabArchives ELN “Chance favors the organized lab” — supposedly this service is also data compliant? Do they have to modify the notebook? Does it have different models for different funding agencies? As someone who works in RDM I’m skeptical about them staying current.
Susan: I use text edit documents for a lot of my files that I want to preserve. Even the advent of Unicode has changed my work environment and my archives. For example, earlier I had to write a lot in Thai and the encoding for that has changed.
Quinn: Matt Kirschenbaum recently published a book on the history of word processing.