
When: Thursday, March 24 from noon - 1pm
Where: 200C Warren Hall, 2195 Hearst St (see building access instructions on parent page).
Event format: The reading group is a brown bag lunch (bring your own). This session will be an open discussion based on the reading (no presentation).


Facilitator: TBD

====

A two-day Knowledge Exchange workshop in Berlin in October 2015 has resulted in the recent publication of the report: Research Software Sustainability: Report on Knowledge Exchange workshop (PDF). We will discuss the findings and recommendations in this report, comparing them with the experience that Berkeley staff and researchers have had, and are having, in research-oriented software projects, from the domain-specific to broad consortially supported efforts.

There will not be a presentation at this meeting. Please review the report (linked again below) prior to the meeting:

We'll proceed with the discussion assuming that participants have familiarized themselves with the report.

 

Attending: Patrick, Chris, Dav Clark, Jason Christopher, Steven Carrier, Chris Paciorek, Larry Conrad, Aron Roberts, Camille Villa, Barbara Gilson (SAIT).

Patrick's prepared questions:

• SW is not data, and preservation will provide a kind of reproducibility, but not re-use and not a reproducibility that is relevant to a changing landscape of dependent factors. When do we need preservation, and when do we need actual sustained SW that is kept current and usable/relevant?
    • Docker encapsulation of climate codes, to insulate very brittle SW from changing substrates and libraries. Band-aid or not?
• Given key recommendations (p11), what role do various groups on campus have to achieve those recommendations?
• What about the societal barriers? Can we address any of them?
    • Lack of awareness of SW role in research
    • Lack of citing
    • Lack of understanding of licensing and ownership (caution about reuse)
    • Lack of skills/training (especially as SW gets more complex)
    • Lack of career paths for SW experts (cf. climate SW and lack of any incentives to rewrite it)
    • Gender balance, diversity issues (in SW versus research broadly). How does this matter?
• Will SW management plans help?
• Should someone create an SMP tool like the DMP tool?

 

Links from Aaron Culich re: activity in this space that might be helpful:

 

https://github.com/softwaresaved

https://github.com/softwaresaved/software-management-plans

http://software-carpentry.org/blog/2015/12/ssi-funding.html

http://www.software.ac.uk/

http://wssspe.researchcomputing.org.uk/

https://datascience.nih.gov/bd2k/about/working-groups

http://www.softwarediscoveryindex.org/

Patrick introduces the report.

 

A number of campus discussions about reproducibility. US effort WSSSPE (Working towards Sustainable Software for Science: Practice and Experiences: http://wssspe.researchcomputing.org.uk/). Several of us have been involved...Aaron Culich, Dav, Chris Paciorek, ...

Patrick: What do others think of the issues brought up, and the recommendations? Software ..., citation, ...  (Domain expertise, computer expertise)

Dav: Chris P - is Patrick describing your role in stats?

Chris P: To some degree. But I have a fairly unique position.

Chris H: Programmers in Research IT have gone back to work for museums. Gives two cases. Not clear after grant is over what to do with software, programmers. Units came to IST to support software, programmers, duties. No separation between the various concerns. Having internal websites not be on production systems. Didn't happen until there were security issues.

Steven: My understanding is that people don't just hand off software. I have been responsible for one piece of software that I finally turned off due to security issues. No credit to sys admins who take care, show concern, in this area.

Chris H: Couldn't demonstrate to units that we could provide software in a high-quality way that met the domain needs. 

Patrick: We were unique among R1 institutions in having a central group supporting museums. Knowledge Exchange proposed plan for Software Management Plans, which would demonstrate fundamental recognition by funders of importance of software. Does anyone see anything like that happening? Mellon reusability workshops 6-7 years ago. Exact same discussion. Funders would need to put incentives in place to re-use software. Instead, funders generously fund innovation.

Chris P: If researchers can find existing software, they're happy to use it. 

Patrick: If turnkey, yes. If it requires integration, ongoing maintenance, no.

Chris H: to Dav - is this different with advent of GitHub?

Dav: Patrick is dead on. Though some people are happy to work outside of funding, put work on GitHub. New model developing there. Nobody making it to tenure track.

Rick: Fernando/Jupyter Notebook as model (but not success story)?

Dav: Unicorn. Mostly built by grad students.

Patrick: Fernando couldn't get a faculty appointment on campus.

Patrick: Climate researchers. Brittle software, written in Fortran (?), breaks on software upgrades. Terrified of upgrades and patches that keep system secure. Could stop research in its tracks. No one in the domain has interest in updating the software. It would take them away from work that would advance them. Funders not willing to fund that work.

Chris P: You would think that DOE and others would be interested.

Patrick: But they're not. Giant problem.

Chris H: With emulation environments, does the problem of security diminish?

Patrick: Software is not data. Software preservation is different from data preservation. If you don't maintain it, you can't use it with modern tools. If you can containerize it to avoid security holes (non-trivial), you still haven't made it work with new software, etc. What have you accomplished?

Chris H: In re-writing, how can you prove that you haven't changed its outputs [vis-a-vis] existing research results?

Dav: Software Carpentry, BIDS, aim to train researchers in computational skills. That's happening. Slowly. A model to career path?

Patrick: Big difference between training domain experts to write minimal functions and having real IT support. Large Hadron Collider (?) is the rare project that included full IT support.

Chris P: Tenure argument is a red herring.

Patrick: 20 years ago, books. Credit given to scholar. Today, databases. Not cited, not citable. Change in recognition of scholarship.

Two questions: Should you get tenure? Should there be clear career path? Big difference between developing a production-ready tool that impacts scholarship, and maintaining a production system.

Back to climate models, do we just have to let the generational change occur? Allow the code to die before new code can be written?

Dav: Climate example is non-standard. Real-world consequences of being inefficient. Big projects institutionalized. Less subject to turnover. Neuroscience tools are similar to climate model. But new approaches are beginning to replace old ones. Fernando came to campus to work in Brain Imaging Center. One of the early tools: a simple i/o tool that allowed Python to read brain image data. As these types of tools propagate, it becomes easier for grad students to wire them together. Similarly, Ariel Rokem, time-series code. Cutting-edge analyses.

Patrick: So is data analysis replacing [coding]?

Dav: In PDE [Partial Differential Equations], there is work toward full-blown tools with nice interfaces.

Patrick: Researchers say that the model is separate from their research. They just need the models to remain stable so they can compare their results against earlier ones.

Steven: Sounds like bad science to me.

Patrick: Researchers are not unconcerned. Rather, they are unable to change the model, so they focus elsewhere.

Chris P: Human condition.

Chris H: Steven, what's the condition in your department?

Steven: There is some old software written by a past PI. Statistical analysis done on item response (questionnaire answers). Politically (face-saving, protecting feelings), important to come up with same answer as old software did.

Chris P: My sense in econ is that students are using existing software.

Patrick [reviews key findings in report]: What could people on campus do to advance these recommendations, practically (aside from lobbying funders)?

Chris P: Some items are being addressed by the orgs that Dav mentioned (Software Carpentry, etc.)

Dav: Work of building a standard practice among consultants. BIDS is a good choice of locale for that.

Aiming at grad students and post-docs. 

Patrick: What about proposals 1 & 2?

Chris H: Article pointed out rise of RDM, some history there to model software awareness on. RDM can ask, "What are you doing about your software?" A lot of times, people don't think about that. A data management plan raises data questions, even if only minimally filled out.

Patrick: Clear evolution of understanding, clarity re: role of data. Now, it must be provided, shared, able to be re-used.

Steven: I wonder if a sustainability plan might cover what happens when you lose interest or shut it down.

Patrick: Recognize life-cycle of software.

Steven: Or make case that software will be preserved if it's valuable.

Patrick: Brutally difficult to make claim about preserving software in funding requests.

Rick: How do Knowledge Exchange proposals relate to research done on proprietary software (thinking of Philip Stark talking about problems with Excel; SPSS, SAS, etc.)? Aren't these proposals implicating big software firms, perhaps in a bad light?

Patrick: Difficult business model in supporting software with small market, e.g., museum software.

Chris H: CSpace, ArchivesSpace, Lyrasis.

Chris P: Back to what we can do: Maybe mandate of consultants should be widened to include software.

Patrick: Dream of providing software architecture support [for sustainable design/tools].

Chris P: Launch Research Software Management program?

[1 o'clock comes and goes. Discussion continues.]

Dav: Scalability, portability, sustainability - marketing problem to raise awareness of these difficult things.

Chris P: Write one-offs. Don't know which I'll ever use again.

Patrick: Important to focus on software that is proven to be needed.

Jason: Discussion at Goldman re: research arc. Eye-opening to PI.  
