The National Science Foundation (NSF) has issued a “Dear Colleague Letter”, Requesting Information on Future Needs for Advanced Cyberinfrastructure to Support Science and Engineering Research (see call here). They are looking for input that will “inform the Foundation's strategy and plans for an advanced cyberinfrastructure that will enable the frontiers of science and engineering to continue to advance over the next decade and beyond.”
Research IT and CITRIS/PRP invite members of the campus community who want to provide input on this request to a Reading Group session to identify key themes and perspectives. We will draw on this input to prepare at least one response to the NSF letter (others can submit their own responses as well).
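At the reading group, we will discuss the three central questions in the Letter: (1) Research Challenge(s); (2) Cyberinfrastructure Needed to Address the Research Challenge(s); and (3) Other considerations (optional).
Please come with ideas, suggestions, and issues to share and discuss!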
When: Thursday, Feb 9, 2017 from 12 - 1pm
Where: 200C Warren Hall, 2195 Hearst St (see building access instructions on parent page).
What: How should NSF support cyberinfrastructure for the next decade?
Facilitating: Patrick Schmitz, Research IT; Camille Crittenden, CITRIS; (others TBA)
Prior to the meeting, please review:
Presenting: Patrick Schmitz, Research IT; Camille Crittenden, CITRIS
Aaron Culich, Research IT
Aron Roberts, Research IT
Barbara Gilson, SAIT
Bill Allison, IST-API and Campus CTO
Chris Hoffman, Research IT
Colin Baker, International Computer Science Institute
David Greenbaum, Research IT
David Trinkle, VCRO
Deb McCaffrey, Research IT
Jason Christopher, Research IT
Jenn Stringer, ETS
John Lowe, Research IT & Linguistics
Maurice Manning, Research IT
Quinn Dombrowski, Research IT
Rick Jaffe, Research IT
Ryan Lovett, SCF
Steve Masover, Research IT
Tom DeFanti, CalIT2 SDSC
??, International Computer Science Institute
Next time (in two weeks): Non-consumptive research over large data sets
Patrick: These letters really do shape the NSF's funding.
From the NSF letter:
* Question 1 (maximum 1200 words) – Research Challenge(s). Describe current or emerging science or engineering research challenge(s), providing context in terms of recent research activities and standing questions in the field.
* Question 2 (maximum 1200 words) – Cyberinfrastructure Needed to Address the Research Challenge(s). Describe any limitations or absence of existing cyberinfrastructure, and/or specific technical advancements in cyberinfrastructure (e.g. advanced computing, data infrastructure, software infrastructure, applications, networking, cybersecurity), that must be addressed to accomplish the identified research challenge(s).
* Question 3 (maximum 1200 words, optional) – Other considerations. Any other relevant aspects, such as organization, process, learning and workforce development, access, and sustainability, that need to be addressed; or any other issues that NSF should consider.
Patrick: As providers of infrastructure and consulting on the Berkeley campus, it would be a good thing to collate our response to this letter. Noteworthy that there will be a discussion at CASC of this letter, and thus a response from CASC.
Camille: CITRIS is convening a group of faculty to discuss and respond to this letter.
David Trinkle: Open to discussion of who ought to send a letter growing out of this discussion (or others) on the UCB campus, if there's some way to lend it additional credibility or weight in the NSF's consideration of the response.
Patrick: Updated CI Plan for our campus might also be an outcome of this discussion.
Ryan Lovett: Can comment on how this response can affect a unit like SCF. In the past we have applied to NSF for equipment funding (nodes, storage, etc.). I don't think SCF should be going after or be awarded infrastructure-type grants to purchase our own infrastructure. Pie in the sky: faculty should apply for some level of compute and be able to find that in a cloud-agnostic portal. Not sure how NSF would realize this. You could say that part of what I do is a waste: we've been doing the work we do for 20 years now, in significant part because there was no unit constituted like Research IT in past years. I'm not even sure that the infrastructure should be located/maintained on the campus. NSF should think about the optimal experience for researchers.
Patrick: How much of CI is hardware, and how much is people to help researchers make use of that hardware, wherever it is located?
Ryan: Right. Not feasible to provide the capacity that is demanded by large classes, etc. Role is to facilitate and consult; officially I'm a Unix sys admin, but that's not what I've done for years. Shift in what researchers need is to engage with people. Not all of that is necessarily local, though presumably we know locally what our own researchers need.
Camille/John L/Patrick: workforce development, reskilling, where do we get and train the people we need. A big topic when talking to NSF Program Officers. This could be stated more clearly in the Dear Colleague letter and elsewhere in NSF's plans.
Jenn: Building connections and alliances with organizations on the campus that may not have been part of the CI landscape ten or fifteen years ago. The Library, for example. What is the development piece that gives organizations across the campus an incentive to work together?
Tom D: NSF thinks of workforce development as how people are trained in community colleges and 4-year programs. People are the biggest cost in any university endeavor. A poor idea to charge overhead on services -- we should be *encouraging* people to use cloud resources, not discouraging them by charging overhead to their grants.
Maurice: If NSF keeps throwing *only* CPUs and storage at campuses, that's probably a mistake. It would be much more efficient to teach researchers to use existing CI more effectively.
Tom D/Patrick: NSF serves scientists who can soak up any amount of computation they're allowed to. This doesn't always acknowledge (or acknowledge fully or meaningfully) the so-called "long tail" of researchers whose needs are less gargantuan.
David G: Part of the context of what's going on in the Office of CI at NSF is limits on the amount of money, and tensions between investments at the national (XSEDE) level vs. at campus levels. Also, tension at NSF between focusing on advanced computation vs. facilitation of the use of computation. Facilitation is not necessarily a campus investment.
Aaron: XSEDE is recognizing the difference between the hardware/infrastructure level and the people level. Decoupling the Campus Champion / facilitation role from XSEDE infrastructure. One issue: very wide variance in the types of people and roles who fill the Campus Champion positions.
Tom D: The tension David G has described is the same as that between EECS faculty/departments and Research IT units. Read letter from [an EECS faculty member] describing the uselessness of cloud GPU resources as data centers (campus and commercial cloud) provide them -- commodity cards in servers (Titan X for example) priced proportionately. Tom D described a 30:1 price ratio between what a campus could theoretically provide in a data center and what Amazon charges for like resources. But in actual fact, have encountered data center staff who regard a request for commodity cards in the data center as serving "only one community," a non-starter. So we're going to do this on an experimental basis, doesn't matter where the boxes are running, e.g., we'll install a few such servers at Santa Cruz, which is open to the possibility.
Patrick: Library, curation, deciding what storage requests ought to be refused.
Tom D: Ceph -- deals with deduplication as one copies around.
Patrick: Data locality. Access. Moving data to compute, or compute to data. Do what the network people suggest: mount the storage and stream it to compute?
David G: Possible to ask researchers to think from two points of view: from the POV of their own research, and from an organizational point of view -- what should your institution or the US as a whole do? Including on the facilities/infrastructure vs. people questions.
David T: Worth thinking about how there are going to be many institutions -- not research universities at Berkeley's scale -- asking for more hardware. Might be worth considering how our response ought to consciously articulate needs specific to institutions like ours.
Rick/Patrick: A need here. Because we don't have a medical school (which elsewhere drives the development of staff who can handle the intricacies of data security requirements), we need an alternative driver for secure data management.
Chris H: Love Your Data week event on Tuesday at BIDS re: securing research data.
Aaron: awareness of how API keys get exposed in version-controlled (and publicly released) code -- a problem that happens and can seriously breach security.
Chris H: Securely managing data touches policy, touches monitoring of network traffic -- things that are managed in silos on the campus.
Tom D: A problem in funding silos as well.
Patrick: Which leads to "reinnovation" ...