When: Thursday, April 21, noon to 1pm
Where: 200C Warren Hall, 2195 Hearst St (see building access instructions on parent page).
Event format: The reading group is a brown bag lunch (bring your own). This session will be an open discussion based on the reading (no presentation).
Presenters: Shane Canon and Doug Jacobsen (NERSC)
Facilitator: Patrick Schmitz
Shifter is a software package that enables regular users to securely and efficiently run custom containers on shared HPC systems. Shifter can run images from a Docker registry and Docker Hub, but uses a custom run-time that integrates with typical HPC environments. Shifter enables users to realize many of the productivity benefits of Docker while working within the many constraints of a typical shared resource. This meeting's presentation will describe how Shifter works and some examples of how Shifter has enabled scientists to more easily use HPC resources at NERSC.
Following their presentation, Shane Canon and Doug Jacobsen -- members of the NERSC team that is developing, deploying, and testing Shifter, and authors of the paper linked below -- will lead a discussion about using Shifter to execute research workflows on HPC systems. Members of the Berkeley Research Computing staff who are exploring the possibility of using Shifter to run custom containers on Savio will also participate.
Please read prior to the meeting: Contain This, Unleashing Docker for HPC (PDF)
Presenting: Doug Jacobsen, NERSC [Note: Shane Canon had a last-minute obligation that prevented his attendance]
Facilitating: Patrick Schmitz
Aaron Culich, Research IT
Aron Roberts, Research IT
Barbara Gilson, SAIT
Bernard Li, BRC & LBNL
Bill Allison, CTO & IST-API
Chris Paciorek, Research IT
Greg Kurtzer, BRC & LBNL
Jason Christopher, Research IT
Jason Huff, CGRL
John Lowe, Research IT
John White, BRC & LBNL
Kelly Rowland, BRC & Nuclear Engineering
Krishna Muriki, BRC & LBNL
Larry Conrad, CIO (sitting in briefly toward end of presentation)
Quinn Dombrowski, Research IT
Patrick Schmitz, Research IT
Rachel Slaybaugh, Nuclear Engineering
Rick Jaffe, Research IT
Steve Masover, Research IT
Yong Qin, BRC & LBNL
Patrick: Strong interest in multiple modes of computation atop HPC resources. Experiments in progress.
Doug Jacobsen, presentation (see slides - PDF):
* Shifter, "environment containers"
* Motive: to run contained environments on the Cori (Cray) supercomputer. Complex software stacks -- "big software" on HPC ... LHC software stack example = 1 TB ... takes a lot of time to load onto standard Cori nodes; contained environments open another path.
* 'The job of a supercomputer is to turn a CPU-bound problem into an I/O-bound problem' ... but the fat-pipe networks currently available are shifting the rate-limiting factor in the other direction.
* Users want a consistent environment -- a desire to make the desktop ==> supercomputer transition more transparent
* A Lustre file system has a single metadata service, which means DLLs are unacceptably slow to load. Environment containers sidestep that problem by letting linking happen inside the contained environment on the node, with no need to invoke Lustre's metadata service
* Design goals for Shifter on Cori: user independence (DIY, no sys admin intervention needed and therefore no sys admin latency introduced); shared resource availability; integration with public image repositories ... fast startup time, "native" application execution performance
* You get to be root when you create your image; you run as yourself -- as a user -- on the supercomputer
* Cannot write to OS image while running. This is a requirement given the use of an archive file (squashfs) to contain the image filesystem and sharing that single file among multiple nodes. Metadata from the squashfs image gets read on mount and cached on each compute node that mounts it.
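The metadata point in the notes above can be made concrete with a toy counting model. This is only a sketch under stated assumptions, not a measurement of Cori or of Shifter: the function names and all the numbers (node count, processes per node, libraries per process, search directories) are hypothetical, and client-side caching is ignored. It compares how many lookups reach the shared metadata service when every process resolves its libraries on the shared file system with the case where each node looks up one image file and serves library lookups from the locally mounted image.

```python
def metadata_requests_direct(nodes, procs_per_node, libs_per_proc, search_dirs):
    """Worst-case lookups sent to the shared metadata service when every
    process probes every search directory for every library it loads."""
    return nodes * procs_per_node * libs_per_proc * search_dirs


def metadata_requests_image(nodes):
    """Lookups sent to the shared metadata service when each node only has
    to find one file (the image); library lookups then stay on the node."""
    return nodes


if __name__ == "__main__":
    # Hypothetical job size, chosen only for illustration.
    direct = metadata_requests_direct(
        nodes=100, procs_per_node=32, libs_per_proc=200, search_dirs=5
    )
    image = metadata_requests_image(nodes=100)
    print("direct:", direct)
    print("image: ", image)
```

Under these assumed numbers the shared service sees millions of lookups in the first case and one per node in the second, which is the kind of gap the presentation describes.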
Patrick: DJ has described a number of things that a user must understand about Shifter and the site's implementation -- beyond running a Docker image. How is this playing out in the real world with respect to supporting users?
DJ: Leaving aside MPI -- a special, advanced topic -- the problems we've been seeing have been more-or-less confined to users trying to run images that require root permission to read files inside the container. Not complex, fixable.
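One way a user might check for this class of problem before submitting a job is to scan an unpacked copy of the image for paths that only root (or the owner) can read. The sketch below is not part of Shifter; the function name is hypothetical, and it makes the simplifying assumption that the job will run as a user who is neither the owner nor in the group of the files, so only the "other" permission bits matter.

```python
import os
import stat


def find_unreadable_paths(image_root):
    """Return paths under image_root that lack "other" read access.

    Files need the other-read bit; directories need other-read and
    other-execute so they can be listed and traversed.
    """
    problems = []
    for dirpath, dirnames, filenames in os.walk(image_root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode
            if stat.S_ISLNK(mode):
                continue  # the link target is checked on its own, if inside the tree
            if stat.S_ISDIR(mode):
                needed = stat.S_IROTH | stat.S_IXOTH
            else:
                needed = stat.S_IROTH
            if mode & needed != needed:
                problems.append(path)
    return sorted(problems)
```

Anything this reports can usually be fixed when the image is built, for example by adjusting permissions in the build step, which matches the "not complex, fixable" assessment above.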
Patrick: How many beyond NERSC are using this?
DJ: 3 sites have it going beyond NERSC and CSCS. Cray is building a product out of it and installing, not sure who / how many are using it. Requests coming in to join mailing list at ~3/week (for this week, at least). A small number of very enthusiastic users.
Patrick: Support of user defined images? Could say -- you broke it you fix it ... but should you?
DJ: Could do for images that run on a single node. For MPI you probably don't want to do that. Looking forward: definition of a dividing line between what the site is responsible for and what the user must handle.
DJ: Cray is using Shifter to run Spark on their clusters
Patrick: In situations (e.g., ATLAS) where results are acceptable only if run on a certified stack, is there any issue with what Shifter does to tear apart and rebuild the stack?
DJ: Not aware of any, but folks are still working out the workflow for building certified stacks/images.