UC Berkeley's Production Control Shared Services Center (PCSSC) automates workloads, converting manual tasks into efficient, automated ones using GoAnywhere Managed File Transfer software scheduled with Control-M. The unit does this work for UCB, UCOP, UCSF, UCPath, and UCSC -- a true UC Shared Service. PCSSC's Laurie Graham will explain the tools used for data transfer, and we'll discuss how they can be applied to research workflows.
When: Thursday, 24 August from 12 - 1pm
Where: 200C Warren Hall, 2195 Hearst St (see building access instructions on parent page).
What: GoAnywhere Managed File Transfer and Control-M Workload Automation for research data transfer
Presenting: Laurie Graham (PCSSC)
Prior to the meeting, please review:
Presenting: Laurie Graham, Jeff Foster (PCSSC)
Aaron Culich, Research IT
D Ross, UCSF*
Deb McCaffrey, Research IT
Jason Christopher, Research IT
Jenn Stringer, ETS / Research IT / CTL -- "RTL"
John Lowe, Research IT*
Krishna Muriki, LBNL & BRC*
Patrick Schmitz, Research IT
Perry Willett, CDL
Quinn Dombrowski, Research IT
Rhett Hillary, UCSF*
Rick Jaffe, Research IT
Ron Sprouse, Linguistics
Steve Masover, Research IT
* via Zoom
Control-M allows building a workflow that includes data transfers -- with predecessor and successor jobs. xMatters communicates issues to job owners/sponsors, and can accept some responses that branch the workflow. Control-M agents run on multiple platforms: RedHat, SUSE, and Solaris; Linux flavors on AWS are coming (for UCSC). The campus licensing agreement allows as many agents as wanted.
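Note-taker's illustration (not from the presentation): the predecessor/successor idea can be sketched in a few lines of Python. This is not Control-M syntax; the job names are hypothetical, and the standard-library `graphlib` module (Python 3.9+) stands in for the scheduler's dependency handling.

```python
from graphlib import TopologicalSorter

# Hypothetical job names. Each job maps to the set of predecessor jobs
# that must finish before it is allowed to start.
JOBS = {
    "export_data": set(),
    "transfer_files": {"export_data"},
    "verify_checksums": {"transfer_files"},
    "notify_owner": {"verify_checksums"},
}


def run_order(jobs):
    """Return job names in an order that respects every predecessor link."""
    return list(TopologicalSorter(jobs).static_order())


if __name__ == "__main__":
    for name in run_order(JOBS):
        print("would run:", name)
```

A real scheduler also tracks job status, retries, and alerts; the sketch covers ordering only.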
Jenn asks about business model: recharge? Laurie: cost of unit allocated among campus partners, based on assessment of unit costs / average.
Patrick: How to handle login as a user?
Jeff: Control-M owns workflows, logs in as itself.
Patrick: Would require custom agent for every user.
Jeff: Could run a script...
Jeff: Not included in GoAnywhere supported protocols, but they're quite open to requests.
Patrick: Where does encryption happen?
Jeff: Customers can encrypt/decrypt at the endpoint. Or, GoAnywhere can use the sender's keys to decrypt, then re-encrypt for the recipient endpoint using their keys; this happens in a temporary space, so there's no data at rest to worry about.
Jenn: file size limits?
Quinn: 13 TB? 24 TB? These came up in use cases this week.
Laurie: Transfers that big would overwhelm our system; there's not enough temp space. Can't give a max quantity off the top of my head; the limit is the size of the temp space.
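Note-taker's illustration (not from the presentation): since the stated limit is the size of the temp space, a pre-flight check along these lines could reject an oversized transfer before it starts. The function name and the 10% headroom are assumptions for the sketch; nothing here reflects how GoAnywhere itself checks space.

```python
import shutil


def fits_in_temp_space(transfer_bytes, temp_dir, headroom=0.10):
    """Return True if the transfer fits in the free space under temp_dir.

    headroom is an assumed safety margin: the fraction of free space
    that is left unused rather than filled by the transfer.
    """
    usage = shutil.disk_usage(temp_dir)
    usable = usage.free * (1 - headroom)
    return transfer_bytes <= usable


if __name__ == "__main__":
    thirteen_tb = 13 * 10**12  # one of the sizes raised in the discussion
    print("13 TB fits:", fits_in_temp_space(thirteen_tb, "."))
```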
Patrick: encryption streaming or no?
Jeff: not sure
Rick: Terabytes to Box or Drive use case. About two dozen cases in the last year. Generally many files, even millions.
Jeff: User interface -- ad hoc -- might work.
Patrick: What's wanted is a fire and forget model. Control-M plus rsync?
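Note-taker's illustration (not from the presentation): one way a scheduler-invoked script might wrap rsync for a fire-and-forget transfer. The function names, retry count, and wait time are assumptions; `-a` and `--partial` are standard rsync options (archive mode, keep partially transferred files so a retry can resume).

```python
import subprocess
import time


def build_rsync_command(source, destination):
    """Build an rsync command line as an argument list."""
    return ["rsync", "-a", "--partial", source, destination]


def transfer_with_retries(source, destination, attempts=3, wait_seconds=60,
                          runner=subprocess.run):
    """Run rsync until it exits with code 0 or the attempts run out.

    runner defaults to subprocess.run; it is a parameter so the retry
    logic can be exercised without actually invoking rsync.
    """
    command = build_rsync_command(source, destination)
    for attempt in range(1, attempts + 1):
        result = runner(command)
        if result.returncode == 0:
            return True
        if attempt < attempts:
            time.sleep(wait_seconds)
    return False
```

The scheduler would then treat a `False` return (or a nonzero exit from the wrapping script) as the signal to alert the job owner.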
Aaron/Patrick: using this technology as an orchestration agent to spin up VMs
Laurie: Daniel (?) has been working on something similar. Will investigate.
Patrick: Onboarding new customers?
Laurie: Simple. Open a ticket.
Patrick: Costing based on time staff spends on it?
Laurie: Yes. To estimate we should talk in more detail. Probably not much.
Patrick: We'd want to define reusable workflows, so we don't have to define per customer or per fine-grained use case. Including figuring out how to handle AuthN for multiple user accounts without defining a workflow for each.
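Note-taker's illustration (not from the presentation): the reusable-workflow idea amounts to defining the steps once and filling in per-user parameters at run time. The class, field names, and step wording below are hypothetical and are not Control-M or GoAnywhere constructs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferRequest:
    """Per-user parameters for one run of the shared workflow."""
    account: str
    source: str
    destination: str


def expand_workflow(request):
    """Fill the single shared workflow template with one user's values."""
    return [
        f"authenticate as {request.account}",
        f"transfer {request.source} to {request.destination}",
        f"notify {request.account}",
    ]


if __name__ == "__main__":
    request = TransferRequest("alice", "/data/run1", "box:/archive/run1")
    for step in expand_workflow(request):
        print(step)
```

How the authentication step would obtain credentials for each account is the open question from the discussion and is not addressed by the sketch.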