
UC Berkeley's Production Control Shared Services Center (PCSSC) automates workloads, taking manual tasks and converting them to efficient, automated tasks utilizing GoAnywhere Managed File Transfer software scheduled with Control-M. The unit does this work for UCB, UCOP, UCSF, UCPath, and UCSC -- a true UC Shared Service. PCSSC's Laurie Graham will explain the tools used for data transfer, and we'll discuss how they can be applied to research workflows.

When: Thursday, 24 August from 12 - 1pm
Where: 200C Warren Hall, 2195 Hearst St (see building access instructions on parent page).
What: GoAnywhere Managed File Transfer and Control-M Workload Automation for research data transfer
Presenting: Laurie Graham (PCSSC)

Prior to the meeting, please review:



Presenting: Laurie Graham, Jeff Foster (PCSSC)

Attending:

Aaron Culich, Research IT
D Ross, UCSF*
Deb McCaffrey, Research IT
Jason Christopher, Research IT
Jenn Stringer, ETS / Research IT / CTL -- "RTL"
John Lowe, Research IT*
Krishna Muriki, LBNL & BRC*
Patrick Schmitz, Research IT
Perry Willett, CDL
Quinn Dombrowski, Research IT
Rhett Hillary, UCSF*
Rick Jaffe, Research IT
Ron Sprouse, Linguistics
Steve Masover, Research IT

* via Zoom

[see slides]

Control-M allows building a workflow that includes data transfers -- with predecessor and successor jobs. xMatters communicates issues to job owners/sponsors, and can accept some responses that branch the workflow. PCSSC has Control-M agents running on multiple platforms: RedHat, SUSE, and Solaris; Linux flavors on AWS are coming (for UCSC). The campus licensing agreement allows for as many agents as wanted.
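
To make the predecessor/successor idea concrete, here is a minimal sketch in plain Python. It is not Control-M's job-definition format or API; the job names and the `run_order` helper are hypothetical, and the only point is that each job runs after the jobs it depends on.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each job maps to the set of its predecessor jobs.
workflow = {
    "extract": set(),
    "transfer": {"extract"},
    "verify": {"transfer"},
    "notify": {"verify"},
}

def run_order(jobs):
    """Return job names in an order that respects every predecessor."""
    return list(TopologicalSorter(jobs).static_order())

order = run_order(workflow)
print(order)
```

A scheduler built this way would refuse a workflow whose dependencies form a loop: `static_order()` raises `graphlib.CycleError` in that case.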


DISCUSSION:

Jenn asks about business model: recharge? Laurie: cost of unit allocated among campus partners, based on assessment of unit costs / average.

Patrick: How to handle login as a user?
Jeff: Control-M owns workflows, logs in as itself.
Patrick: Would require custom agent for every user.
Jeff: Could run a script...

Rick: GridFTP?
Jeff: Not included in GoAnywhere supported protocols, but they're quite open to requests.

Patrick: Where does encryption happen?
Jeff: Customers can encrypt/decrypt at the endpoint. Or, GoAnywhere can use sender keys to decrypt, then re-encrypt for the recipient endpoint using their keys; this happens in a temporary space, so there's no data at rest to worry about.
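
The decrypt-then-re-encrypt pattern Jeff describes can be sketched roughly as follows. This is an illustration only, not GoAnywhere's implementation: `relay` and `toy_transform` are hypothetical names, and the XOR transform is a stand-in used to keep the sketch self-contained. It is NOT real encryption.

```python
import tempfile
from pathlib import Path

def relay(ciphertext, decrypt_from_sender, encrypt_for_recipient):
    """Decrypt in a scratch directory, re-encrypt, then discard the scratch copy."""
    with tempfile.TemporaryDirectory() as scratch:
        staged = Path(scratch) / "payload"
        # Plaintext exists only inside the temporary directory.
        staged.write_bytes(decrypt_from_sender(ciphertext))
        result = encrypt_for_recipient(staged.read_bytes())
    # The scratch directory and the plaintext file are deleted at this point.
    return result

def toy_transform(key):
    """Placeholder for a real cipher: XOR every byte with one key byte."""
    return lambda data: bytes(b ^ key for b in data)

sender = toy_transform(0x21)      # hypothetical sender key
recipient = toy_transform(0x42)   # hypothetical recipient key
outgoing = relay(sender(b"research data"), sender, recipient)
```

In a real deployment the two callables would wrap whatever key-based encryption the endpoints have agreed on.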

Jenn: file size limits?
Quinn: 13 TB? 24 TB? These came up in use cases this week.
Laurie: Files that big would overwhelm our system; there's not enough temp space to handle them. Can't give a maximum off the top of my head; the limit is the size of the temp space.
Patrick: encryption streaming or no?
Jeff: not sure
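
Since the stated limit is the size of the temp space, a pre-flight check along these lines could flag transfers that will not fit. This is a sketch under assumptions: `fits_in_temp` and the 10% headroom are hypothetical, and it checks the local system temp directory rather than anything specific to GoAnywhere.

```python
import shutil
import tempfile

def fits_in_temp(size_bytes, temp_dir=None, headroom=0.1):
    """Return True if a payload of size_bytes fits in the temp space.

    headroom is the fraction of free space kept in reserve (assumed value).
    """
    temp_dir = temp_dir or tempfile.gettempdir()
    free = shutil.disk_usage(temp_dir).free
    return size_bytes <= free * (1 - headroom)

# Example: would a 13 TB transfer fit on this machine?
print(fits_in_temp(13 * 10**12))
```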

Rick: Terabytes to Box or Drive use case. Two dozen cases in the last year. Generally many files, even millions.
Jeff: User interface -- ad hoc -- might work.
Patrick: What's wanted is a fire and forget model. Control-M plus rsync?

Aaron/Patrick: using this technology as an orchestration agent to spin up VMs
Laurie: Daniel (?) has been working on something similar. Will investigate.

Patrick: Onboarding new customers?
Laurie: Simple. Open a ticket.
Patrick: Costing based on time staff spends on it?
Laurie: Yes. To estimate we should talk in more detail. Probably not much.
Patrick: We'd want to define reusable workflows, so we don't have to define per customer or per fine-grained use case. Including figuring out how to handle AuthN for multiple user accounts without defining a workflow for each.

[...]



