Globus Makes Transfer of Scientific Data Easy
Globus is a new service provided by the HPC group of Scientific IT Services (ITS SIS).
Globus: as a Zurich native, I first associate this brand name with a department store on Bahnhofstrasse. In the world of research computing, Globus stands for a data management service that lets you transfer and manage terabytes of data easily around the globe.
In a nutshell, Globus has been developed to move scientific data between data pools located at ETH or at other institutions around the world. With Globus you can:
- Transfer files: From kilobytes to petabytes, with Globus you can efficiently, reliably, and securely move data between systems within ETH Zurich or across an ocean.
- Share files with others: All you need is an email address to share data with colleagues; Globus manages authentication and access. You can also share data publicly.
- Develop applications and gateways: REST APIs and a Python SDK empower you to create an integrated ecosystem of research data services, applications, and workflows.
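To give a flavour of the REST interface mentioned in the last point, here is a minimal sketch of the JSON document that describes a transfer request. It uses only the Python standard library and sends nothing over the network. The field names follow the Globus Transfer API as we understand it and should be checked against the current documentation; the function name, endpoint IDs, submission ID, and paths are hypothetical placeholders.

```python
import json


def build_transfer_document(submission_id, source_endpoint,
                            destination_endpoint, items, label=None):
    """Build the JSON body of a transfer request.

    `items` is a list of (source_path, destination_path, recursive) tuples.
    Field names are assumed from the Globus Transfer API documentation.
    """
    document = {
        "DATA_TYPE": "transfer",
        "submission_id": submission_id,
        "source_endpoint": source_endpoint,
        "destination_endpoint": destination_endpoint,
        "DATA": [
            {
                "DATA_TYPE": "transfer_item",
                "source_path": src,
                "destination_path": dst,
                "recursive": recursive,
            }
            for src, dst, recursive in items
        ],
    }
    if label is not None:
        document["label"] = label
    return document


if __name__ == "__main__":
    # All identifiers below are made-up placeholders, not real endpoints.
    doc = build_transfer_document(
        submission_id="SUBMISSION-ID-PLACEHOLDER",
        source_endpoint="SOURCE-ENDPOINT-ID",
        destination_endpoint="DESTINATION-ENDPOINT-ID",
        items=[("/project/results/", "/archive/results/", True)],
        label="nightly results copy",
    )
    print(json.dumps(doc, indent=2))
```

In practice you would rarely assemble this document by hand: the Python SDK offers higher-level helpers for the same request and also takes care of authentication.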
Globus is widely used by high performance computing (HPC) providers. The reason is simple: HPC usually involves large data sets. Data also often needs to be exchanged between HPC sites, either by users who have accounts at several sites or within collaborative research projects.
Globus stems from the time when cloud was called grid
The Globus service makes use of gridFTP, which is based on good old FTP technology. In fact, it could be described as an FTP service on steroids. Its advanced features include parallelisation of the data streams and an encrypted authentication channel based on cryptographic certificates.
Originally, gridFTP was part of the open-source Globus Toolkit, the Swiss Army knife of grid computing. Grid is nowadays called cloud and has become usable for digital natives. The biggest obstacle to the success of this legacy toolkit, and hence of gridFTP, was the cumbersome certificate handling. Nevertheless, gridFTP itself has become a widely used tool for transferring huge amounts of data.
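The idea behind parallel data streams can be illustrated with a toy sketch: a file is divided into byte ranges, and each range travels over its own stream, so the ranges can arrive in any order and still be written to the right place. The code below is not how gridFTP is implemented; it copies an in-memory buffer with threads standing in for network streams, and the function names are our own.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_ranges(total_bytes, streams):
    """Split [0, total_bytes) into at most `streams` (offset, length) ranges."""
    if total_bytes < 0 or streams < 1:
        raise ValueError("total_bytes must be >= 0 and streams >= 1")
    base, extra = divmod(total_bytes, streams)
    ranges = []
    offset = 0
    for i in range(streams):
        # The first `extra` ranges carry one additional byte each.
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue  # fewer bytes than streams: skip empty ranges
        ranges.append((offset, length))
        offset += length
    return ranges


def parallel_copy(source, streams=4):
    """Copy `source` (bytes) range by range, one worker thread per range."""
    target = bytearray(len(source))

    def send(byte_range):
        offset, length = byte_range
        # Each worker writes only its own, non-overlapping slice.
        target[offset:offset + length] = source[offset:offset + length]

    with ThreadPoolExecutor(max_workers=streams) as pool:
        list(pool.map(send, split_into_ranges(len(source), streams)))
    return bytes(target)


if __name__ == "__main__":
    print(split_into_ranges(10, 4))
    print(parallel_copy(b"scientific data", streams=3))
```

Because every range carries its own offset, a slow stream delays only its own part of the file rather than the whole transfer.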
Massive file transfer as a service
The main usability problems with gridFTP were twofold. First, managing certificates, especially so-called short-lived certificates, was tedious, even for grid professionals. Second, a graphical user interface was missing.
At this point our IT services colleagues from the University of Chicago took over. They built a system that makes certificate handling, and thus user authentication, transparent and easy to use for research institutions worldwide. To achieve this goal they make use of tools provided by the Internet2 Network. That means that you can use your ETH Zurich credentials to log in to Globus, in the same way as you log in to many of our web pages.
The Globus service includes a web-based file browser, which makes it easy to copy large amounts of data directly from site A to site B while you are sitting at your notebook or workstation.
Globus subscription for ETH Zurich
Maintaining the Globus service doesn’t come for free. Therefore the University of Chicago has decided to enable the more advanced services for subscribing institutions only.
To offer HPC users the full potential of the Globus service, the HPC group of Scientific IT Services acquired a subscription that is valid for the whole of ETH Zurich. That means users can simply use Globus with their ETH user name and password.
Departments, institutes, or other units that would like to connect their own storage to Globus can obtain a paid sub-subscription from the HPC team. For more information please contact firstname.lastname@example.org.
How to get started with Globus
To take a first glimpse of Globus, point your browser to http://app.globus.org, select ETH Zurich from the drop-down menu, and log in with your ETH user name and password. In the file browser that opens you can search for ETH Zurich or for Euler. The Euler services will be established in the coming weeks. To stay informed, please subscribe to the hpc-globus mailing list.
Instructions on how to use Globus on Euler will be available soon on scicomp.ethz.ch.