Cloud Proof of Concept Project

SyBIT is engaged in a proof-of-concept project in academic cloud computing together with the ETH, UZH and SWITCH. We provide several of the use cases to the project. The motivation is SyBIT's sustainability effort: all the services and tools that we provide could be hosted in an academic cloud for further use beyond SystemsX. In the coming years we would also profit greatly from an academic cloud infrastructure for servers and collaboration tools for our projects.

The project is called Academic Cloud Computing and Provisioning and is funded by a SWITCH-AAA grant. Three testbeds are being set up as part of the project: a commercial ‘private cloud appliance’ from HP called ‘HP CloudSystem Matrix’, which is already operational at the ETH; a self-built private cloud, also at the ETH; and a self-built private cloud at the UZH. In addition, SWITCH is building its own cloud testbed. All of these are built such that we can run complementary experiments on them, for both the infrastructure and the application components.

The choice between the many cloud stacks is not easy, but after a long and detailed evaluation it was decided to go with OpenStack for all testbeds. We have also teamed up with the Zurich University of Applied Sciences’ cloud team at ICCLab.

Posted in Infrastructure.

SWITCH Storage WG in Bern

Today I attended the SWITCH Storage Working Group meeting. It was very well attended: many of our SyBIT partners were present (FMI, Biozentrum, UZH, ETHZ, EPFL), as were infrastructure people from other universities and universities of applied sciences. In the morning some sites showed what they have in terms of storage and how they do things. Except for our partners, most focussed on storage infrastructure for the ‘commodity’ services like administration (SAP db, finance files), email, users’ home directories and backup. We certainly have the largest amount of scientific research data, and it easily surpasses all of the other kinds of data. But in terms of the technology being used there was a lot of overlap, and several people suggested cooperating when acquiring hardware and software and exchanging information.

There were presentations on cloud storage, on ideas for distributed archiving and, by SWITCH, on network pricing. Cloud storage in the sense of private cloud storage was seen as useful, but using commercial providers for long-term storage was considered too expensive. The technology is, however, also interesting for sharing data and resources among universities, especially since with SWITCH we would not have any network costs. Small universities could profit from bigger ones, and the big ones might also exchange resources among each other.

The distributed archiving idea is about duplicating data (read-only archive data) across several sites for safety. Each site can have its own long-term archive solution, but for certain data that needs more security and several off-site copies, this is a proposed mechanism to achieve it (storage broker presentation). For the copy mechanism they suggested BitTorrent.
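
As a rough illustration of why a piece-based transfer fits read-only archive data: if the archive is split into fixed-size pieces and a hash is recorded for each, any site can check its copy piece by piece and re-fetch only what is damaged. The sketch below is my own, not what was presented at the meeting; the piece size, the function names and the use of SHA-256 are all assumptions, and it only mimics the general per-piece hashing idea rather than the actual BitTorrent protocol.

```python
import hashlib

# Hypothetical piece size; a real setup would tune this to the archive size.
PIECE_SIZE = 256 * 1024


def piece_hashes(data: bytes, piece_size: int = PIECE_SIZE) -> list[str]:
    """Split data into fixed-size pieces and return one SHA-256 digest per piece."""
    return [
        hashlib.sha256(data[i:i + piece_size]).hexdigest()
        for i in range(0, len(data), piece_size)
    ]


def damaged_pieces(manifest: list[str], replica: bytes,
                   piece_size: int = PIECE_SIZE) -> list[int]:
    """Return the indices of pieces in the replica that do not match the manifest.

    Missing pieces (truncated replica) and surplus pieces are reported as well.
    """
    replica_hashes = piece_hashes(replica, piece_size)
    bad = [
        i for i, expected in enumerate(manifest)
        if i >= len(replica_hashes) or replica_hashes[i] != expected
    ]
    # Pieces beyond the end of the manifest should not exist in a faithful copy.
    bad.extend(range(len(manifest), len(replica_hashes)))
    return bad
```

The originating site would publish the manifest once; each replica site can then run `damaged_pieces` against its local copy and request only the listed pieces from any other site that holds a good copy.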


ETH Zurich IT – D-BIOL Coordination

We had a very constructive meeting between the following actors:

  • ETH Informatikdienste (ID): People from the Systemdienste, i.e. those running the large cluster and storage installations for scientists – Jürgen Winkelmann, Hans Hiltbrunner, Olivier Byrde, Tilo Steiger
  • The Institute of Molecular Systems Biology (IMSB) big users: Lars for Ruedi Aebersold’s lab, Nicola for Uwe Sauer’s lab, Berend and Lukas Pelkmans
  • The Light Microscopy Center (LMC) – Karol Kozak
  • The Institute for Molecular Chemistry (IMC) who actually have a coordinator for IT – Nico Graf
  • ETH Vice President’s office, Mrs Arangeh who heads the ETH Storage strategy group

I set up this meeting because there seemed to be several misunderstandings and established ‘bad practices’ on both the ID and the users’ side concerning computing and storage. We had presentations by Lars, Nicola and Nico on the individual needs and experience of their respective institutes, as well as suggestions for improvement on the ID side.

We talked for three hours and I believe we clarified a lot of details. Among the issues addressed were:

  • NAS size and firewall: Currently the NAS sizes are too small and the connection to the large cluster goes over a firewall, severely limiting performance. Brutus now has its own large storage array where people can buy shares (currently 250 TB, with plans to go to 1 PB soon). We still need to understand how the data will get from the instruments to this array once it is in use, but the data rate there is not high, so there should be no bottleneck. The firewall issue is nevertheless being investigated, and faster networks are being looked at for this purpose. D-BIOL can act as a pioneer user in this context. Mirroring schemes for the NAS are also interesting.
  • Virtualization: The pricing scheme is still too expensive for most. Lars said his VMs would not be running all the time; right now he would buy a big Dell box and run over 60 VMs on it. The ID said they would look into pricing models based on usage, and they have already lowered the prices for the current VMs massively.
  • Archiving solutions: Upgrades are planned here too for the HSM system, so that automatic backup can happen.
  • Still missing is the large transactional storage space for very large databases (TB range).
  • The scientists will define their data flow in more detail, including which data is to be kept on expensive storage, which is to be deleted and which is to be archived.
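
To make the virtualization point above a bit more concrete: whether usage-based pricing beats a flat price depends on how much of the time a VM actually runs. The small sketch below computes the break-even utilization. It is only an illustration; the function names, the 730 hours per month and any prices you plug in are my assumptions, not figures from the ID.

```python
def break_even_utilization(flat_cost_per_month: float,
                           usage_rate_per_hour: float,
                           hours_per_month: float = 730.0) -> float:
    """Fraction of the month a VM must run before the flat price becomes cheaper."""
    return flat_cost_per_month / (usage_rate_per_hour * hours_per_month)


def cheaper_option(utilization: float,
                   flat_cost_per_month: float,
                   usage_rate_per_hour: float,
                   hours_per_month: float = 730.0) -> str:
    """Compare the two pricing models for a VM running `utilization` of the time (0..1)."""
    usage_cost = utilization * hours_per_month * usage_rate_per_hour
    return "usage-based" if usage_cost < flat_cost_per_month else "flat"
```

With placeholder prices, a VM that runs only a quarter of the time comes out cheaper under usage-based pricing as long as the break-even utilization is above 0.25, which is the situation Lars described.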

The next step is for the ID to come back with suggestions for solutions. D-BIOL is also working on a general strategy; information on it will be distributed among the parties present, who will be invited to participate. This has already happened at the time of writing, and the documents describing the IT strategy for D-BIOL have been provided to us by Nico Graf (to be kept confidential).

The presentations are available on the ID Sharepoint and also on the SyBIT Wiki: Slides by Lars, Nicola and Nico Graf.


Discussion with IT D-BSSE

I had a discussion and lunch with Janos Palinkas, head of the IT services at the D-BSSE. We informed each other about our current plans and needs. He seemed quite in favor of an infrastructure community. He stressed, however, that he is not an ‘enabled’ person, i.e. he cannot make any decision without getting explicit approval from the Department Head. This includes any requests for hosting or collaboration. The current professor, who has been Department Head for only a few months (this is a rotating appointment), is not yet up to speed with the needs and issues around the IT services. In addition, since the D-BSSE is an ETH Zürich entity, there are specific constraints on how the machine room has to be built and how the network has to be set up.

What SyBIT can help with here is to voice the interest in and need for cooperation with the BSSE IT team explicitly, with concrete proposals and plans to be executed (costs included of course). These can then travel the ‘official path’, top down in this case. Janos’s official boss is Jörg Stelling, if I understood correctly.


CellPlasticity – Deep Sequencing Platform Meeting

Two days ago we had a meeting in Basel about the needs of the CellPlasticity (formerly CPHD) project concerning the interaction with the Deep Sequencing Platform run by D-BSSE. All players were at the meeting, both from the scientific side and those responsible for the BSSE platform.

Currently the platform is operated under the responsibility of Christian Beisel, who does this not as his main job but ‘on the side’, next to his own research. There is one full-time wet-lab person assigned to operate the machine and 50% IT support through Manuel Kohler at CISD. This used to be 1 FTE until recently. One of the concerns expressed was that this level of support will probably not be sufficient, or scale, when the demand for the platform goes up in the future. This is where SyBIT will help on the immediate timescale.

Christian will come up with a set of requests and a timeline within the next 2 weeks. He will send us a detailed list of things to do, and will prepare it in close interaction with the FMI and the Biozentrum.
