SREcon Europe/Middle East/Africa 2018
SREcon is a worldwide conference that brings together system, production, and software engineers to discuss SRE (Site Reliability Engineering) culture, systems engineering, and complex distributed systems at scale. The conference takes place in the US and in Europe, at different times of the year. Since last year, the European edition has encompassed Europe, the Middle East, and Africa (EMEA). This year, SREcon EMEA was hosted in Düsseldorf, Germany.
SREcon EMEA 2018
As the adoption of cloud, container, and microservice architectures grows, scalable solutions to operate them become a must-have for industries of all sizes. In addition, any industry player that wants to stay ahead of its competitors has to meet user expectations: services must be reliable (in both availability and performance) while constantly adapting to their users' feature requests. SREcon provides a venue for participants from diverse industry sectors to share their successes and failures in dealing with these topics.
The main topics at SREcon this year were: (i) the principles of SRE and how to adopt them; (ii) the new EU data protection law (GDPR); and (iii) technologies and solutions to reduce toil. There was also a lot of discussion about DevOps vs. SRE or, put differently, culture vs. implementation.
The talks and discussions on “principles of SRE and how to adopt them” were very interesting and showed how differently companies implement their SRE solutions. These differences stem from business needs, the cost of an SRE team, and the difficulty of hiring SREs. One of the topics discussed was post-mortems. Niall Murphy from Microsoft presented the idea of creating a consortium in which companies could share their post-mortems. Chastity Blackwell talked about the importance of documentation, both when dealing with incidents and as institutional knowledge.
The new EU data protection law was the topic of talks, discussions, and a workshop. Simon McGarr, a leading expert in data protection, pointed to specific parts of the law to highlight the difficulties of implementing it and the impact that these implementations will have on SREs' daily work.
One of the hot topics in reducing toil in the era of microservices is Kubernetes, an open source container orchestration solution that aims to simplify container operations. This year, Shopify presented their journey to Kubernetes in two talks. They started with Kubernetes on-premises (SREcon17 talk), but are moving to a managed solution running on Google Cloud. The main reasons for this move were: (i) the lack of expertise in the different areas required; (ii) the wish to let engineers focus on Shopify's business; and (iii) the operating costs of Kubernetes.
The DevOps vs. SRE discussions revisited the principles of DevOps and pointed out the ones that SRE implements, for instance documentation, blameless post-mortems, service level objectives and indicators (SLOs and SLIs), and constant change. The speakers highlighted the importance of the whole organization embracing these principles, and that employees, from management to engineers, should work together to implement them. Some of this discussion can be found in the books Site Reliability Engineering and The Site Reliability Workbook.
Besides the talks and workshops, there were exhibitions from Google, Microsoft, Booking.com, Shopify, and Facebook. Hands-on workshops, where participants could design a system or learn about one in detail, were also part of the conference. This time, Switzerland was represented by attendees from ETH Zurich, CERN, Uni Lausanne, and Swisscom.
The program, with some media and slides, is available at https://www.usenix.org/conference/srecon18europe/program.
Text and Contact
Dr. Christiane Pousa, High Performance Computing, Scientific IT Services, IT Services