Cloud Site Reliability Engineering Expert


Job Location

Remote in Romania

Available Positions

1 position

Job Type


Job Ref. Number


We are looking for a Cloud Site Reliability Engineering Expert to join our team in Bucharest.

Role Description

Main Tasks / Responsibilities

  • Building, developing and operating infrastructure-as-code components (e.g. for AWS services, Kubernetes, container solutions), including on-call responsibilities
  • Development of automated solutions for operational aspects such as on-call monitoring, performance and capacity planning, and disaster response
  • Creating fault-tolerant and self-healing infrastructure components that improve the reliability of systems, fixing issues and responding to incidents
  • Ensuring and implementing security and connectivity requirements
  • Definition of standards for the cloud and on-premise platforms, components and systems
  • Advice and support for departments in the introduction and use of cloud services
  • Definition and conception of standards in the areas of cloud, containers and container orchestration systems, including an associated shared responsibility model
  • Consulting for strategic inquiries from departments (e.g. for security, VCI)
  • Controlling cloud costs and developing measures for continuous cost reduction

Interfaces / Communication (customer, suppliers and other)

  • Group Shared Service Teams (e.g. India, Egypt)
  • Project management & product owner of the delivery departments
  • Local and Global Security Features
  • Customers: delivery, test and operations departments (agile)
  • Release & Change Manager
  • Internal and external data center partners (VCI, AWS, etc.)
  • TI partner, license and vendor management
  • Representing the company in national and international committees and conferences

Personal requirements

Training / work experience

  • Completed technical or industrial engineering degree, or long-standing experience in the field of public cloud infrastructure, IT engineering and IT operations
  • Practical experience with public cloud solutions, including setting up and managing infrastructure as code in AWS (more than 3 years)
  • Hands-on experience with Kubernetes, Docker, Helm and cluster management (more than 3 years)
  • Confident use of automation and configuration management tools for infrastructure deployments (Terraform, Ansible, CloudFormation, Concourse, Jenkins, Groovy, Nexus, Bitbucket, etc.) (more than 3 years)
  • Several years of professional experience in IT, preferably in IT engineering, IT operations, IT consulting or IT architecture (more than 5 years)
  • Good technical knowledge of the technologies and products used in large, heterogeneous Public Cloud infrastructures and the services they provide
  • Strong analytical, solution-oriented and entrepreneurial thinking and action, combined with strong technical know-how in the field of application

Knowledge / skills

  • Fluent in spoken and written English
  • Experience in intercultural communication
  • Experience in negotiations and in dealing with customers and suppliers
  • Knowledge of IT architectures and IT service management processes (ITIL) and business processes (PPM, internal ordering and communication processes, etc.)
  • High resilience and experience in crisis management for topics in this field
  • Ability to deal with complex technical issues, solve problems and prepare information for its intended audience
  • Professional curiosity about new methods, tools and trends, with a "self-starter mentality" and high intrinsic motivation to work on new topics and projects
  • Independent, customer- and service-oriented way of working, diplomatic skills, high stress resistance and above-average commitment, paired with strong communication and rhetorical skills as well as determination and a sense of responsibility