Site Reliability Engineer (M/F)

Job description

As part of a service team and reporting to its Operations Team Leader, you will be responsible for the stability and quality of the service provided to our customers, working in a demanding environment in terms of security and availability. You will also be responsible for ensuring the reliability, performance and evolution of our applications and cloud services.

MISSIONS:

As a Site Reliability Engineer (SRE), you will have the following responsibilities:

Infrastructure design and development

Design, deploy and upgrade our cloud infrastructure (mainly AWS) in line with the principles of high availability and scalability
Implement and maintain our Infrastructure as Code solutions using Terraform and CloudFormation
Optimise Docker container architecture and Kubernetes orchestration to maximise performance and resilience
Actively participate in technical architecture choices, with a cloud-native approach

System reliability and performance

Define, implement and monitor SLIs, SLOs and SLAs for all our critical services
Implement a comprehensive observability strategy (monitoring, logging, alerting, tracing)
Analyse application performance and optimise resources to improve the user experience
Carry out post-incident analyses (blameless postmortems) and implement the improvements identified
Introduce Chaos Engineering practices to proactively test the resilience of systems

Automation and operational excellence

Develop and maintain robust and secure CI/CD pipelines (GitLab CI, Jenkins)
Systematically automate recurring tasks to reduce operational toil
Produce clear, relevant and accessible technical documentation to facilitate onboarding and interventions

Security and governance

Work with the security team to implement best practices across the infrastructure
Ensure that deployments comply with security standards and regulatory requirements
Maintain a secure access and identity management system
Participate in technical audits and contribute to the continuous improvement of security processes

Profile required

Technical skills:

Solid knowledge of IT architectures, UNIX/Linux systems and IP networks.
Advanced skills in virtualised infrastructure (VMware, Hyper-V)
Knowledge of at least one programming or scripting language (Python, Bash, PowerShell, PHP, Go, etc.)
Good knowledge of Git, CI/CD (GitLab) and Docker tools
Knowledge of at least one IaC technology such as Puppet, Chef, Salt, Ansible or Terraform
Expertise in monitoring and managing the performance of IT services.

Language skills:

Fluency in written and spoken English and French.

Expected experience:

Significant experience (minimum 10 years) in production engineering and CloudOps.
Good knowledge of public cloud environments (Azure, AWS, GCP) and experience with monitoring technologies (Nagios, ELK, Prometheus, Grafana, etc.)

Areas of expertise:

Expertise in DevOps practices for Cloud Native applications (ideally with Kubernetes).
Proficiency in a scripting or programming language and in at least one infrastructure-as-code technology.

Expected skills:

Ability to work in a dynamic environment, adaptability and strong organisational skills.

Nice to have:

Relevant certifications in cloud, DevOps or security.
Experience in implementing Site Reliability Engineering (SRE) practices.
Participation in open source projects or contributions to the technology community.

Choose us for:

Our leading position in a fast-growing sector
Support and coaching from experienced colleagues
Our innovative organisation, which will enable you to progress within the Group
Our technical and functional training systems
Our collaborative events
Our employee referral and internal mobility scheme
Our quality of working life programme (remote working, employee services, CSR commitments, etc.)
And always: profit-sharing, supplementary health insurance, meal voucher card (TR), works council (social and economic committee), RTT days, etc.
If you're looking for a rewarding professional life, come and build your career at Cloud Temple!
Your passion, commitment and success will be valued.

Cloud Temple is committed to promoting diversity. Given equal skills, this position is open to workers with disabilities.

Offer details

Location:
- Lyon
Salary: According to the current scale
Experience: Minimum 10 years
Contract type: Permanent (open-ended) contract