As part of a service team, reporting to its Operations Team Leader, you will be responsible for the stability and quality of the service provided to our customers, in an environment with demanding security and availability requirements. You will also be responsible for ensuring the reliability, performance and evolution of our applications and cloud services.
MISSIONS:
As a Site Reliability Engineer (SRE), you will have the following responsibilities:
Infrastructure design and development
- Design, deploy and upgrade our cloud infrastructure (mainly AWS) in line with the principles of high availability and scalability
- Implement and maintain our Infrastructure as Code solutions via Terraform and CloudFormation
- Optimise Docker container architecture and Kubernetes orchestration to maximise performance and resilience
- Actively participate in the choice of technical architecture with a cloud-native approach
System reliability and performance
- Define, implement and monitor SLIs, SLOs and SLAs for all our critical services
- Implement a comprehensive observability strategy (monitoring, logging, alerting, tracing)
- Analyse application performance and optimise resources to improve the user experience
- Carry out post-incident analyses (blameless postmortems) and implement the improvements identified
- Introduce Chaos Engineering practices to proactively test the resilience of systems
Automation and operational excellence
- Develop and maintain robust and secure CI/CD pipelines (GitLab CI, Jenkins)
- Systematically automate recurring tasks to reduce operational toil
- Produce clear, relevant and accessible technical documentation to facilitate onboarding and interventions
Security and governance
- Work with the security team to implement best practices across the infrastructure
- Ensure that deployments comply with security standards and regulatory requirements
- Maintain a secure access and identity management system
- Participate in technical audits and contribute to the continuous improvement of security processes
Technical skills:
- Solid knowledge of IT architectures, UNIX/Linux systems and IP networks.
- Advanced skills in virtualised infrastructure (VMware, Hyper-V)
- Knowledge of at least one programming or scripting language (Python, Bash, PowerShell, PHP, Go, etc.)
- Good knowledge of Git, CI/CD (GitLab) and Docker tools
- Knowledge of at least one IaC technology such as Puppet, Chef, Salt, Ansible or Terraform
- Expertise in monitoring and managing the performance of IT services.
Language skills:
- Fluency in written and spoken English and French.
Expected experience:
- Significant experience (minimum 10 years) in production engineering and CloudOps.
- Good knowledge of public cloud environments (Azure, AWS, GCP) and experience with supervision and monitoring technologies (Nagios, ELK, Prometheus, Grafana, etc.)
Areas of expertise:
- Expertise in DevOps practices for cloud-native applications (ideally with Kubernetes).
- Proficiency in a scripting or programming language and in at least one infrastructure-as-code technology.
Expected skills:
- Ability to work in a dynamic environment, adaptability and strong organisational skills.
Nice to have:
- Relevant certifications in cloud, DevOps or security.
- Experience in implementing Site Reliability Engineering (SRE) practices.
- Participation in open source projects or contributions to the technology community.
Choose us for:
- Our leading position in a fast-growing sector
- Support and coaching from experienced colleagues
- Our innovative organisation, which will enable you to progress within the Group
- Our technical and functional training systems
- Our collaborative events
- Our employee referral and internal mobility scheme
- Our quality of working life programme (teleworking, employee services, CSR commitments, etc.)
- And, as always: profit-sharing, mutual health insurance, meal voucher (TR) card, social and economic committee, RTT days, etc.
If you're looking for a rewarding professional life, come and build your career at Cloud Temple!
Your passion, commitment and success will be valued.
Cloud Temple is committed to promoting diversity. This position is open, on the basis of equal skills, to workers with disabilities.
- Location: Lyon
- Salary: According to the current scale
- Experience: Minimum 10 years
- Contract type: Open-ended contract