We are seeking an experienced leader to guide our Site Reliability Engineering (SRE) practice and drive reliability at scale. As a key member of our team, you will be responsible for designing, building, and operating resilient infrastructure that meets the needs of our business.
Job Description
As a Reliable Infrastructure Leader, you will:
- Provide technical leadership for the SRE team, setting standards for reliability, scalability, and security.
- Lead the design and implementation of cloud infrastructure and drive adoption of automation, monitoring, and observability solutions.
- Act as a primary point of escalation for complex technical issues, guiding resolution efforts and ensuring long-term stability.
- Partner closely with cross-functional teams to troubleshoot, resolve, and proactively prevent production incidents.
Requirements
To be successful in this role, you will need:
- 5+ years of experience in Site Reliability Engineering or DevOps, including leadership or mentorship responsibilities.
- Proven track record of designing and operating large-scale, cloud-based infrastructure.
- Expertise in automation, CI/CD pipelines, monitoring, and observability tools.
- Hands-on experience with containerization, orchestration, and infrastructure-as-code tools.
- Excellent problem-solving skills with the ability to make sound decisions under pressure.
Benefits
As a member of our team, you can expect:
- Opportunities for professional growth and development.
- A dynamic and collaborative work environment.
- The chance to work on high-visibility projects that impact the business.