Join our team as a Site Reliability Engineer (SRE)! We’re searching for someone who has fresh ideas and a unique viewpoint, and who enjoys collaborating with a cross-functional team to develop real-world solutions and positive user experiences for every interaction.
Design, build, and maintain automated systems for monitoring, alerting, and incident response.
Manage the reliability of our infrastructure and services, optimizing capacity and performance.
Participate in capacity planning and the scalability of our services to handle traffic growth.
Collaborate with development teams to implement SRE practices, such as Continuous Integration and Continuous Deployment (CI/CD) and the definition of Service Level Objectives (SLOs) and Service Level Agreements (SLAs).
Monitor and improve system performance, reliability, and scalability.
Support incident response and conduct root cause analysis (RCA) of incidents to prevent their recurrence.
~ 5 years of experience with advanced Terraform techniques.
~ 5 years of experience with containers, Kubernetes & Helm charts.
~ 5+ years of extensive knowledge of MongoDB, Kafka & Postgres (or willingness to learn).
~ Python / Groovy does not count)
~ Golang knowledge is a bonus.
~ Strong English communication skills.
~ Contractor or “asimilados” schemes.
~ Full-time job.
~ 100% Remote work (because balance is key).
Don't hesitate to share your updated resume in English with us so we can review it and have the pleasure of discussing it with you in more detail.
Senior Reliability • México, México, MX