Job Title : Senior Site Reliability Engineer (SRE)
Experience : 5+ years
Location : Mexico / LATAM
Engagement Type : Full-Time / Contractual, Fully Remote
Job Description :
We are seeking a skilled Senior Site Reliability Engineer (SRE) to join our offshore team. In this role, you
will be responsible for ensuring the reliability, performance, and scalability of our critical systems. You'll
develop automation, build monitoring solutions, lead incident response, and work closely with
engineering teams to implement infrastructure as code, CI/CD, and cloud-native tools.
Job Responsibilities :
- Maintain the reliability, availability, and performance of critical systems
- Develop and maintain automation scripts and tools to streamline operations
- Develop and maintain monitoring dashboards and alerts
- Lead incident response, conduct post-mortem analysis, and implement preventative measures
- Optimize system performance and scalability
- Implement and maintain security best practices
- Create and maintain comprehensive system and process documentation
- Participate in on-call rotations for 24/7 critical system support
Must Haves :
- Kubernetes (hands-on experience) – managing and deploying workloads
- AWS Cloud Platform – deep understanding and production experience
- Infrastructure as Code (IaC) – using tools like Terraform (or CloudFormation / Ansible)
- Scripting / Programming – Proficiency in Python or Go
- Monitoring & Alerting – Experience with Prometheus, Grafana
- CI/CD Pipelines – Jenkins, GitLab CI, or similar
- Incident Management – Proven experience in responding to and analyzing outages
- Linux Systems & Networking – Strong fundamentals
Good to Haves :
- ArgoCD, Linkerd, Karpenter, or other Kubernetes-related tools
- Logging tools – Loki, ELK Stack
- Security best practices – Cloud and container security knowledge
- Leadership / Mentorship – Experience guiding junior engineers
- Post-mortem writing & RCA – Comfortable documenting incidents and learnings
- Experience in distributed systems or high-availability architectures
Recruitment Process :
- AI-based online screening test
- Assignment
- 2 client interviews
- CEO Discussion
- Offer : Successful candidates will receive an offer to join the team.
Soft Skills :
- Excellent verbal and written communication skills in English (must)
- Strong problem-solving ability with a customer-first mindset
- Accountability – Takes ownership of reliability and incident outcomes
- Demonstrated ability to operate independently in high-pressure, multitasking environments
- Passion for supporting and helping others