We help our clients strengthen their ability to attract talent, especially for technology roles.
We constantly innovate and actively seek the best solutions for clients and professionals. We understand our clients' needs and aim to be the specialists in our industry.
We offer consulting services to technology companies in various areas, including IT, software development, cybersecurity, and project management. Our employees are the reason for the company's existence, and their satisfaction translates into that of our customers.
Job Title : Senior Site Reliability Engineer (SRE)
Experience : 5+ years
Location : Mexico / LATAM
Engagement Type : Full-Time / Contractual, Fully Remote
Job Description :
We are seeking a skilled Senior Site Reliability Engineer (SRE) to join our offshore team. In this role, you
will be responsible for ensuring the reliability, performance, and scalability of our critical systems. You'll
develop automation, build monitoring solutions, lead incident response, and work closely with
engineering teams to implement infrastructure as code, CI / CD, and cloud-native tools.
Job Responsibilities :
- Maintain the reliability, availability, and performance of critical systems
- Develop and maintain automation scripts and tools to streamline operations
- Develop and maintain monitoring dashboards and alerts
- Lead incident response, conduct post-mortem analysis, and implement preventative measures
- Optimize system performance and scalability
- Implement and maintain security best practices
- Create and maintain comprehensive system and process documentation
- Participate in on-call rotations for 24 / 7 critical system support
Must Haves :
- Kubernetes (hands-on experience) – managing and deploying workloads
- AWS Cloud Platform – deep understanding and production experience
- Infrastructure as Code (IaC) – using tools like Terraform (or CloudFormation / Ansible)
- Scripting / Programming – Proficiency in Python or Go
- Monitoring & Alerting – Experience with Prometheus, Grafana
- CI / CD Pipelines – Jenkins, GitLab CI, or similar
- Incident Management – Proven experience in responding to and analyzing outages
- Linux Systems & Networking – Strong fundamentals
Good to Haves :
- ArgoCD, Linkerd, Karpenter, or other Kubernetes-related tools
- Logging tools – Loki, ELK Stack
- Security best practices – Cloud and container security knowledge
- Leadership / Mentorship – Experience guiding junior engineers
- Post-mortem writing & RCA – Comfortable documenting incidents and learnings
- Experience in distributed systems or high-availability architectures
Recruitment Process :
- AI-based online screening test
- Assignment
- 2 client interviews
- CEO Discussion
- Offer : Successful candidates will receive an offer to join the team.
Soft Skills :
- Excellent verbal and written communication skills in English – Must
- Strong problem-solving ability with a customer-first mindset
- Accountability – Takes ownership of reliability and incident outcomes
- Demonstrated ability to operate in high-pressure, multitasking environments independently
- Passion for supporting and helping others