Talent.com
Site Reliability Engineer (Azure)

EPAM Systems, Mexico
3 days ago
Job description

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

Overview

Join our team as a Site Reliability Engineer, where you will focus on cloud infrastructure, containerization, and monitoring using Kubernetes and Microsoft Azure.

You will work closely with clients to ensure robust observability and efficient deployment pipelines. Apply now to contribute to maintaining and improving distributed systems at scale.

Responsibilities

  • Create and manage containerized applications using Docker or Podman
  • Deploy and maintain Kubernetes resource manifests in clusters such as Kind, GKE, or AKS
  • Implement and monitor Prometheus agents to observe infrastructure and application metrics
  • Troubleshoot and analyze logs to identify and resolve system events and issues
  • Develop and maintain Azure DevOps CI/CD pipelines and GitOps deployment workflows
  • Collaborate with teams to improve system reliability and deployment automation
  • Manage infrastructure as code using Terraform and other tools
  • Configure and maintain observability tools and alerting systems
  • Ensure compliance with client constraints and security standards
  • Participate in incident response and root cause analysis
  • Document system configurations, processes, and procedures
  • Support continuous improvement of deployment and monitoring practices

Requirements

  • At least 2 years of hands-on programming experience
  • Proficiency in at least one scripting language
  • Experience with Kubernetes container orchestration
  • Knowledge of at least one cloud provider, such as Microsoft Azure or Google Cloud Platform
  • Familiarity with Prometheus or similar monitoring tools for observability
  • Experience with Azure DevOps CI/CD pipelines or GitOps tools like Helm and ArgoCD
  • Understanding of distributed systems troubleshooting and log analysis
  • Practical skills in containerization using Docker or Podman
  • Experience creating and managing Kubernetes resource manifests
  • Ability to deploy and monitor Prometheus agents
  • Knowledge of infrastructure as code tools such as Terraform
  • Strong problem-solving and analytical skills
  • Effective communication and teamwork abilities
  • English proficiency at B2 level or higher

We offer

  • International projects with top brands
  • Work with global teams of highly skilled, diverse peers
  • Employee financial programs
  • Paid time off and sick leave
  • Upskilling, reskilling and certification courses
  • Unlimited access to the LinkedIn Learning library and 22,000+ courses
  • Global career opportunities
  • Volunteer and community involvement opportunities
  • EPAM Employee Groups
  • Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn

Seniority level

  • Associate

Employment type

  • Full-time

Job function

  • Engineering, Information Technology, and Business Development

Industries

  • Software Development, IT Services and IT Consulting, and Nanotechnology Research
