Talent.com
Principal Site Reliability Engineer – Cloud, AI/ML, GenAI (Remote)

Oracle, Jalisco, Mexico
3 days ago
Job description

Principal SRE – Cloud Automation & AI Platforms (Zapopan)

This role requires an SRE mindset combined with AI/ML expertise and strong application engineering skills across public and private cloud environments.

Responsibilities

  • End‑to‑end service ownership: design for telemetry, security, resiliency, scalability, and performance; lead sizing/architecture; drive service health reviews and process simplification.
  • Incident management and prevention: lead postmortems/RCAs, coordinate fixes, define repair items, and implement data‑driven prevention and continuous improvement.
  • AI/ML and GenAI delivery: design and integrate solutions with LLMs, RAG, agentic workflows, and conversational AI; build low‑latency model serving and retraining pipelines.
  • Application engineering: develop performant microservices for distributed, containerized, cloud‑native systems.
  • Automation: eliminate toil by automating operational workflows, recovery procedures, code delivery, and configuration management; build internal tools and reusable scripts/services to accelerate delivery and reduce errors.
  • Observability: define and implement monitoring, logging, alerting, and tracing strategies; establish SLOs/SLIs/error budgets; improve diagnostics and performance visibility for rapid triage.
  • Cross‑functional collaboration: partner with product, operations, and data teams to translate requirements into secure, scalable solutions; communicate effectively with technical and non‑technical stakeholders.

Minimum Qualifications

  • BS/MS in Computer Science or related field; 10+ years of software engineering in cloud environments.
  • Strong background in distributed systems/microservices using Java/Python; SQL/data modeling; Python for AI/automation.
  • SRE/DevOps expertise: systems and networking fundamentals, application security, observability, performance analysis, and incident response.
  • Proven SDLC excellence: code quality, reviews, version control, CI/CD, testing, and release engineering.
  • Excellent written and verbal communication; English fluency.

Preferred / Technical Skills

  • AI/ML/GenAI: experience with foundational models, RAG, agentic architectures; model deployment, optimization, monitoring, and retraining.
  • Cloud and containers: experience with containerization, orchestration, and resilient, fault‑tolerant microservices.
  • Observability: hands‑on experience designing dashboards, alerts, traces, logs, and metrics; defining SLOs/SLIs and error budgets; on‑call readiness and runbook quality.
  • Operations: performance tuning across Java/Python and SQL for large‑scale enterprise applications; strong Linux/Unix expertise; capacity planning and reliability reviews.
  • Automation and scripting: proficiency in scripting to automate operational workflows, build tooling, and CI/CD tasks (e.g., shell scripting, Python, configuration‑as‑code, task runners).
  • Familiarity with enterprise ERP applications and standard DevOps tooling and practices.

About Us

As a world leader in cloud solutions, Oracle uses tomorrow’s technology to tackle today’s challenges. We’ve partnered with industry‑leaders in almost every sector—and continue to thrive after 40+ years of change by operating with integrity.

We know that true innovation starts when everyone is empowered to contribute. That’s why we’re committed to growing an inclusive workforce that promotes opportunities for all.

Oracle careers open the door to global opportunities where work‑life balance flourishes. We offer competitive benefits based on parity and consistency and support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.

We’re committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by emailing accommodation-request_mb@oracle.com or by calling +1 888 404 2494 in the United States.

Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans’ status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.