Overview
EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
We are seeking a highly skilled Senior Data Platform Operations Engineer to ensure the stability, security, performance, and cost efficiency of our global enterprise data platform.
This role is critical for providing 8 / 5 operational coverage as part of the follow-the-sun 24x5 support model, ensuring the platform continuously supports business activities worldwide. The ideal candidate will possess expertise in cloud-based data platforms, a strong operational mindset, and a proactive approach to performance optimization, observability, and cost management.
Responsibilities
- Maintain a stable, secure, and performant enterprise data platform (Snowflake, AWS data stack, dbt, orchestration tools, BI / analytics, etc.)
- Provide operational coverage within an 8 / 5 support model and participate in a 24 / 7 on-call rotation for critical incidents
- Implement robust monitoring, alerting, and observability solutions to ensure proactive incident detection and resolution
- Perform platform upgrades, patching, and configuration management in alignment with security and compliance requirements
- Continuously tune system performance to meet evolving business needs
- Monitor the platform using holistic observability frameworks that cover infrastructure, data pipelines, and platform services
- Deliver actionable operational insights through monitoring dashboards and reporting
- Identify and implement process automation to improve efficiency and reduce manual interventions
- Suggest and execute continuous improvements to enhance platform resilience, scalability, and cost-effectiveness
- Contribute to infrastructure-as-code and configuration-as-code practices for consistent, repeatable operations
Requirements
- Hands-on experience of over 3 years managing cloud-native data platforms (e.g., Snowflake, Databricks, BigQuery, or similar)
- Proficiency in cloud infrastructure (AWS) with focus on operations, automation, and cost governance
- Experience with monitoring and observability tools (Datadog, Prometheus, Grafana, ELK, CloudWatch, etc.)
- Knowledge of Infrastructure as Code (Terraform, Pulumi, Ansible) and configuration management practices
- Strong understanding of networking, security, and compliance in cloud environments
- Strong problem-solving skills with a proactive, service-oriented mindset
- Ability to work in a global operations environment with on-call responsibilities
- Clear communication and collaboration with engineering, data, and business stakeholders
- Commitment to continuous improvement and operational excellence
- English language proficiency at an Upper-Intermediate level (B2) or higher
Nice to have
- Experience implementing FinOps frameworks and cost optimization practices
- Prior experience in regulated industries (pharma, healthcare, finance) with compliance-driven environments
- Familiarity with modern data stack tools (dbt, Dagster / Airflow, ThoughtSpot, Tableau, Power BI)
- Exposure to SRE (Site Reliability Engineering) principles and practices
We offer
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn
Seniority level
Mid-Senior level
Employment type
Full-time
Job function
Business Development, Information Technology, and Engineering
Industries
Software Development, IT Services and IT Consulting, and Pharmaceutical Manufacturing