Distinguished Engineer (L7) – DevOps
About HeyDonto
HeyDonto builds reliable data pipelines that connect fragmented healthcare platforms to modern APIs.
We synchronize and standardize data from both on-premise and cloud-based EHR systems into clean, interoperable formats.
Our mission is simple: make healthcare data work the way software should, predictably, securely, and without silos.
The Role
As a Distinguished Engineer (L7) in the DevOps Tribe, you’ll define and evolve the infrastructure that powers HeyDonto’s ecosystem—from Kubernetes clusters and Terraform modules to developer tooling and multi-environment automation. You’ll lead through technical depth, setting standards for reproducibility, reliability, and cloud portability across every environment.
What You’ll Do
- Architect and evolve multi-environment infrastructure across GKE, CloudSQL, Confluent, Temporal, and Cloudflare, encoded in reusable Terraform modules and remote state.
- Lead deployment automation strategy (CLI orchestration and Helm releases) to keep clusters converged deterministically across environments.
- Design and enforce the secrets lifecycle integrating Terraform outputs, SOPS, and 1Password for secure, auditable rotation and distribution.
- Define and implement automated drift detection, IAM regression suites, and compliance guardrails for infrastructure reliability.
- Own the CUE-based configuration system that exports Compose stacks, environment templates, secrets, and Helm values through the just export-cue command.
- Shape environment parity and portability: abstract provider specifics behind clear interfaces (DNS, storage, ingress, identity) to reduce lock-in and enable repeatable deployments across clouds.
- Standardize vendor-neutral telemetry with OpenTelemetry and consistent log / metric conventions to keep observability portable.
- Establish portable identity patterns (OIDC, workload identity, least-privilege IAM mappings) that translate across providers.
- Mentor senior engineers, codify expectations in documentation and tooling, and steward technical decisions across tribes.
- Lead incident response and RCA, strengthening feedback loops between SRE and development teams.
Tech You’ll Work With
- Languages: TypeScript, Python, Bash
- Infrastructure: Terraform (multi-provider), Helm, Kubernetes (GKE primary; portable to other managed K8s), Temporal Cloud, Confluent Cloud, Cloudflare
- Cloud-Agnostic Interfaces: OpenTelemetry, OIDC / OAuth2, CSI / Ingress abstractions, external-DNS patterns, OCI registries
- Configuration: CUE, Just, Docker Compose, SOPS, 1Password, env templates
- Tooling: Node CLI, uv, Yarn, gcloud, kubectl
- Observability: Grafana, Prometheus, vendor-neutral OTel pipelines
- CI / CD: GitHub Actions, Conventional Commits, automated drift and policy checks
What We Value
- Clarity over cleverness: explicit, predictable systems.
- Idempotency, type safety, and observability in everything we build.
- Portability by design: clean interfaces, minimal provider coupling, documented escape hatches.
- Shared ownership of infrastructure and developer experience.
- Documentation and tooling as part of engineering craft.
- Reliability as the ultimate measure of quality.
Qualifications
Required
- 7+ years building and operating distributed systems or production infrastructure.
- Proven expertise with Terraform module design (multi-provider), Kubernetes / Helm operations, and environment automation.
- Experience designing portable architectures: clear separation of concerns, provider-agnostic interfaces, and migration-ready patterns.
- Advanced knowledge of secure secret distribution with SOPS and 1Password.
- Proficiency in Python, Node.js, and Bash for automation and operational tooling.
- Strong understanding of Kafka, Temporal, and distributed workflow systems.
- Track record of leading through influence: setting technical standards, mentoring seniors, and driving architectural coherence.
Preferred
- Experience designing and implementing solutions across multiple cloud providers (e.g., AWS, GCP, Azure) to ensure resilience and avoid vendor lock-in.
- Hands-on experience with OpenTelemetry rollouts to build a unified observability platform, helping proactively identify and resolve performance bottlenecks.
- Solid understanding of Kubernetes networking, especially configuring Ingress controllers and managing traffic flow.
- Familiarity with CUE or similar declarative configuration frameworks.
- Open-source contributions or published writing that demonstrates passion for systems thinking and quality craftsmanship.
Why HeyDonto
HeyDonto is a place where senior engineers work at depth. We build systems that last—secure, observable, portable, and self-documenting. We believe in small expert teams solving hard problems the right way, with full ownership from concept to delivery. If you value clarity, autonomy, and precision—and you want your work to make a measurable difference in real systems—this is the place for you.
Hiring Details:
- Work Type: Hybrid
- Location: Guadalajara, México
If you are interested in applying, please send your resume in English through LinkedIn or to maria@heydonto.com, mentioning the name of the role you are applying for in the subject of the email.
When applying, please include:
- Salary expectations
- Availability for interviews
- Earliest start date