Founding Site Reliability Engineer (Compliance, Security & Reliability)

Enzo Health

Software Engineering, Compliance / Regulatory
Lehi, UT, USA
Posted on Dec 16, 2025

Location: Lehi Office
Employment Type: Full time
Location Type: On-site
Department: Engineering

About the role:

We’re hiring our first SRE to build the operational foundation of our platform. You’ll own compliance readiness, security posture, and production reliability across our AWS + Kubernetes environment and our application stack (Next.js/Vercel, Sentry, Postgres). We deploy and manage services using Porter (porter.run) and infrastructure-as-code via Terraform.

This is a hands-on role for someone who can set direction, implement guardrails, and build scalable systems and processes without slowing product delivery.

Core responsibilities

Compliance (Core)

  • Lead audit readiness for frameworks such as SOC 2 (and HIPAA-aligned controls as needed): define controls, implement them, and run evidence collection.

  • Establish repeatable processes for access reviews, change management, incident management, vendor risk management, and secure SDLC practices.

  • Automate compliance workflows where possible (continuous controls monitoring, evidence generation, audit trails, policy templates).

Security (Core)

  • Own cloud security architecture in AWS and Kubernetes: least-privilege IAM/RBAC, network segmentation, encryption standards, secrets management, and secure defaults.

  • Harden Kubernetes workloads: cluster baseline security, namespace boundaries, pod security standards, image provenance/scanning, and secure service-to-service communication.

  • Implement and tune security monitoring and incident response: centralized logging, actionable alerts, runbooks, on-call workflows, and post-incident reviews.

  • Drive vulnerability management across infra and app dependencies: patching, dependency scanning, container image scanning, and configuration drift detection.

  • Partner with engineering on threat modeling for major features and high-risk changes.

Reliability (Core)

  • Define and own SLIs/SLOs, establish operational KPIs, and introduce error budgets where appropriate.

  • Improve observability across AWS + Kubernetes + apps using Sentry and monitoring best practices (metrics, logs, tracing, dashboards, alert routing).

  • Own production operations for Postgres: backups/restores, replication strategy, migration safety, performance tuning, and capacity planning.

  • Build resilience: disaster recovery planning, recovery testing, high-availability patterns, and graceful degradation.

Infrastructure, Kubernetes & Delivery Enablement

  • Own infrastructure-as-code using Terraform: module standards, environment structure, state management, reviews, and guardrails.

  • Own the platform layer around Kubernetes and Porter (porter.run): cluster lifecycle practices, environment management, deployment workflows, and reliability of the delivery pipeline.

  • Improve CI/CD and deployment safety: progressive delivery, rollbacks, environment parity, and release observability.

Our stack

  • AWS, Terraform

  • Kubernetes, Porter (porter.run)

  • Next.js, Vercel

  • Postgres

  • Sentry

What success looks like (first 3–6 months)

  • A compliance roadmap is established and actively executed, with audit evidence increasingly automated.

  • AWS + Kubernetes have secure baselines: strong IAM/RBAC, secrets management, encryption defaults, and centralized logging.

  • SLOs exist for key services, incidents are handled consistently, and postmortems drive measurable reliability gains.

  • Postgres has tested backups/restores, solid monitoring, and a scaling/reliability plan.

  • Porter/Kubernetes delivery workflows are reliable, observable, and safe to operate.

Qualifications

  • 6+ years in SRE / Platform / Security Engineering (or similar), owning production systems end-to-end.

  • Strong experience with AWS plus hands-on Kubernetes operations in production.

  • Strong Terraform experience (modules, environments, drift control, guardrails).

  • Experience leading or significantly contributing to SOC 2 (preferred) and/or HIPAA-aligned operational controls.

  • Proven incident leadership: on-call maturity, clear runbooks, effective postmortems.

  • Hands-on experience operating Postgres in production.

Nice to have

  • Experience implementing Kubernetes security best practices (network policies, admission control, policy-as-code, supply chain security).

  • Familiarity with compliance/security frameworks (NIST/ISO-style controls), vendor risk, and audit coordination.

  • Experience with Vercel/Next.js operational performance tuning.

Working model

  • This is an in-office role in Lehi, Utah, partnering closely with engineering leadership to embed security, compliance, and reliability into how we build.