Principal DevOps Engineer
Zeta Global
full-remote, architect, permanent, devops, security
Full remote. Yesterday via WTTJ
Tags
AWS, EKS, Kubernetes, Terraform, CI/CD, GitLab CI/CD, Prometheus, Docker, Apache Kafka, Site Reliability Engineering (SRE)
About the role
Role overview
Join Zeta Global as a Principal DevOps Engineer to architect and operate CI/CD and SRE capabilities that enable developers across multiple teams to ship to production safely and frequently. This is a leadership role focused on production-grade automation, reliability, and influencing engineering standards in a regulated environment.
Responsibilities
- Architect and operate CI/CD to production, enabling concurrent deployments across multiple teams.
- Design and run production-grade pipelines with advanced deployment strategies (e.g., canary, blue/green, progressive delivery, feature flag integration).
- Implement deployment observability (metrics, logs, tracing) to support safer releases.
- Serve as an SRE leader, providing operational leadership and leading incident response: on-call rotations, runbook development, blameless postmortems, and incident command structure.
- Manage AWS infrastructure and drive infrastructure architecture decisions using reference architectures and internal standards.
- Develop and socialize best practices to influence engineering culture and adoption of new operational/engineering practices.
Requirements
- Docker expertise (multi-stage builds, security hardening, image optimization, container runtime management).
- Ability to work across multiple stacks, including Node.js, React, Python, Java, and Ruby (to understand build systems and runtime characteristics).
- Deep CI/CD and deployment-strategy expertise at scale (canary/blue-green/progressive delivery/feature flags).
- Kafka production experience (cluster management, topic design, consumer group strategies, operational monitoring for high-throughput streaming).
- Strong networking fundamentals: DNS (Route 53), TCP/IP, routing, API Gateway, load balancing (ALB/NLB), service mesh, VPC peering, transit gateways, and troubleshooting.
- Observability tooling: Grafana, Prometheus (PromQL), Loki, and Honeycomb (distributed tracing).
- Infrastructure as Code with Terraform (modules, state management, multi-environment orchestration).
- Extensive AWS production experience: EKS, EC2, SQS, DynamoDB, IAM, VPC, CloudWatch.
- 10+ years in DevOps/SRE/Platform/Infrastructure engineering, with impact at staff/principal level.
- Experience in regulated environments with practical knowledge of GDPR, CCPA, SOC 2, and compliance automation in MarTech/AdTech.
- Expert-level Kubernetes: cluster administration, Helm, custom controllers/operators, network policies, RBAC, and multi-cluster management on AWS EKS.
- GitLab CI/CD experience, including advanced features (parent-child pipelines, dynamic environments, security scanning integration).
- AWS certifications (Solutions Architect Professional, DevOps Engineer Professional, or Security Specialty).
- Feature flag/progressive delivery tooling: Statsig (or similar) and experimentation/A-B testing.
- Familiarity with FinOps and cloud cost optimization.
- Background with chaos engineering tools/practices (Gremlin, Litmus, Chaos Monkey).
Nice-to-haves
- Chaos engineering experience and resilience validation.
- Experience building systems in adtech/martech compliance contexts.
Location / work mode
- Full remote.
About Zeta Global
Zeta Global is a technology company operating in the adtech/martech ecosystem, building and running software at global scale. The role focuses on modern DevOps/SRE practices to enable safe and frequent deployments in a regulated, compliance-driven environment.
Scraped 5/14/2026