xelys jobs

Principal DevOps Engineer

Zeta Global

full-remote, architect, permanent, devops, security

Full remote. Posted yesterday via WTTJ.

Tags

AWS, EKS, Kubernetes, Terraform, CI/CD, GitLab CI/CD, Prometheus, Docker, Apache Kafka, Site Reliability Engineering (SRE)

About the role

Role overview

Join Zeta Global as a Principal DevOps Engineer to architect and operate CI/CD and SRE capabilities that enable developers across multiple teams to ship to production safely and frequently. This is a leadership role focused on production-grade automation, reliability, and influencing engineering standards in a regulated environment.

Responsibilities

  • Architect and operate CI/CD to production, enabling concurrent deployments across multiple teams.
  • Design and run production-grade pipelines with advanced deployment strategies (e.g., canary, blue/green, progressive delivery, feature flag integration).
  • Implement deployment observability (metrics, logs, tracing) to support safer releases.
  • Serve as an SRE leader, providing operational leadership.
  • Lead incident response: on-call rotations, runbook development, blameless postmortems, and incident command structure.
  • Manage AWS infrastructure and drive infrastructure architecture decisions using reference architectures and internal standards.
  • Develop and socialize best practices to influence engineering culture and adoption of new operational/engineering practices.

Requirements

  • Docker expertise (multi-stage builds, security hardening, image optimization, container runtime management).
  • Ability to work across multiple stacks, including Node.js, React, Python, Java, and Ruby (to understand build systems and runtime characteristics).
  • Deep CI/CD and deployment-strategy expertise at scale (canary/blue-green/progressive delivery/feature flags).
  • Kafka production experience (cluster management, topic design, consumer group strategies, operational monitoring for high-throughput streaming).
  • Strong networking fundamentals: DNS (Route 53), TCP/IP, routing, API Gateway, load balancing (ALB/NLB), service mesh, VPC peering, transit gateways, and troubleshooting.
  • Observability tooling: Grafana, Prometheus (PromQL), Loki, and Honeycomb (distributed tracing).
  • Infrastructure as Code with Terraform (modules, state management, multi-environment orchestration).
  • Extensive AWS production experience: EKS, EC2, SQS, DynamoDB, IAM, VPC, CloudWatch.
  • 10+ years in DevOps/SRE/Platform/Infrastructure engineering, with impact at staff/principal level.
  • Experience in regulated environments with practical knowledge of GDPR, CCPA, SOC 2, and compliance automation in MarTech/AdTech.
  • Expert Kubernetes: cluster administration, Helm, custom controllers/operators, network policies, RBAC, and multi-cluster management on AWS EKS.
  • GitLab CI/CD experience, including advanced features (parent-child pipelines, dynamic environments, security scanning integration).
  • AWS certifications (Solutions Architect Professional, DevOps Engineer Professional, or Security Specialty).
  • Feature flag/progressive delivery tooling: Statsig (or similar) and experimentation/A-B testing.
  • Familiarity with FinOps and cloud cost optimization.
  • Background with chaos engineering tools/practices (Gremlin, Litmus, Chaos Monkey).

Nice-to-haves

  • Chaos engineering experience and resilience validation.
  • Experience building systems in adtech/martech compliance contexts.

Location / work mode

  • Full remote.

About Zeta Global

Zeta Global is a technology company operating in the adtech/martech ecosystem, building and running software at global scale. The role focuses on modern DevOps/SRE practices to enable safe and frequent deployments in a regulated, compliance-driven environment.

Scraped 5/14/2026
