xelys jobs

Senior Site Reliability Engineer

Cloudbeds

full-remote, senior, permanent, devops, backend; United States; today via LinkedIn
145,000 - 165,000 USD per year

Tags

AWS, Site Reliability Engineering (SRE), Kubernetes, EKS, Terraform, GitOps, ArgoCD, Observability, Grafana, Incident Management

About the role

Role Overview

As a Senior Site Reliability Engineer (SRE), you’ll ensure Cloudbeds’ hospitality platform remains reliable and performant, supporting millions of transactions globally. You’ll architect and operate scalable cloud infrastructure, strengthen automation and resilience, and drive continuous improvement across engineering teams.

Responsibilities

  • Design and implement reliable, scalable AWS architecture for organizational needs.
  • Operate and maintain Kubernetes (EKS) clusters and related infrastructure components.
  • Support CI/CD using ArgoCD and GitOps.
  • Automate deployments with Terraform (IaC).
  • Build and improve observability and monitoring using Grafana, Prometheus, Datadog, and CloudWatch.
  • Participate in incident management and perform root cause analysis (RCA) to minimize service impact.
  • Optimize performance and troubleshoot production issues.
  • Collaborate with development teams on monitoring best practices and reliability targets.
  • Partner with security teams to implement and maintain security best practices.
  • Provide guidance through the infrastructure support rotation.

Requirements

  • 5+ years of experience as a DevOps engineer or SRE in the AWS ecosystem.
  • 5+ years with Kubernetes (EKS) and Helm.
  • Experience designing/building/supporting CI/CD pipelines with ArgoCD and GitHub Actions.
  • Terraform IaC experience.
  • Observability/monitoring experience with Grafana, Prometheus, Datadog, and CloudWatch.
  • Incident management, full-stack troubleshooting, performance analysis, and RCA.
  • Web application systems experience: Nginx, Ingress controllers, load balancing, CDNs.
  • Database and middleware experience: MySQL, PostgreSQL, Aurora; Redis, Memcached, SQS.
  • Networking skills: VPC, Security Groups, Network ACLs.
  • Ability to work remotely and manage time in a global team.
  • English communication skills (written and verbal).
  • Bachelor’s degree in CS or equivalent experience.

Bonus Skills

  • Advanced database administration (Aurora, MySQL, PostgreSQL).
  • Experience in a PCI-compliant environment.
  • Experience with Kong API Gateway.

About Cloudbeds

Cloudbeds builds an AI-powered hospitality platform (hotel PMS) used by properties across 150 countries. The company helps hoteliers modernize operations and commercial strategy via a unified system that integrates with many partners, and operates with a fully remote team.

Scraped 5/14/2026
