xelys jobs

Senior Site Reliability Engineer

Cloudbeds

full-remote, senior, permanent, devops, backend | United States | 2 days ago via LinkedIn
145,000 - 165,000 USD per year

Tags

Site Reliability Engineering (SRE), AWS, Kubernetes, EKS, Terraform, ArgoCD, GitOps, Observability, Prometheus, Datadog

About the role

Role Overview

As a Senior Site Reliability Engineer (SRE) at Cloudbeds, you will be responsible for the reliability and performance of the platform that powers hospitality transactions globally. You’ll architect and implement scalable AWS solutions, strengthen automation and resilience across engineering teams, and continuously improve observability and incident response.

Responsibilities

  • Design and implement reliable, scalable AWS architecture for the organization.
  • Maintain and support high-load Kubernetes (EKS) clusters and related infrastructure components.
  • Support CI/CD using ArgoCD and GitOps.
  • Automate deployments with Terraform (Infrastructure as Code).
  • Build and continuously improve observability and monitoring using Grafana, Prometheus, Datadog, and CloudWatch.
  • Participate in Incident Management and Root Cause Analysis (RCA) to minimize impact.
  • Optimize system performance and perform full-stack troubleshooting.
  • Collaborate with development teams on monitoring best practices and reliability targets.
  • Partner with security teams to implement and maintain security best practices.
  • Take part in the infrastructure support rotation, providing guidance to other engineering teams.

Requirements

  • 5+ years of experience as a DevOps engineer or SRE in the AWS ecosystem.
  • 5+ years with Kubernetes (EKS) and Helm.
  • Experience designing/building CI/CD pipelines with ArgoCD and GitHub Actions.
  • Experience with Terraform for Infrastructure as Code.
  • Observability/monitoring experience with Grafana, Prometheus, Datadog, and CloudWatch.
  • Incident management experience and strong troubleshooting, performance analysis, and RCA skills.
  • Experience with web application systems: Nginx, Ingress controllers, load balancing, and CDNs.
  • Database and middleware experience: MySQL, PostgreSQL, Aurora, plus Redis, Memcached, SQS.
  • Networking knowledge: VPC, Security Groups, Network ACLs.
  • Ability to work remotely and manage time with a global team; English communication skills.
  • Bachelor’s degree in Computer Science or equivalent experience.

Bonus Skills

  • Advanced database administration (Aurora, MySQL, PostgreSQL).
  • Experience in a PCI-compliant environment.
  • Experience with Kong API Gateway.

About Cloudbeds

Cloudbeds builds a cloud-based property management system (PMS) for hospitality, serving properties across 150 countries and processing billions of bookings annually. Its unified platform integrates with hundreds of partners to help hoteliers improve operations and commercial strategy. The company operates with a fully remote team and focuses on reliability, scalability, and increasingly AI-powered solutions.

Scraped 4/15/2026
