xelys jobs

Senior Site Reliability Engineer

Cloudbeds

full-remote, senior, permanent, devops | United States | 3 days ago via LinkedIn
145,000 - 165,000 USD/annual

Tags

AWS, Kubernetes, Terraform, ArgoCD, GitOps, Grafana, Prometheus, Datadog, Incident Management, Root Cause Analysis

About the role

Role Overview

As a Senior Site Reliability Engineer, you will be responsible for the reliability and performance of Cloudbeds’ global hospitality platform. You will help ensure high availability for millions of transactions worldwide by designing scalable AWS infrastructure, improving observability, and leading incident response and continuous improvement.

Responsibilities

  • Design and implement reliable, scalable AWS architectures.
  • Maintain and support highly loaded Kubernetes (EKS) clusters and related infrastructure components.
  • Improve CI/CD reliability using ArgoCD and GitOps.
  • Automate deployments with Terraform (IaC).
  • Build and continuously enhance observability and monitoring using Grafana, Prometheus, Datadog, and CloudWatch.
  • Participate in incident management and perform root cause analysis (RCA) to minimize service impact.
  • Troubleshoot and optimize system performance.
  • Partner with development teams to define and enforce monitoring best practices and meet reliability targets.
  • Collaborate with security teams to implement and maintain security best practices.
  • Join an infrastructure support rotation to guide other engineering teams.

Requirements

  • 5+ years of experience as a DevOps engineer or SRE in the AWS ecosystem.
  • 5+ years with Kubernetes (EKS) and Helm.
  • Experience building CI/CD pipelines using ArgoCD and GitHub Actions.
  • Experience with Terraform infrastructure-as-code.
  • Observability/monitoring experience with Grafana, Prometheus, Datadog, and CloudWatch.
  • Incident management experience, strong full-stack troubleshooting, performance analysis, and RCA.
  • Web systems experience including Nginx, Ingress controllers, load balancing, and CDNs.
  • Databases: MySQL, PostgreSQL, Aurora; Middleware: Redis, Memcached, SQS.
  • Networking fundamentals: VPC, Security Groups, Network ACLs.
  • Ability to work remotely and manage time in a global team; strong English communication.
  • Bachelor’s degree in Computer Science or equivalent experience.

Bonus Skills

  • Advanced database administration (Aurora, MySQL, PostgreSQL).
  • Experience in PCI-compliant environments.
  • Experience with Kong API Gateway.

About Cloudbeds

Cloudbeds builds a hospitality platform (hotel PMS) that powers properties across 150+ countries and processes billions of bookings annually. The company enables hoteliers to run operations and commercial strategy through a unified, partner-integrated platform, with a fully remote engineering team. It was founded in 2012 and has been recognized for technology growth (e.g., Deloitte Technology Fast 500).

Scraped 4/16/2026
