Senior Site Reliability Engineer
Cloudbeds
About the role
Role Overview
As a Senior Site Reliability Engineer (SRE) at Cloudbeds, you will be responsible for the reliability and performance of the company’s hospitality platform. You’ll architect and operate scalable cloud infrastructure to ensure transactions run smoothly globally, while driving automation, resilience, and continuous improvement.
Responsibilities
- Design and implement reliable, scalable AWS architecture for organizational needs.
- Maintain and support Kubernetes (EKS) clusters and related infrastructure components.
- Support CI/CD using ArgoCD and GitOps.
- Automate deployments with Terraform (Infrastructure as Code).
- Build and continuously improve observability and monitoring using Grafana, Prometheus, Datadog, and CloudWatch.
- Participate in incident management and perform root cause analysis (RCA) to minimize service impact.
- Optimize performance, troubleshoot issues, and improve reliability outcomes.
- Collaborate with development teams on monitoring best practices and reliability targets.
- Collaborate with security teams to implement and maintain security best practices.
- Contribute to an infrastructure support rotation and provide guidance to other teams.
Requirements
- 5+ years of experience in DevOps or SRE in the AWS ecosystem.
- 5+ years with Kubernetes (EKS) and Helm.
- Experience designing/building/supporting CI/CD pipelines with ArgoCD and GitHub Actions.
- Experience with Terraform and Infrastructure-as-Code.
- Observability/monitoring experience with Grafana, Prometheus, Datadog, CloudWatch.
- Incident management, full-stack troubleshooting, performance analysis, and RCA.
- Experience with web application systems such as Nginx, Ingress controllers, load balancing, and CDNs.
- Database and middleware experience with:
  - MySQL, PostgreSQL, Aurora
  - Redis, Memcached, SQS
- Strong networking skills: VPC, Security Groups, Network ACLs.
- Ability to work remotely and manage your own time across a global team.
- Strong English written/verbal communication.
- Bachelor’s degree in Computer Science or equivalent experience.
Bonus Skills
- Advanced database administration experience (Aurora, MySQL, PostgreSQL).
- Experience in PCI-compliant environments.
- Experience with Kong API Gateway.
About Cloudbeds
Cloudbeds builds an intelligent hospitality software platform (hotel PMS) used by properties across 150 countries. The company supports independent properties and hotel groups with a unified system that integrates with hundreds of partners, and operates with a fully remote team.
Scraped 5/14/2026