Site Reliability Engineer
CardioOne
About the role
Role Overview
CardioOne is hiring a Site Reliability Engineer (SRE) to ensure the reliability, scalability, security, and performance of production systems. The SRE will bridge software development and operations by implementing automation, monitoring, and best practices to enable rapid, reliable delivery of applications.
You will report directly to the Senior Director of Engineering.
Responsibilities
Reliability & Performance
- Ensure high availability, scalability, and performance of production systems
- Implement and maintain SLIs, SLOs, and SLAs for critical services
- Perform capacity planning and performance tuning
Automation & Tooling
- Automate infrastructure provisioning using Terraform/Terragrunt and Ansible
- Build automation to reduce manual operations and improve deployment workflows
- Create CI/CD pipelines for rapid, reliable deployments
Monitoring & Incident Response
- Design and maintain monitoring, logging, and alerting systems (Datadog)
- Participate in on-call rotations and lead incident response
- Conduct root-cause analysis and write postmortems to prevent recurrence
Systems Engineering
- Manage cloud infrastructure on AWS and Azure
- Work with container orchestration platforms: Kubernetes and ECS
- Optimize architectures for reliability and fault tolerance
- Apply security, networking, and service resilience best practices
Collaboration & Leadership
- Partner with development teams to design reliable microservices and distributed systems
- Advocate for SRE principles and drive operational excellence
- Mentor engineers on reliability practices, tooling, and automation
Requirements
- Bachelor’s degree in CS/Engineering or equivalent experience
- 3–7 years in SRE, DevOps, or Systems Engineering
- Strong proficiency with Linux and shell scripting
- Cloud experience: AWS and Azure
- Hands-on container experience: Kubernetes/ECS and Docker
- Programming: Python or Java
- Experience with CI/CD pipelines and DevOps tooling
- Strong understanding of distributed systems, networking, and security
Preferred Qualifications
- Observability stacks using OpenTelemetry
- Database management: PostgreSQL
- Configuration management: Ansible, Chef, Puppet
- Knowledge of zero-downtime deployments and chaos engineering
Soft Skills
- Strong analytical/problem-solving skills
- Excellent communication and cross-team collaboration
- Ability to thrive in fast-paced, high-stakes environments
- Continuous improvement and operational excellence mindset
Work Location
Remote (Colorado, Delaware, Florida, New Hampshire, New Jersey, New York, Pennsylvania, or Texas)
About CardioOne
CardioOne partners with independent cardiologists to deliver innovative solutions that improve patient outcomes while reducing costs. Its platform supports physician partners in succeeding in today’s fee-for-service environment and preparing for value-based care, backed by investment from WindRose Health Investors.