Senior Site Reliability Engineer / Platform Engineer
SimScale
About the role
Role Overview
Join SimScale as a Senior Site Reliability Engineer / Platform Engineer. This is a hands-on role where you will own and improve the organization’s cloud infrastructure, expand observability across teams, and help shape multi-region architecture.
Key Responsibilities
- Own and improve cloud infrastructure across areas including AWS + EKS, observability, disaster recovery, and security/compliance controls.
- Build standards, guardrails, and self-service tooling so engineering teams can safely run workloads on AWS.
- Drive organization-wide adoption of OpenTelemetry for distributed tracing and metrics.
- Help teams define meaningful SLOs/SLIs and improve reliability based on that data.
- Collaborate with a small infrastructure team supporting 50+ engineers.
Requirements
- Strong foundation in Linux internals and distributed systems to debug production behavior.
- Software development background and the ability to write production-quality code in at least one of Python, Go, Rust, or Java.
- Security and compliance awareness (e.g., how requirements such as SOC 2 affect access control, auditability, disaster recovery, and logging).
- Deep experience in production incident debugging, clear incident communication, and converting findings into durable improvements.
- Hands-on cloud/platform experience including:
  - AWS (or GCP)
  - Terraform (declarative infrastructure)
  - Argo CD (GitOps workflow)
  - Kubernetes (container orchestration)
- 5+ years professional experience in SRE, platform, or infrastructure engineering.
- Clear communication and ability to explain trade-offs and enable adoption without unnecessary friction.
- Observability/reliability experience with:
  - OpenTelemetry
  - Prometheus
  - Distributed tracing
  - Monitoring and SLOs/SLIs
- Open source portfolio or contributions.
Nice to Have
- Prior technical leadership experience, especially in infrastructure, reliability, or platform engineering.
Benefits (Highlights)
Mobile working, competitive health benefits, discounted gym membership, flexible hours, learning & development opportunities, child care contributions, and a retirement plan.
About SimScale
SimScale provides a browser-based simulation platform, delivering simulation capabilities through web technology. This role supports the reliability and platform infrastructure that underpin its large-scale, multi-region cloud workloads.
Scraped 6/20/2026