Senior DevOps Engineer/Site Reliability Engineer
Stellar Cyber
full-remote, senior, permanent, devops, backend
Full remote, 6 days ago via WTTJ
Tags
Kubernetes, Terraform, Helm, CI/CD, Observability, Prometheus, Grafana, Loki, Alerting, Incident Management
About the role
Role overview
Senior DevOps / Site Reliability Engineer (full remote). You will build, operate, and scale reliable cloud-native infrastructure and distributed data platforms, partnering with platform, development, and operations teams to improve automation and reliability practices.
Key missions
- Administer and maintain Kubernetes clusters and containerized workloads
- Manage cloud infrastructure across OCI, AWS, GCP, or Azure environments
- Develop and maintain CI/CD pipelines for reliable application deployments
- Implement and manage Infrastructure as Code (IaC) using Terraform and Helm
- Drive observability improvements across monitoring, logging, tracing, and alerting
- Monitor, troubleshoot, and resolve production incidents, applying incident management and reliability engineering practices
Requirements
- 5+ years in DevOps, SRE, or Platform Engineering
- Strong expertise in Kubernetes, Docker, and container orchestration
- Hands-on experience managing production cloud environments and high-availability systems
- Experience with CI/CD and deployment automation
- Strong skills in Linux, networking, and distributed systems troubleshooting
- Proficiency in Python, Bash, or Go for scripting/programming
- Observability experience with Prometheus, Grafana, Loki, Alertmanager, and the Elastic Stack
- Strong IaC experience with Terraform (and Helm)
- East Coast residency (location/timezone constraint)
Nice-to-haves / related experience
- Familiarity with AI-driven operational tooling and automated remediation concepts
- Data platform experience with Kafka, Spark, Elasticsearch, Redis, and MongoDB
- Experience supporting on-call operations
About Stellar Cyber
Stellar Cyber is a globally distributed engineering organization focused on building and operating reliable cloud-native infrastructure and distributed data platforms. The role emphasizes DevOps/SRE practices such as automation, observability, and operational excellence for mission-critical systems.
Scraped 6/11/2026