Site Reliability Engineer
BayOne Solutions
Full remote | Mid-level | Permanent | DevOps | Backend | United States | 4 days ago via LinkedIn
Tags
Site Reliability Engineering, Incident Response, SQL, Observability, Distributed Tracing, Structured Logging, Kubernetes, GCP, Python, Data Analysis
About the role
Role Overview
Site Reliability Engineer (100% Remote — US Local Only). You will help clients improve system stability and reliability by building tooling and a reliability culture informed by incident response data.
Responsibilities
- Participate in incident response and help drive continuous improvements in production stability
- Analyze raw incident logs to derive actionable reliability strategies
- Build tooling that supports the intersection of systems engineering and data science
- Improve observability across alerting, tracing, structured logging, and metrics
Requirements
- Experience: 4+ years in SRE, DevOps, or Systems Engineering roles managing production environments at scale
- Data: Strong SQL and data analysis skills
- Programming: Expertise in one or more of Golang, Java, Python, or C++
- Observability: Deep understanding of alerting systems, distributed tracing, structured logging, and metrics collection
- Systems & Cloud: Experience with Kubernetes and GCP infrastructure
Nice-to-Haves
- Not specified
About BayOne Solutions
BayOne Solutions is a technology services company that helps clients improve the stability and reliability of their production systems. The role focuses on applying systems engineering and data analysis to incident response and observability, turning raw logs into actionable reliability strategies.
Scraped 6/14/2026