Senior Site Reliability Engineer
Jobgether
full-remote | senior | permanent | devops | security | United States | 2 days ago via LinkedIn
Tags
Site Reliability Engineering, AWS, Observability, Splunk, Datadog, ServiceNow, Incident Response, Alerting, Runbooks, Financial Services
About the role
Role Overview
The Senior Site Reliability Engineer is responsible for production reliability and cloud operations in a highly regulated financial services environment. You will focus on stability, observability, and performance, and you will improve incident response by turning operational chaos into scalable processes.
Accountabilities
- Own and improve production reliability across large-scale distributed systems, ensuring high availability and performance
- Design, refine, and maintain observability and monitoring using tools such as Splunk, Datadog, and ServiceNow
- Reduce alert noise and alert fatigue by improving signal quality, eliminating false positives, and strengthening severity classification and escalation paths
- Develop and maintain incident response playbooks, troubleshooting procedures, mitigation steps, and post-incident reviews
- Troubleshoot complex AWS-based production issues and drive rapid identification and resolution
- Collaborate with engineering, infrastructure, and product teams to improve reliability, scalability, and operational efficiency
- Increase operational maturity through automation and observability improvements for production support
Requirements
- Extensive experience in Site Reliability Engineering, production support, or infrastructure engineering
- Strong expertise in AWS and cloud-native architectures
- Proven observability experience with Splunk and Datadog (or similar)
- Demonstrated ability to improve the signal-to-noise ratio of alerts through effective alerting design
- Experience creating incident response playbooks, severity frameworks, and runbooks
- Strong troubleshooting skills in complex distributed/production systems
- Excellent analytical and communication skills to coordinate across technical and non-technical stakeholders
Nice to Have
- Experience in regulated industries such as financial services, banking, or payments
About Jobgether
The job posting is listed on behalf of Jobgether, which uses an AI-powered matching process to connect candidates with partner companies. The hiring partner operates large-scale, highly regulated financial services systems, including modern banking and payments platforms.
Scraped 6/17/2026