Senior Site Reliability Engineer
Autodesk
senior, permanent, devops, other | Idaho, United States | 2 days ago via LinkedIn
Tags
Site Reliability Engineering (SRE), Cloud Infrastructure, SLOs/SLIs, Observability, Incident Management, Automation, Resilience Testing, FedRAMP, Runbooks, On-call
About the role
Role Overview
As a Senior Site Reliability Engineer at Autodesk, you will join a new SRE team supporting Autodesk GovCloud. You will help shape how production services are deployed, operated, and improved in restricted cloud environments by establishing the operating model, reliability practices, automation, and engineering standards.
Responsibilities
- Own reliability, availability, performance, operability, and capacity for one or more production services
- Deploy, operate, maintain, and continuously improve production services in Autodesk GovCloud environments
- Partner with engineering and cross-functional teams (product engineering, security, compliance, platform, infrastructure) to ensure services are designed for reliability, scalability, security, and operability
- Define and run reliability practices such as SLOs/SLIs, error budgets, production readiness reviews, service reviews, and operational health reviews
- Build automation to improve deployment safety, operational efficiency, incident response, and service recovery
- Develop software, automation, and tooling that improve reliability, scalability, and efficiency
- Improve monitoring, alerting, logging, tracing, and overall observability
- Lead and participate in incident response, troubleshooting, and post-incident reviews that support continuous learning
- Maintain operational documentation, runbooks, and recovery procedures
- Scale resilience testing and “Gameday” practices to validate system behavior and recovery capabilities
- Reduce operational toil through software engineering, automation, and process improvements
- Ensure services remain compliant with Autodesk security/privacy/regulatory requirements (including FedRAMP where applicable)
- Participate in a 24x7 on-call rotation for production services
- Help mature operational excellence practices in a fast-paced environment
Minimum Qualifications
- 7+ years of experience in Site Reliability Engineering, Software Engineering, Platform Engineering, Cloud Infrastructure, or Production Operations
- Experience operating and supporting customer-facing production services in large-scale cloud environments
- Strong reliability engineering foundation (notably SLOs/SLIs, observability, incident management/operations)
- B.S. or higher in Computer Science/Engineering (or equivalent practical experience)
Additional / Eligibility Requirement
- Must be a U.S. citizen and meet government security and eligibility requirements, including background investigations and government-issued security clearances.
About Autodesk
Autodesk is a software company focused on building tools and cloud services for customers. This role specifically supports Autodesk's GovCloud products, operating reliable, secure, and scalable services in restricted cloud environments.
Scraped 6/20/2026